scripts: add cherry-pick verification tool with fuzzy matching #10034

bhandras · 2025-07-04T09:44:32Z

This script compares a release branch against a source branch (e.g. master) to verify that all cherry-picked commits are unmodified. It first attempts fast matching using normalized patch hashes.

If no exact match is found, it falls back to a fuzzy matching mechanism:

- Filters source commits by matching author and commit subject
- Compares normalized diffs using diff -u
- Selects the closest match based on line difference count

Useful for verifying cherry-picks or rebased commits during release processes. Supports scan and compare limits for performance.
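
The exact-match phase described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `normalize_patch` and `patch_hash` are hypothetical helper names, and the normalization rules (dropping `index` lines and hunk line numbers) are an assumption about what "normalized" means here.

```bash
#!/usr/bin/env bash

# Hypothetical helper: remove the parts of a patch that change when a commit
# is cherry-picked even though the change itself is identical.
normalize_patch() {
  sed -E \
    -e '/^index [0-9a-f]+\.\.[0-9a-f]+/d' \
    -e 's/^@@ -[0-9]+(,[0-9]+)? \+[0-9]+(,[0-9]+)? @@.*/@@/'
}

# Hash stdin with whichever MD5 tool is installed
# (md5sum on most Linux systems, md5 on macOS).
patch_hash() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum | awk '{print $1}'
  else
    md5 | awk '{print $NF}'
  fi
}
```

With helpers like these, piping `git show <commit>` output (without the commit header) through `normalize_patch | patch_hash` should give the same value for a commit and an unmodified cherry-pick of it. Git's built-in `git patch-id --stable` computes a comparable identifier and may be an alternative.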

To test I ran it on the rc branch (#9986):

./scripts/fuzzy-match-release-branch.sh --source master --release 0-19-2-branch-rc1 --limit 92

gemini-code-assist

Summary of Changes

Hello @bhandras, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new command-line tool to automate the verification of cherry-picked commits during release processes. Its primary purpose is to ensure that commits brought from a source branch into a release branch are identical or very similar, helping to maintain release quality and identify unintended modifications or discrepancies efficiently.

Highlights

- New Cherry-Pick Verification Tool: Introduces a new bash script, scripts/fuzzy-match-release-branch.sh, designed to verify the integrity and presence of cherry-picked commits between a source branch (e.g., master) and a release branch.
- Dual-Phase Matching Logic: The script employs a two-step verification process: it first attempts an exact match using normalized patch hashes. If no exact match is found, it falls back to a fuzzy matching algorithm that filters source commits by author and subject, then compares normalized diffs to find the closest match based on line difference count.
- Configurable Performance and Usability: The tool supports command-line arguments to specify the source and release branches, as well as limits for the number of commits to scan in the source branch and compare in the release branch, enhancing performance for large repositories. It provides clear, detailed output for matched and unmatched commits, including suggestions for manual git diff for fuzzy matches.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a very useful script for verifying cherry-picks during a release process. The implementation is solid, using both exact and fuzzy matching. I've provided feedback on a few critical correctness and high-priority performance issues. Specifically, the argument parsing and fuzzy-match scoring have bugs that need to be addressed. Additionally, there are several opportunities to optimize performance by reducing the number of git calls within loops. Addressing these points will make the script more robust and efficient.

gemini-code-assist · 2025-07-04T09:47:41Z

scripts/fuzzy-match-release-branch.sh

+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --source) SRC_BRANCH="$2"; shift 2 ;;
+    --release) RELEASE_BRANCH="$2"; shift 2 ;;
+    --scan-limit) SRC_SCAN_LIMIT="$2"; shift 2 ;;
+    --limit) RELEASE_LIMIT="$2"; shift 2 ;;
+    -h|--help) show_help ;;
+    *) echo "Unknown argument: $1"; show_help ;;
+  esac
+done


The current argument parsing logic is vulnerable to errors when an option is passed without a corresponding value. For example, if the script is run as ./script.sh --source --limit 10, the string "--limit" will be incorrectly assigned to SRC_BRANCH.

To make the parsing more robust, you should validate that a value is provided for each option and that the value is not another option flag.

Suggested change

-while [[ $# -gt 0 ]]; do
-  case "$1" in
-    --source) SRC_BRANCH="$2"; shift 2 ;;
-    --release) RELEASE_BRANCH="$2"; shift 2 ;;
-    --scan-limit) SRC_SCAN_LIMIT="$2"; shift 2 ;;
-    --limit) RELEASE_LIMIT="$2"; shift 2 ;;
-    -h|--help) show_help ;;
-    *) echo "Unknown argument: $1"; show_help ;;
-  esac
-done
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --source|--release|--scan-limit|--limit)
+      if [[ -z "$2" || "$2" =~ ^- ]]; then
+        echo "Error: Missing value for argument $1" >&2
+        show_help
+      fi
+      case "$1" in
+        --source) SRC_BRANCH="$2" ;;
+        --release) RELEASE_BRANCH="$2" ;;
+        --scan-limit) SRC_SCAN_LIMIT="$2" ;;
+        --limit) RELEASE_LIMIT="$2" ;;
+      esac
+      shift 2
+      ;;
+    -h|--help) show_help ;;
+    *) echo "Unknown argument: $1"; show_help ;;
+  esac
+done

Thank you, replaced!
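
A variation on the validated parsing loop, sketched under the assumption that the script runs with `set -u` (a later comment mentions `set -euo pipefail`). Under `set -u`, referencing `"$2"` when an option is the last argument is an unbound-variable error, so this sketch uses `${2:-}` instead. The `parse_args` wrapper is hypothetical and returns non-zero rather than calling `show_help`, purely so the behaviour is easy to exercise.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper around the option loop. Variable names follow the
# snippet under review; the defaults below are placeholders.
parse_args() {
  SRC_BRANCH="" RELEASE_BRANCH="" SRC_SCAN_LIMIT=0 RELEASE_LIMIT=0
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --source|--release|--scan-limit|--limit)
        # ${2:-} expands to an empty string when no value follows, so the
        # check stays safe even when `set -u` is active.
        if [[ -z "${2:-}" || "${2:-}" == -* ]]; then
          echo "Error: Missing value for argument $1" >&2
          return 1
        fi
        case "$1" in
          --source) SRC_BRANCH="$2" ;;
          --release) RELEASE_BRANCH="$2" ;;
          --scan-limit) SRC_SCAN_LIMIT="$2" ;;
          --limit) RELEASE_LIMIT="$2" ;;
        esac
        shift 2
        ;;
      *) echo "Unknown argument: $1" >&2; return 1 ;;
    esac
  done
}
```

Called as `parse_args --source master --limit 92`, this sets `SRC_BRANCH` and `RELEASE_LIMIT`; called as `parse_args --source --limit 10`, it reports the missing value instead of storing `--limit` as the branch name.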

gemini-code-assist · 2025-07-04T09:47:41Z

scripts/fuzzy-match-release-branch.sh

+  for j in "${!SRC_COMMIT_META[@]}"; do
+    if [[ "${SRC_COMMIT_META[$j]}" == "$meta_key" ]]; then
+      diff=$(diff -u <(echo "$rc_patch") <(echo "${SRC_PATCHES[$j]}") || true)
+      score=$(echo "$diff" | grep '^[-+]' | wc -l | xargs)


The current method for calculating the diff score incorrectly includes diff header lines (e.g., --- a/file.txt and +++ b/file.txt) in its count because they also start with + or -. This inflates the score and can lead to an inaccurate fuzzy match.

The calculation should be adjusted to count only the lines representing actual content changes.

Suggested change

-      score=$(echo "$diff" | grep '^[-+]' | wc -l | xargs)
+      score=$(echo "$diff" | grep -vE '^(--- |\+\+\+ )' | grep -c '^[-+]')
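
To make the effect of the header lines concrete, here is a small self-contained sketch of the scoring step. `diff_score` is a hypothetical function name; the filtering mirrors the suggested change, with `|| true` added because `grep -c` exits non-zero when the count is 0.

```bash
#!/usr/bin/env bash

# Hypothetical helper: count changed lines between two patch texts,
# ignoring the "--- " and "+++ " file headers that diff -u emits.
diff_score() {
  local d
  # diff exits 1 when the inputs differ; `|| true` keeps that from
  # aborting a script that runs under `set -e`.
  d=$(diff -u <(printf '%s\n' "$1") <(printf '%s\n' "$2") || true)
  printf '%s\n' "$d" | grep -vE '^(--- |\+\+\+ )' | grep -c '^[-+]' || true
}

diff_score $'a\nb\nc' $'a\nb\nc'   # identical inputs
diff_score $'a\nb\nc' $'a\nX\nc'   # one line replaced
```

One replaced line yields a score of 2 (one removal, one addition). Without the header filter the same comparison would score 4, which is the inflation described above.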

gemini-code-assist · 2025-07-04T09:47:41Z

scripts/fuzzy-match-release-branch.sh

+  author=$(git log -1 --pretty=format:"%an <%ae>" "$commit")
+  subject=$(git log -1 --pretty=format:"%s" "$commit")


The script makes two separate git log calls for each source commit to get the author and subject. This is inefficient and can significantly slow down the script when scanning a large number of commits.

These calls can be combined into a single git log command using a custom format string with a null character separator for safe parsing. This optimization should also be applied to lines 95-96 where release commit info is fetched.

Suggested change

-  author=$(git log -1 --pretty=format:"%an <%ae>" "$commit")
-  subject=$(git log -1 --pretty=format:"%s" "$commit")
+  IFS=$'\0' read -r author subject < <(git log -1 --pretty=format:"%an <%ae>%x00%s" "$commit")

This broke the script, kept the original solution.
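
The thread does not say how the suggestion failed. One likely explanation, offered as an assumption: bash strings cannot contain NUL bytes, so `$'\0'` expands to an empty string and `IFS` ends up empty, which disables field splitting. In addition, `--pretty=format:` output has no trailing newline, so `read` reaches end-of-file and returns non-zero, which can abort a script running under `set -e`. A minimal sketch of a single-call variant that avoids both problems by using the unit separator (`%x1f`) and a here-string; `split_meta`, `META_AUTHOR`, and `META_SUBJECT` are hypothetical names, and the input is simulated rather than read from git:

```bash
#!/usr/bin/env bash

# Split "author<US>subject" into two variables. The here-string appends a
# trailing newline, so `read` returns 0.
split_meta() {
  IFS=$'\x1f' read -r META_AUTHOR META_SUBJECT <<< "$1"
}

# Simulated output of:
#   git log -1 --pretty=format:"%an <%ae>%x1f%s" "$commit"
split_meta $'Alice <alice@example.com>\x1fscripts: add cherry-pick tool'
echo "author=$META_AUTHOR"
echo "subject=$META_SUBJECT"
```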

gemini-code-assist · 2025-07-04T09:47:41Z

scripts/fuzzy-match-release-branch.sh

+  rc_author=$(git log -1 --pretty=format:"%an <%ae>" "$rc_commit")
+  rc_subject=$(git log -1 --pretty=format:"%s" "$rc_commit")


Similar to the source commit indexing loop, two separate git log calls are made here for each release commit. This is inefficient.

Combining these into a single git log call will improve performance.

Suggested change

-  rc_author=$(git log -1 --pretty=format:"%an <%ae>" "$rc_commit")
-  rc_subject=$(git log -1 --pretty=format:"%s" "$rc_commit")
+  IFS=$'\0' read -r rc_author rc_subject < <(git log -1 --pretty=format:"%an <%ae>%x00%s" "$rc_commit")

This broke the script, kept the original solution. Efficiency is not the key here.

gemini-code-assist · 2025-07-04T09:47:41Z

scripts/fuzzy-match-release-branch.sh

+  found_exact=""
+  for j in "${!SRC_PATCH_HASHES[@]}"; do
+    if [[ "${SRC_PATCH_HASHES[$j]}" == "$rc_patch_hash" ]]; then
+      found_exact="${SRC_COMMITS_ARRAY[$j]}"
+      break
+    fi
+  done
+
+  if [[ -n "$found_exact" ]]; then
+    src_subject=$(git log -1 --pretty=format:"%s" "$found_exact")
+    src_author=$(git log -1 --pretty=format:"%an <%ae>" "$found_exact")


When an exact patch match is found, the script makes new git log calls to get the source commit's subject and author. This is redundant and inefficient because this information was already collected and stored in SRC_COMMIT_META during the indexing phase.

You can improve performance by refactoring this section to retrieve the metadata from the pre-populated arrays instead of calling git again.

Suggested change

-  found_exact=""
-  for j in "${!SRC_PATCH_HASHES[@]}"; do
-    if [[ "${SRC_PATCH_HASHES[$j]}" == "$rc_patch_hash" ]]; then
-      found_exact="${SRC_COMMITS_ARRAY[$j]}"
-      break
-    fi
-  done
-
-  if [[ -n "$found_exact" ]]; then
-    src_subject=$(git log -1 --pretty=format:"%s" "$found_exact")
-    src_author=$(git log -1 --pretty=format:"%an <%ae>" "$found_exact")
+  found_exact_index=-1
+  for j in "${!SRC_PATCH_HASHES[@]}"; do
+    if [[ "${SRC_PATCH_HASHES[$j]}" == "$rc_patch_hash" ]]; then
+      found_exact_index=$j
+      break
+    fi
+  done
+
+  if [[ $found_exact_index -ne -1 ]]; then
+    found_exact="${SRC_COMMITS_ARRAY[$found_exact_index]}"
+    meta_info="${SRC_COMMIT_META[$found_exact_index]}"
+    src_subject="${meta_info%__*}"
+    src_author="${meta_info#*__}"

guggero

Very cool idea for a script!
Have a couple of ideas on how to improve it a bit.

guggero · 2025-07-04T10:43:53Z

scripts/fuzzy-match-release-branch.sh

+if [[ "$RELEASE_LIMIT" -gt 0 ]]; then
+  RELEASE_COMMITS=$(echo "$RELEASE_COMMITS" | head -n "$RELEASE_LIMIT")
+fi
+RELEASE_COMMITS=$(echo "$RELEASE_COMMITS" | tail -r)


The -r option seems to be BSD specific, it doesn't exist on Linux. Also, doesn't tail by default take the last 10 lines only? Which would mean we only look at the last 10 commits?
Use tac instead, which reverses the order line by line as well?

Replaced with the portable RELEASE_COMMITS=$(echo "$RELEASE_COMMITS" | awk '{ lines[NR] = $0 } END { for (i = NR; i > 0; i--) print lines[i] }')
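
For reference, the reversal can be wrapped in a small helper. This is a sketch, not the script's code: `reverse_lines` is a hypothetical name, and preferring `tac` when it exists is a variation on the awk-only reply above. As far as I know, BSD `tail -r` does print all input lines in reverse rather than only the last 10, but it is unavailable in GNU coreutils.

```bash
#!/usr/bin/env bash

# Reverse stdin line by line: use tac where available (GNU coreutils),
# otherwise fall back to the portable awk one-liner from the reply.
reverse_lines() {
  if command -v tac >/dev/null 2>&1; then
    tac
  else
    awk '{ lines[NR] = $0 } END { for (i = NR; i > 0; i--) print lines[i] }'
  fi
}

printf 'commit1\ncommit2\ncommit3\n' | reverse_lines
```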

guggero · 2025-07-04T10:45:51Z

scripts/fuzzy-match-release-branch.sh

+    fi
+  done
+
+  if [[ -n "$best_index" ]]; then


Can we also summarize below how many fuzzy matches we found?

guggero · 2025-07-04T10:53:37Z

scripts/fuzzy-match-release-branch.sh

+for commit in "${SRC_COMMITS_ARRAY[@]}"; do
+  author=$(git log -1 --pretty=format:"%an <%ae>" "$commit")
+  subject=$(git log -1 --pretty=format:"%s" "$commit")
+  meta_key="${subject}__${author}"


Can we take into account the author date as well? Since that shouldn't be changed by a cherry-pick or rebase.

Sure, that's a great idea!

guggero · 2025-07-04T11:07:37Z

scripts/fuzzy-match-release-branch.sh

+  echo ""
+  echo "🔍 Diff of release commit:"
+  echo "---------------------------------------------"
+  git show "$rc_commit" | sed 's/^/    /'


Instead of showing the change of the diff, could we show the difference of the release commit and the closest fuzzy match commit?
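
One way this request could look, sketched under the assumption that both patch texts are already held in shell variables (as `rc_patch` and `SRC_PATCHES[$j]` are in the snippets reviewed earlier). `show_patch_delta` is a hypothetical helper name.

```bash
#!/usr/bin/env bash

# Print the unified diff between the closest source patch ($1) and the
# release patch ($2), indented like the script's other output.
show_patch_delta() {
  # diff exits 1 when the inputs differ, so guard it for `set -e` and
  # `pipefail` before piping into sed.
  { diff -u <(printf '%s\n' "$1") <(printf '%s\n' "$2") || true; } | sed 's/^/    /'
}

show_patch_delta $'context\nold line' $'context\nnew line'
```

Only the lines that differ between the two patches are marked, which makes an unintended modification in a cherry-pick easier to spot than reading the full release commit.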

bhandras · 2025-07-04T15:14:53Z

@gemini-code-assist review

gemini-code-assist

Code Review

This pull request introduces a bash script for verifying cherry-picked commits between branches, covering both exact and fuzzy matching. The review focuses on improving robustness, performance, and correctness. Key suggestions include using a more robust method for handling commit metadata, optimizing git command usage, enhancing error handling for git fetch, and aligning the fuzzy matching logic more closely with the stated intent.

gemini-code-assist · 2025-07-04T15:16:04Z

scripts/fuzzy-match-release-branch.sh

+  author=$(git log -1 --pretty=format:"%an <%ae>" "$commit" 2>/dev/null) || continue
+  subject=$(git log -1 --pretty=format:"%s" "$commit" 2>/dev/null) || continue
+  authordate=$(git log -1 --pretty=format:"%ai" "$commit" 2>/dev/null) || continue
+  meta_key="${subject}__${author}__${authordate}"


Spawning three separate git log processes for each commit is inefficient. This can be combined into a single git log call. Also, using __ as a delimiter is not robust, as commit subjects or author names may contain this sequence, causing parsing issues later on. Consider using a single git log call with a more robust delimiter like the unit separator (%x1f).

Suggested change

-  author=$(git log -1 --pretty=format:"%an <%ae>" "$commit" 2>/dev/null) || continue
-  subject=$(git log -1 --pretty=format:"%s" "$commit" 2>/dev/null) || continue
-  authordate=$(git log -1 --pretty=format:"%ai" "$commit" 2>/dev/null) || continue
-  meta_key="${subject}__${author}__${authordate}"
+  info=$(git log -1 --pretty=format:"%s%x1f%an <%ae>%x1f%ai" "$commit" 2>/dev/null) || continue
+  IFS=$'\x1f' read -r subject author authordate <<< "$info"
+  meta_key="$info" # Use the raw, delimited output as the meta key

gemini-code-assist · 2025-07-04T15:16:05Z

scripts/fuzzy-match-release-branch.sh

+    src_subject="${meta_info%%__*}"
+    rest="${meta_info#*__}"
+    src_author="${rest%%__*}"
+    src_authordate="${rest##*__}"


The metadata parsing logic splits on __, which is fragile and will fail if the delimiter appears in the commit subject or author name. If you adopt the suggestion of using a robust delimiter like \x1f (unit separator), update the parsing logic here to match.

Suggested change

-    src_subject="${meta_info%%__*}"
-    rest="${meta_info#*__}"
-    src_author="${rest%%__*}"
-    src_authordate="${rest##*__}"
+    IFS=$'\x1f' read -r src_subject src_author src_authordate <<< "$meta_info" # Parse metadata using the robust delimiter

gemini-code-assist · 2025-07-04T15:16:05Z

scripts/fuzzy-match-release-branch.sh

+echo ""
+
+echo "🔄 Fetching latest refs..."
+git fetch --all --quiet || true


Using || true after git fetch will suppress any errors. If the fetch fails (e.g., due to network issues), the script will continue with potentially stale data, leading to incorrect results. It's safer to remove || true and let the script exit if the fetch fails, ensuring it always operates on complete data.

Suggested change

-git fetch --all --quiet || true
+git fetch --all --quiet

gemini-code-assist · 2025-07-04T15:16:05Z

scripts/fuzzy-match-release-branch.sh

+  fuzzy_candidates=0
+
+  for j in "${!SRC_COMMIT_META[@]}"; do
+    if [[ "${SRC_COMMIT_META[$j]}" == "$meta_key" ]]; then


The fuzzy matching logic requires the author, subject, and author date to be identical to find a candidate. This is stricter than intended. Matching on authordate can be too strict, especially for rebased commits where the author date might be updated. Consider making the match less strict by not comparing the date.
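
A two-tier comparison is one way to reconcile this with the earlier request to include the author date: prefer candidates whose full key matches, and fall back to subject and author only. The sketch below assumes keys of the form `subject<US>author<US>authordate` (unit-separator delimited, as proposed in this review); `strip_date` and `meta_matches` are hypothetical names.

```bash
#!/usr/bin/env bash

# Drop the last unit-separator-delimited field (the author date).
strip_date() { printf '%s' "${1%$'\x1f'*}"; }

# Print "exact" when the full keys match, "loose" when only subject and
# author match; return 1 when the commits are not candidates at all.
meta_matches() {
  local rc_key=$1 src_key=$2
  if [[ "$rc_key" == "$src_key" ]]; then
    echo exact
  elif [[ "$(strip_date "$rc_key")" == "$(strip_date "$src_key")" ]]; then
    echo loose
  else
    return 1
  fi
}
```

A caller could then rank `exact` candidates ahead of `loose` ones before comparing diff scores.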

gemini-code-assist · 2025-07-04T15:16:05Z

scripts/fuzzy-match-release-branch.sh

+    match_author=$(git log -1 --pretty=format:"%an <%ae>" "$match_commit")
+    match_subject=$(git log -1 --pretty=format:"%s" "$match_commit")


These git log calls are redundant. The author and subject for the match_commit have already been fetched and are available in the ${SRC_COMMIT_META[$best_index]} variable. Parse this information directly from the metadata array instead.

Suggested change

-    match_author=$(git log -1 --pretty=format:"%an <%ae>" "$match_commit")
-    match_subject=$(git log -1 --pretty=format:"%s" "$match_commit")
+    meta_info="${SRC_COMMIT_META[$best_index]}"
+    IFS=$'\x1f' read -r match_subject match_author _ <<< "$meta_info" # Parse metadata; discard date

gemini-code-assist

Code Review

The pull request introduces a script to verify cherry-picked commits between release and source branches using both fast and fuzzy matching. The script appears well-structured and addresses a useful need. However, there are several areas where the script can be improved, including the hashing algorithm, fuzzy matching logic, character encoding, and error handling.

gemini-code-assist · 2025-07-04T15:22:21Z

scripts/fuzzy-match-release-branch.sh

+  if command -v md5sum >/dev/null 2>&1; then
+    md5sum | awk '{print $1}'
+  else
+    md5 | awk '{print $NF}'
+  fi
+}


The script uses md5sum or md5 for hashing. MD5 is considered cryptographically broken and should not be used for security-sensitive applications. While this script isn't directly security-sensitive, consider using SHA256 (sha256sum) for better collision resistance, if available, or at least warn the user that MD5 is being used and may not be reliable for large numbers of commits. If sha256sum is not available, the script should exit with an error message.

Suggested change

-  if command -v md5sum >/dev/null 2>&1; then
-    md5sum | awk '{print $1}'
-  else
-    md5 | awk '{print $NF}'
-  fi
-}
+  if command -v sha256sum >/dev/null 2>&1; then
+    sha256sum | awk '{print $1}'
+  elif command -v md5sum >/dev/null 2>&1; then
+    echo "WARNING: Using md5sum for hashing. Consider using sha256sum if available." >&2
+    md5sum | awk '{print $1}'
+  elif command -v md5 >/dev/null 2>&1; then
+    echo "WARNING: Using md5 for hashing. Consider using sha256sum if available." >&2
+    md5 | awk '{print $NF}'
+  else
+    echo "ERROR: No hashing program found (sha256sum, md5sum, or md5)." >&2
+    exit 1
+  fi

For this specific use case md5 is fine.

gemini-code-assist · 2025-07-04T15:22:21Z

scripts/fuzzy-match-release-branch.sh

+    if [[ "${SRC_COMMIT_META[$j]}" == "$meta_key" ]]; then
+      ((fuzzy_candidates++))
+      diff=$(diff -u <(echo "$rc_patch") <(echo "${SRC_PATCHES[$j]}") || true)
+      score=$(echo "$diff" | grep -vE '^(--- |\+\+\+ )' | grep -c '^[-+]')
+      if [[ "$score" -lt "$best_score" ]]; then
+        best_score=$score
+        best_index=$j
+      fi


The fuzzy matching logic compares commit metadata directly. This can lead to false positives if commit messages are duplicated across different commits. Consider incorporating the commit hash itself into the comparison to reduce the likelihood of false positives. This would involve adding the commit hash to the meta_key and using it in the comparison.

Suggested change

-    if [[ "${SRC_COMMIT_META[$j]}" == "$meta_key" ]]; then
-      ((fuzzy_candidates++))
-      diff=$(diff -u <(echo "$rc_patch") <(echo "${SRC_PATCHES[$j]}") || true)
-      score=$(echo "$diff" | grep -vE '^(--- |\+\+\+ )' | grep -c '^[-+]')
-      if [[ "$score" -lt "$best_score" ]]; then
-        best_score=$score
-        best_index=$j
-      fi
+    meta_key="${subject}__${author}__${authordate}__${commit}"

Commit hash is not expected to be the same.

gemini-code-assist · 2025-07-04T15:22:21Z

scripts/fuzzy-match-release-branch.sh

+  fuzzy_candidates=0
+
+  for j in "${!SRC_COMMIT_META[@]}"; do
+    if [[ "${SRC_COMMIT_META[$j]}" == "$meta_key" ]]; then


The script compares the entire SRC_COMMIT_META[$j] with $meta_key. This is prone to errors if there are slight variations in the author or date format. It's better to compare the individual components (subject, author, date) separately to allow for minor discrepancies.

Suggested change

-    if [[ "${SRC_COMMIT_META[$j]}" == "$meta_key" ]]; then
+    rc_subject_base=$(echo "$meta_key" | cut -d'_' -f1)
+    src_subject_base=$(echo "${SRC_COMMIT_META[$j]}" | cut -d'_' -f1)
+    if [[ "$rc_subject_base" == "$src_subject_base" ]]; then
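
A caveat worth illustrating: `cut -d'_' -f1` splits on a single underscore, so a subject that itself contains `_` is truncated before the `__` delimiter is ever reached. The sketch below contrasts it with the `%%__*` parameter expansion used elsewhere in the script; the function names and the sample key are made up for the illustration.

```bash
#!/usr/bin/env bash

# Extract the subject from a "subject__author__date" key in two ways.
subject_via_cut()    { echo "$1" | cut -d'_' -f1; }
subject_via_expand() { echo "${1%%__*}"; }

meta_key='lnwallet: fix fee_rate rounding__Alice <alice@example.com>__2025-07-04 09:44:32 +0000'

subject_via_cut "$meta_key"      # prints: lnwallet: fix fee
subject_via_expand "$meta_key"   # prints: lnwallet: fix fee_rate rounding
```

Comparing the truncated prefixes would treat any two commits whose subjects share the text up to the first underscore as candidates, which is looser than the comment intends.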

gemini-code-assist · 2025-07-04T15:22:21Z

scripts/fuzzy-match-release-branch.sh

+  for j in "${!SRC_COMMIT_META[@]}"; do
+    if [[ "${SRC_COMMIT_META[$j]}" == "$meta_key" ]]; then
+      ((fuzzy_candidates++))
+      diff=$(diff -u <(echo "$rc_patch") <(echo "${SRC_PATCHES[$j]}") || true)


The diff command is used without specifying a character encoding. This can lead to incorrect results if the source and release branches use different encodings. Specify the encoding explicitly using the -a option to treat all files as text and force a line-by-line comparison.

Suggested change

-      diff=$(diff -u <(echo "$rc_patch") <(echo "${SRC_PATCHES[$j]}") || true)
+      diff=$(diff -au <(echo "$rc_patch") <(echo "${SRC_PATCHES[$j]}") || true)

Encodings are expected to be the same.

gemini-code-assist · 2025-07-04T15:22:21Z

scripts/fuzzy-match-release-branch.sh

+    echo ""
+
+    echo "🔧 Check it manually (patch diff):"
+    echo "    git diff $match_commit $rc_commit -- \$(git show --pretty=\"\" --name-only $rc_commit)"


The git diff command includes a || true at the end. This prevents the script from exiting if the diff command fails (e.g., if the files have been deleted or renamed). However, it also masks potential errors that could indicate a problem with the comparison. Consider removing || true and handling the error explicitly, or logging the error message for debugging purposes.

Suggested change

-    echo "    git diff $match_commit $rc_commit -- \$(git show --pretty=\"\" --name-only $rc_commit)"
+    git diff "$match_commit" "$rc_commit" -- $changed_files | sed 's/^/ /'

guggero · 2025-07-07T06:10:59Z

I tried running the script and it aborted after the first non-exact match. According to ChatGPT it's the combination of set -euo pipefail and the continue in the loop, causing some variables to be seen as un-declared.
Commenting out set -euo pipefail fixed it for me, but potentially just removing u might also help.

bhandras · 2025-07-07T07:20:40Z

I tried running the script and it aborted after the first non-exact match. According to ChatGPT it's the combination of set -euo pipefail and the continue in the loop, causing some variables to be seen as un-declared. Commenting out set -euo pipefail fixed it for me, but potentially just removing u might also help.

Thanks for the feedback! I think on macOS (where I developed and tested), bash's behavior is more forgiving and will continue just fine. Will test it on Linux too to make sure it's consistent.
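
One plausible cause, stated as an assumption since the thread does not confirm it: under `set -e`, an arithmetic command such as `((fuzzy_candidates++))` returns status 1 when the expression evaluates to 0, which is exactly what happens on the first increment from 0. Newer bash versions, as typically found on Linux, exit at that point, while the older bash 3.2 shipped with macOS reportedly does not. A minimal sketch of an increment that is safe either way (`count_candidates` is a hypothetical function that runs in a subshell so the strict options do not leak into the caller):

```bash
#!/usr/bin/env bash

# Count arguments with strict mode enabled.
count_candidates() (
  set -euo pipefail
  fuzzy_candidates=0
  for _ in "$@"; do
    # ((fuzzy_candidates++)) would return status 1 on the first pass
    # (the expression evaluates to 0) and can abort under `set -e`;
    # a plain assignment always succeeds.
    fuzzy_candidates=$((fuzzy_candidates + 1))
  done
  echo "$fuzzy_candidates"
)

count_candidates a b c
```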

gemini-code-assist bot reviewed Jul 4, 2025

bhandras mentioned this pull request Jul 4, 2025

release: create v0.19.2-rc1 branch #9986

Merged

gemini-code-assist bot reviewed Jul 4, 2025

guggero reviewed Jul 4, 2025

bhandras force-pushed the fuzzy-matching-script branch from d66524a to dc19321 Compare July 4, 2025 13:47

bhandras requested a review from guggero July 4, 2025 13:47

bhandras force-pushed the fuzzy-matching-script branch from dc19321 to 29738cf Compare July 4, 2025 14:09

bhandras force-pushed the fuzzy-matching-script branch from 29738cf to 31dafd3 Compare July 4, 2025 14:39

gemini-code-assist bot reviewed Jul 4, 2025
