Find files that would be changed by git cherry

Time:11-25

According to its man page, git cherry does some testing to determine if a commit should be cherry picked into another branch:

The equivalence test is based on the diff, after removing whitespace
and line numbers. git-cherry therefore detects when commits have been
"copied" by means of git-cherry-pick(1), git-am(1) or git-rebase(1).

I want to create a script that further minimizes the list of cherry-pick candidates by removing all commits that would change only a certain file.

E.g. if cherry picking commit 1 with changed files A, B, C into my branch would change only file A while B and C would already contain the changes, I want the script to remove the commit from the list of candidates.

Is there an easy way to get this information out of Git?

CodePudding user response:

further minimize the list of cherry-pick candidates by removing all commits that would change only a certain file

You can use git diff-tree -p $commit | git apply --exclude=path/to/that/file --numstat. If that lists any changes, the commit has changes in other files. But it's not clear what "would change" means here. Does it mean "would change" if you cherry-picked the commit again, regardless of whether its changes outside that file have already been applied?
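A minimal sketch of that check as a shell function. The name touches_other_files and its argument order are made up for illustration, and it assumes a non-merge, non-root commit, since git diff-tree -p prints no patch for those without extra options.

```sh
# Hypothetical helper (not part of Git): succeeds if COMMIT touches any
# path other than EXCLUDED, fails if EXCLUDED is the only path it touches.
touches_other_files() {
    commit=$1
    excluded=$2
    # --numstat only prints per-file statistics; nothing is applied.
    # Files matching --exclude are skipped, so empty output means the
    # commit changes nothing else.
    stats=$(git diff-tree -p "$commit" |
        git apply --exclude="$excluded" --numstat) || return 2
    [ -n "$stats" ]
}
```

Called as touches_other_files $commit path/to/that/file, it only inspects the patch itself. It says nothing about whether those changes are already present in your branch.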

The only way to find that out is to do a test run of the actual apply. You can automate that check, but you're leaving a lot of questions open here. Run git diff-tree -p $commit | git apply --exclude=path/to/that/file -3, then git diff --name-only HEAD to see if there are any changes pending (compare against HEAD, because -3 also updates the index), then git reset --hard before doing or not doing the whole cherry-pick.
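The test run could be wrapped up roughly like this. It is a sketch, not a finished tool: would_change is a hypothetical name, and it assumes a clean working tree, because git reset --hard throws away any uncommitted work. It compares against HEAD since git apply -3 also writes to the index.

```sh
# Hypothetical helper (not part of Git): prints the paths that applying
# COMMIT would change, ignoring EXCLUDED, then restores the tree.
# ASSUMPTION: the working tree and index are clean before the call.
would_change() {
    commit=$1
    excluded=$2
    # Trial apply with 3-way fallback; conflicts are left in the tree
    # and reported as changed paths by the diff below.
    git diff-tree -p "$commit" |
        git apply --exclude="$excluded" -3 >/dev/null 2>&1
    # Compare worktree and index against HEAD to list pending changes.
    git diff --name-only HEAD
    # Undo the trial apply.
    git reset -q --hard
}
```

If would_change $commit path/to/that/file prints nothing, everything the commit does outside that file is already in your branch (or could not be applied at all, which this sketch does not distinguish).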

But cherry-picking a commit could make changes to the current upstream tip regardless of whether it's already been cherry-picked, if subsequent work reverted it or amended it. So if you don't care whether you're re-applying subsequently reverted changes, why are you starting from the git cherry list at all? Something isn't making sense here.

CodePudding user response:

(depending on how you interpret "easy" ...)

Git has a patch-id command that builds a hash from a diff, with some rules (ignoring line numbers and whitespace) intended to give it this property: two patches that introduce the same changes will have the same patch-id.

You can use this in the following way :

for each commit :

  1. choose a way to generate the patch it produces on files other than that/file

  2. run that patch through | git patch-id --stable

  3. once you have computed all patch-ids in your range of commits, you can compare them to prune commits from your list
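Steps 1 and 2 could look roughly like this. It is only one way to generate the patch: patch_id_without is a made-up name, the sketch uses git diff with an exclude pathspec, and it assumes you run it from the top of the repository on a commit that has a single parent.

```sh
# Hypothetical helper (not part of Git): prints the stable patch-id of
# the changes COMMIT makes to every file except EXCLUDED.
# ASSUMPTION: run from the repository's top-level directory.
patch_id_without() {
    commit=$1
    excluded=$2
    # Step 1: diff against the first parent, leaving out the excluded path.
    # Step 2: hash the patch; patch-id prints "<patch-id> <commit-id>",
    # so keep only the first field. An empty diff yields no output.
    git diff "${commit}~1" "$commit" -- . ":(exclude)$excluded" |
        git patch-id --stable |
        cut -d' ' -f1
}
```

For step 3, collect the ids for the commits on both sides and drop every candidate whose id already shows up on the other side. A commit that prints nothing changes only the excluded file.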
