According to its man page, git cherry does some testing to determine whether a commit should be cherry-picked into another branch:
The equivalence test is based on the diff, after removing whitespace
and line numbers. git-cherry therefore detects when commits have been
"copied" by means of git-cherry-pick(1), git-am(1) or git-rebase(1).
I want to create a script that further minimizes the list of cherry-pick candidates by removing all commits that would change only a certain file.
E.g. if cherry picking commit 1 with changed files A, B, C into my branch would change only file A while B and C would already contain the changes, I want the script to remove the commit from the list of candidates.
Is there an easy way to get this information out of Git?
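For reference, the candidate list mentioned above can be produced roughly like this. It is only a sketch: cherry_candidates is a made-up helper name, and the two arguments stand for the branch you want to pick into and the branch you want to pick from. git cherry marks commits that have no equivalent in the first branch with a leading "+".

```shell
# Hypothetical helper: print the commits of $2 that have no
# patch-equivalent commit in $1 (the lines "git cherry" marks with "+").
cherry_candidates() {
    git cherry "$1" "$2" | sed -n 's/^+ //p'
}
```

Called as cherry_candidates mybranch otherbranch, it prints one commit ID per line, oldest first.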
CodePudding user response:
further minimize the list of cherry-pick candidates by removing all commits that would change only a certain file
You can use git diff-tree -p $commit | git apply --exclude=path/to/that/file --numstat: if that lists any changes, the commit has changes in other files. But it's not clear what "would change" means here. "Would change", if you cherry-picked it again regardless of whether its changes outside that file have already been applied?
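Wrapped into a function, that check could look like the sketch below. The name touches_other_files and its two arguments are my own invention. It only inspects the commit's own diff (--numstat does not apply anything), so it says nothing about what is already present in your branch.

```shell
# Hypothetical helper: succeed if commit $1 changes any file other than $2.
# --numstat only prints statistics; nothing in the work tree is modified.
touches_other_files() {
    commit=$1
    excluded=$2
    stats=$(git diff-tree -p "$commit" |
        git apply --numstat --exclude="$excluded" 2>/dev/null)
    # Empty output means every changed path was excluded.
    [ -n "$stats" ]
}
```

For example, touches_other_files "$commit" path/to/that/file || echo "only that file".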
The only way to find that out is a test run of the actual apply. You can automate that check, but you're leaving a lot of questions open here. Run git diff-tree -p $commit | git apply --exclude=path/to/that/file -3, then git diff --name-only HEAD to see if there are any changes pending (-3 implies --index, so a cleanly applied result is staged and a plain git diff would show nothing), then git reset --hard before doing or not doing the whole cherry-pick.
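A rough automation of that trial run, assuming a clean work tree and index (git reset --hard throws away anything uncommitted). The function name would_change_other_files is hypothetical, and --full-index is my addition so the 3-way fallback can identify the blobs unambiguously.

```shell
# Hypothetical helper: succeed if applying commit $1 to the current HEAD,
# ignoring file $2, would leave any change (or conflict) behind.
# ASSUMPTION: work tree and index are clean; "git reset --hard" discards
# any uncommitted work.
would_change_other_files() {
    commit=$1
    excluded=$2
    git diff-tree -p --full-index "$commit" |
        git apply --3way --exclude="$excluded" >/dev/null 2>&1
    # --3way implies --index, so compare against HEAD, not the index.
    changed=$(git diff --name-only HEAD)
    git reset --hard --quiet
    [ -n "$changed" ]
}
```

A commit for which this returns false changes nothing outside the excluded file on the current branch, so it could be dropped from the candidate list.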
But cherry-picking a commit could make changes to the current upstream tip regardless of whether it's already been cherry-picked, if subsequent work reverted it or amended it. So if you don't care whether you're re-applying subsequently reverted changes, why are you starting from the git cherry list at all? Something isn't making sense here.
CodePudding user response:
(depending on how you interpret "easy" ...)
git has a patch-id command that builds a hash from a diff, with some rules (ignoring line numbers and whitespace) intended to give it this property: two patches that introduce the same changes will have the same patch-id.
You can use this in the following way:
- for each commit:
  - choose a way to generate the patch it produces on files other than that/file
  - run that patch through | git patch-id --stable
- once you have computed all patch-ids in your range of commits, you can compare them to prune commits from your list
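One possible way to put these steps together, as a sketch only. The function names and arguments are made up, the ":(exclude)" pathspec is just one way to leave the file out of the diff, and the script assumes it runs from the top of the repository. A candidate is only matched when a single commit on your side produces the same filtered diff.

```shell
# Hypothetical helper: patch-id of commit $1 with file $2 left out.
# Prints nothing if the commit touches only that file.
patch_id_excluding() {
    git diff-tree -p "$1" -- . ":(exclude)$2" |
        git patch-id --stable | cut -d' ' -f1
}

# Hypothetical helper: list commits in $1..$2 whose filtered patch-id
# does not occur among the commits in $2..$1. $3 is the file to ignore.
remaining_candidates() {
    mine=$1
    theirs=$2
    excluded=$3
    seen=$(for c in $(git rev-list "$theirs..$mine"); do
        patch_id_excluding "$c" "$excluded"
    done)
    for c in $(git rev-list --reverse "$mine..$theirs"); do
        id=$(patch_id_excluding "$c" "$excluded")
        # Skip commits that change only the excluded file.
        [ -z "$id" ] && continue
        printf '%s\n' "$seen" | grep -qxF "$id" || printf '%s\n' "$c"
    done
}
```

remaining_candidates mybranch otherbranch that/file then prints the commits that are still worth cherry-picking, oldest first.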