I am attempting to clean up the git history in a repo I'm working on. Given a git history that looks like this:
          . -- .
         /      \
... - S    ...    T - ... H
         \      /
          . -- .
That is:
- There is some arbitrary DAG behind point S.
- There is some arbitrary DAG after point T.
- There is some arbitrary DAG between point S and T.
- The graph before S is disjoint from the graph after T (i.e. removing T or S will disconnect the removed node's predecessors from H).
I would like to rewrite the history between point S and T (e.g. squash or linearize), such that there is some new git history that ends with point T'.
... - S -- ... -- T'
The critical constraint is that the contents of the repo at point T and T' are exactly the same, even though the git commit is different and the way we got from S to T' might have changed.
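As a minimal sketch of that constraint, the script below builds a throwaway repository, creates a squashed T' directly from T's tree with git commit-tree, and then checks that the two commits differ while their trees match. The tag names S, T and T_prime, the file name, and the demo identity are all hypothetical stand-ins, not names from the real repo.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the sketch never touches real history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# S, then two commits leading up to T (hypothetical content).
echo base > file.txt
git add file.txt
git commit -q -m "S"
git tag S
echo one >> file.txt
git commit -q -am "step 1"
echo two >> file.txt
git commit -q -am "step 2 (T)"
git tag T

# T' reuses T's tree object but has S as its only parent.
t_prime=$(git commit-tree -p S -m "squash of S..T" "T^{tree}")
git tag T_prime "$t_prime"

commit_T=$(git rev-parse T)
commit_T_prime=$(git rev-parse T_prime)
tree_T=$(git rev-parse "T^{tree}")
tree_T_prime=$(git rev-parse "T_prime^{tree}")

echo "commits: $commit_T vs $commit_T_prime"
echo "trees:   $tree_T vs $tree_T_prime"

# Exits non-zero if the two snapshots differ in any way.
git diff --quiet T T_prime
```

Comparing the tree IDs (or running git diff --quiet) is a cheap way to confirm that a rewritten range ends in exactly the same content before replaying anything on top of it.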
This much I can do. What I would like to do after this (and I haven't had luck doing so yet) is to transplant the exact structure of the DAG inclusively between T and H to get:
... - S -- ... -- T' - ... H'
Of course the commit hashes will change, but what's important is that the graph structure, authors, and other metadata between T' and H' are the same.
I would have thought I could do this with a cherry-pick:
git cherry-pick T^..H
but this seems to result in merge conflicts. I was looking for answers in this SO post:
As an example I want to squash the commits between Point1 and Point2, and then apply the rest of the history after Point2.
I can do the squash like this:
# Squash all information between point1 and point2
git checkout Point1
git reset --hard Point2
git reset --soft Point1^
git commit -am "all changes between point1 and point2"
git tag "Point2_prime"
which gives us this:
But I can't figure out how to get the rest of the history on top of it. This is what I've tried so far:
# The state is now guaranteed to be the same as Point2, but the history has
# been modified to our liking. Now we need to replay all the other commits
# on top of this.
# Based on answers in this SO post:
# https://stackoverflow.com/questions/1994463/how-to-cherry-pick-a-range-of-commits-and-merge-them-into-another-branch
COMMIT_A=$(git rev-list -n 1 Point2)
COMMIT_B=$(git rev-list -n 1 main)
echo "COMMIT_A = $COMMIT_A"
echo "COMMIT_B = $COMMIT_B"
# I've tried the following, but they do not seem to work.
# Try with cherry pick
git cherry-pick "${COMMIT_A}..${COMMIT_B}"
# Try with rebase onto
git rebase "$COMMIT_A" "$COMMIT_B"~0 --onto HEAD
I would think because the state of the new commit is exactly the same as the state at Point2, there would be a way to do this non-interactively without merge errors. Is this possible?
CodePudding user response:
The simplest way is to use the git replace + git filter-repo trick, documented here:
Parent rewriting
To replace $commit_A with $commit_B (e.g. make all commits which had $commit_A as a parent instead have $commit_B for that parent), and rewrite history to make it permanent:
git replace $commit_A $commit_B
git filter-repo --force
In your case:
git replace Point2 <sha of "all changes between point1 and point2">
git filter-repo --force
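To see what the replacement does before anything is made permanent, here is a small sketch in a throwaway repository. It only runs the git replace step and counts the commits reachable from main with and without replacement refs honored; the persisting filter step is left out. The helper mk, the file names, and the demo identity are hypothetical; the tag names Point1 and Point2 mirror the question.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one commit per call, each adding its own file.
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk base
mk point1; git tag Point1
mk middle
mk point2; git tag Point2
mk later1
mk later2

# Squashed stand-in for Point1..Point2: Point2's tree on top of Point1's parent.
squashed=$(git commit-tree -p "Point1^" \
    -m "all changes between point1 and point2" "Point2^{tree}")

before=$(git rev-list --count main)

# From now on, whenever git reads Point2 it sees the squashed commit instead.
git replace Point2 "$squashed"

after=$(git rev-list --count main)
raw=$(git --no-replace-objects rev-list --count main)

echo "before=$before after=$after raw=$raw"
```

The count seen through the replacement drops because Point2 now appears to have Point1's parent as its parent, while --no-replace-objects still shows the untouched history. Running git filter-repo (or filter-branch) afterwards is what bakes that view into real commits.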
If you don't have git-filter-repo installed, the older git filter-branch command also "persists" replacement objects:
# run a phony filter-branch command, you just want to have the
# "rewrite replaced commits" effect:
git filter-branch --tag-name-filter cat main
# you can instruct filter-branch to ignore commits before Point1:
git filter-branch --tag-name-filter cat ^Point1 main
# to have git filter-branch try to rewrite all branches:
git filter-branch --tag-name-filter cat -f ^Point1 --branches
CodePudding user response:
You could use the option to the rebase command called --rebase-merges. This will (attempt to) preserve the graph by recreating the merge commits. Note the "attempt" cannot automatically resolve conflicts, as stated in the documentation:
Any resolved merge conflicts or manual amendments in these merge commits will have to be resolved/re-applied manually.
So, once you've created the squashed T', you can simply run this command:
git rebase T H --onto T' --rebase-merges
If you didn't have merge conflicts in the original structure, this should work without issues. However, if you have more than just a few merge conflicts to resolve, then you'll probably be far better off using git-filter-repo as described in LeGEC's answer.
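For a conflict-free case, the rebase approach can be sketched end to end in a throwaway repository. The script below builds S, T, and a small DAG with one merge between T and H, creates a squashed T_prime, and replays T..main onto it with --rebase-merges. The helper mk, the branch name topic, the tag names, and the demo identity are hypothetical.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one commit per call, each adding its own file.
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk S; git tag S
mk a
mk T; git tag T

# History after T: a side branch that is merged back, then H.
git checkout -q -b topic
mk topic1
git checkout -q main
mk main1
git merge -q --no-ff -m "merge topic" topic
mk H; git tag H

# Squashed T': same tree as T, parent S.
git tag T_prime "$(git commit-tree -p S -m "squash of S..T" "T^{tree}")"

# Replay everything after T onto T', recreating the merge commit.
git rebase -q --rebase-merges --onto T_prime T main

tree_H=$(git rev-parse "H^{tree}")
tree_H_prime=$(git rev-parse "main^{tree}")
merges=$(git rev-list --merges --count T_prime..main)

echo "tree of H:  $tree_H"
echo "tree of H': $tree_H_prime"
echo "merge commits after T': $merges"
git log --graph --oneline main
```

Because T_prime has exactly T's tree, every replayed commit applies cleanly here. The old tags T and H still point at the original commits, so they can be used to compare the result against the original history.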