How to modify deeper git history and keep recent commit structures

Time:06-08

I am attempting to clean up the git history in a repo I'm working on. Given a git history that looks like this:

        . -- .
       /       \
... - S   ...   T - ... H
       \       /
        . -- . 

That is:

  • There is some arbitrary DAG behind point S.
  • There is some arbitrary DAG after point T.
  • There is some arbitrary DAG between point S and T.
  • The graph before S is disjoint from the graph after T (i.e. removing T or S would disconnect the predecessors of the removed node from H).

I would like to rewrite the history between point S and T (e.g. squash or linearize), such that there is some new git history that ends with point T'.


... - S -- ... -- T'

The critical constraint is that the contents of the repo at point T and T' are exactly the same, even though the git commit is different and the way we got from S to T' might have changed.

This much I can do. What I would like to do after this (and I haven't had any luck so far) is to transplant the exact structure of the DAG between T and H, inclusive, to get:


... - S -- ... -- T' - ... H'

Of course the commit hashes will change, but what's important is that the graph structure, authors, and other metadata between T' and H' stay the same.

I would have thought I could do this with a cherry-pick:

    git cherry-pick T^..H

but this seems to result in merge conflicts. I was looking for answers in an existing SO post.

As an example I want to squash the commits between Point1 and Point2, and then apply the rest of the history after Point2.

I can do the squash like this:

    # Squash all information between point1 and point2
    git checkout Point1
    git reset --hard Point2
    git reset --soft Point1^
    git commit -am "all changes between point1 and point2"
    git tag "Point2_prime"

which gives us a single squashed commit on top of Point1^, tagged Point2_prime, with the same contents as Point2.
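The "same contents" constraint can be checked by comparing tree hashes rather than commit hashes. Below is a minimal sketch, assuming a throwaway repository created with mktemp; the file names and the mkcommit helper are made up for the demo and are not part of the real repo:

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: one commit that adds one small file.
mkcommit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mkcommit base
mkcommit p1;  git tag Point1
mkcommit mid
mkcommit p2;  git tag Point2

# Squash Point1..Point2 (inclusive), following the steps above.
git checkout -q Point2
git reset -q --soft 'Point1^'
git commit -q -m "all changes between point1 and point2"
git tag Point2_prime

# Two commits have identical contents exactly when their trees are equal.
tree_old=$(git rev-parse 'Point2^{tree}')
tree_new=$(git rev-parse 'Point2_prime^{tree}')
if [ "$tree_old" = "$tree_new" ]; then
    echo "trees match"
else
    echo "trees differ"
fi
```

If the trees ever differ, replaying later commits on top of the squashed commit is no longer guaranteed to be conflict-free.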

But I can't figure out how to get the rest of the history on top of it. This is what I've tried so far:

    # The state is now guaranteed to be the same as Point2, but the history has
    # been modified to our liking. Now we need to replay all the other commits
    # on top of this.

    # Based on answers in this SO post:
    # https://stackoverflow.com/questions/1994463/how-to-cherry-pick-a-range-of-commits-and-merge-them-into-another-branch

    COMMIT_A=$(git rev-list -n 1 Point2)
    COMMIT_B=$(git rev-list -n 1 main)
    echo "COMMIT_A = $COMMIT_A"
    echo "COMMIT_B = $COMMIT_B"

    # I've tried the following, but they do not seem to work.

    # Try with cherry pick
    git cherry-pick "${COMMIT_A}..${COMMIT_B}" 

    # Try with rebase onto
    git rebase "$COMMIT_A" "$COMMIT_B"~0 --onto HEAD

I would think because the state of the new commit is exactly the same as the state at Point2, there would be a way to do this non-interactively without merge errors. Is this possible?

CodePudding user response:

The simplest way is to use the git replace + git filter-repo trick, documented in the git-filter-repo manual:

Parent rewriting

To replace $commit_A with $commit_B (e.g. make all commits which had $commit_A as a parent instead have $commit_B for that parent), and rewrite history to make it permanent:

    git replace $commit_A $commit_B
    git filter-repo --force

In your case:

    git replace Point2 <sha of "all changes between point1 and point2">
    git filter-repo --force
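Before making anything permanent, the effect of the replacement can be inspected, since git replace only adds a ref under refs/replace/ and can be undone with git replace -d. A rough sketch in a throwaway repository (the history, the file names and the mkcommit helper are invented for illustration; the filter-repo step is left out because it may not be installed):

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: one commit that adds one small file.
mkcommit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# base - p1 - p2 - mainline - merge - head, with a side branch off p2.
mkcommit base
mkcommit p1;  git tag Point1
mkcommit p2;  git tag Point2
git checkout -q -b side;  mkcommit side
git checkout -q main;     mkcommit mainline
git merge -q --no-ff -m "merge side" side
mkcommit head

# Squashed commit, built as in the question.
git checkout -q Point2
git reset -q --soft 'Point1^'
git commit -q -m "all changes between point1 and point2"
git tag Point2_prime
git checkout -q main

# Tell git to read Point2_prime wherever Point2 is referenced.
git replace Point2 Point2_prime

with_replace=$(git rev-list --count main)
without_replace=$(git --no-replace-objects rev-list --count main)
echo "commits seen with replacement:    $with_replace"
echo "commits seen without replacement: $without_replace"
```

In this toy history the replaced view is one commit shorter, because p1 and p2 collapse into the squashed commit while the merge after Point2 is still there.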

If you don't have git-filter-repo installed, the older git filter-branch command also "persists" replacement objects:

    # run a phony filter-branch command, you just want to have the
    # "rewrite replaced commits" effect:
    git filter-branch --tag-name-filter cat main

    # you can instruct filter-branch to ignore commits before Point1:
    git filter-branch --tag-name-filter cat ^Point1 main

    # to have git filter-branch try to rewrite all branches:
    git filter-branch --tag-name-filter cat -f ^Point1 --branches
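To see the whole replace-then-rewrite flow end to end, here is a sketch on a scratch repository. The history, file names and the mkcommit helper are assumptions made up for the demo; FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice that recent git versions print.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: one commit that adds one small file.
mkcommit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mkcommit base
mkcommit p1;  git tag Point1
mkcommit p2;  git tag Point2
git checkout -q -b side;  mkcommit side
git checkout -q main;     mkcommit mainline
git merge -q --no-ff -m "merge side" side
mkcommit head

# Squashed commit, built as in the question.
git checkout -q Point2
git reset -q --soft 'Point1^'
git commit -q -m "all changes between point1 and point2"
git tag Point2_prime
git checkout -q main

# Remember the original id: the tag rewrite below moves the Point2 tag.
old_point2=$(git rev-parse Point2)
tree_before=$(git rev-parse 'main^{tree}')

git replace Point2 Point2_prime
FILTER_BRANCH_SQUELCH_WARNING=1 \
    git filter-branch -f --tag-name-filter cat -- main >/dev/null 2>&1

# The rewritten commits no longer need the replacement ref.
git replace -d "$old_point2" >/dev/null

commits=$(git rev-list --count main)
merges=$(git rev-list --count --merges main)
tree_after=$(git rev-parse 'main^{tree}')
echo "commits: $commits, merges: $merges"
[ "$tree_before" = "$tree_after" ] && echo "tip contents unchanged"
```

The merge after Point2 survives the rewrite and the contents at the tip are untouched; only the commits before Point2 are collapsed.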

CodePudding user response:

You could use the rebase option --rebase-merges. It will (attempt to) preserve the graph by recreating the merge commits. Note the "attempt": conflicts that were resolved in the original merges cannot be re-resolved automatically, as stated in the documentation:

Any resolved merge conflicts or manual amendments in these merge commits will have to be resolved/re-applied manually.

So, once you've created the squashed T', you can simply run this command:

    git rebase T H --onto T' --rebase-merges

If you didn't have merge conflicts in the original structure, this should work without issues. However, if you have more than just a few merge conflicts to resolve, then you'll probably be far better off using git-filter-repo as described in LeGEC's answer.
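As a quick check of the rebase approach, the following sketch replays a small history containing one merge onto a squashed commit. Everything in it (the scratch repository, the file names, the mkcommit helper) is assumed for the demo; GIT_EDITOR=true is only there so that no editor can pop up.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: one commit that adds one small file.
mkcommit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mkcommit base
mkcommit p1;  git tag Point1
mkcommit p2;  git tag Point2
git checkout -q -b side;  mkcommit side
git checkout -q main;     mkcommit mainline
git merge -q --no-ff -m "merge side" side
mkcommit head

# Squashed commit (T' in the notation above), built as in the question.
git checkout -q Point2
git reset -q --soft 'Point1^'
git commit -q -m "all changes between point1 and point2"
git tag Point2_prime
git checkout -q main

tree_before=$(git rev-parse 'main^{tree}')

# Replay everything after Point2 onto Point2_prime, recreating merges.
GIT_EDITOR=true git rebase -q --rebase-merges --onto Point2_prime Point2 main

commits=$(git rev-list --count main)
merges=$(git rev-list --count --merges main)
tree_after=$(git rev-parse 'main^{tree}')
echo "commits: $commits, merges: $merges"
[ "$tree_before" = "$tree_after" ] && echo "tip contents unchanged"
```

Note that the side branch itself is not moved by this command; only main is rewritten, and the merge is recreated from a fresh copy of the side commit.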
