Here is a situation that is probably not common, and I would like some thoughts on how best to proceed. I am trying to merge two projects that build on a common code base, up to a point. That would be straightforward if the two repositories actually shared their commit history. What do I mean by that? Well …
Originally the project was a Mercurial repository, but at some point Bitbucket decided to drop support for Mercurial. That was when the original project switched to GitHub. I had already made the switch earlier for the fork I have been maintaining. So now there are two versions of the same library that have exactly the same commit history up until the fork happened. Nevertheless, each pair of corresponding commits has different hashes, so git thinks there is no shared history. The original library has not had any major updates since the fork, just small fixes. My fork has had some major changes. So what to do? The original project lead recently contacted me about possibly taking over the repository. The reason my fork never upstreamed anything was that, from what I could tell, the other side just did not have the time. Fair enough, I guess; in open source, volunteers are not obligated to do anything. If I am going to take over the repository, I would like to merge the projects again.
Any thoughts on how best to do this? I am thinking of renaming master to master-old and using the history of my fork, which has had the major updates applied, as the new master. Afterwards I would cherry-pick the patches from master-old that are missing on the fork. But that would probably cause problems for people the next time they do a git pull. Is there a better way to deal with this situation?
CodePudding user response:
Use git replace to replace the newest commit(s) in your converted history that point to the same tree as one in the older conversion with the corresponding commits in the older conversion, then run git filter-branch to bake in the updated ancestry.
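The two steps above can be sketched on a throwaway repository. This is only an illustration under stated assumptions: the branch names "upstream" and "fork", the author names, and the file contents are all made up, and the two conversions are simulated as two unrelated branches in one repository rather than two separate remotes.

```shell
#!/bin/sh
set -eu
# Skip filter-branch's deprecation notice and its start-up pause.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway repository; every name below is hypothetical.
cd "$(mktemp -d)"
git init -q .
git config user.name demo
git config user.email demo@example.invalid

# commit_as AUTHOR MESSAGE: commit everything under the given author name,
# so identical snapshots still end up with different commit hashes.
commit_as() {
    git add -A
    git commit -q --author="$1 <$1@example.invalid>" -m "$2"
}

# "upstream": the older conversion.
echo one > file.txt;  commit_as old-conversion "first"
echo two >> file.txt; commit_as old-conversion "second"
git branch upstream

# "fork": a separate conversion of the same snapshots, plus one extra commit.
git checkout -q --orphan fork
git rm -rqf .
echo one > file.txt;    commit_as new-conversion "first"
echo two >> file.txt;   commit_as new-conversion "second"
echo three >> file.txt; commit_as new-conversion "fork-only change"

fork_shared=$(git rev-parse fork~1)        # newest fork commit with a matching tree
upstream_shared=$(git rev-parse upstream)  # the same snapshot in the older conversion

# Step 1: make git read the fork's commit as if it were the upstream one.
git replace "$fork_shared" "$upstream_shared"

# Step 2: rewrite the fork so the new ancestry is stored in real commits.
git filter-branch -- fork

# The replacement ref is no longer needed once the history is rewritten.
git replace -d "$fork_shared"

# A merge base now exists where there was none before.
git merge-base fork upstream
```

git filter-branch keeps a backup of the rewritten branch under refs/original/, so the old tip can still be inspected if the result looks wrong. On a real repository, everyone who cloned the rewritten branch would still have to re-fetch it.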
CodePudding user response:
If you just want to keep the history of one branch (e.g. everything that's happened on "master" of each repository), you can do this as a rebase. Note that git rebase drops merge commits by default, so this will not preserve the history as well as something like filter-branch would.
The manual page for git rebase talks about "the hard case", where the "changes do not exactly correspond to the ones before the rebase". Since the --onto form of rebase used below specifies the exact sequence of changes to apply, it doesn't matter where the common ancestor is - or even, as in your case, if there is no common ancestor at all.
Let's call one repository "alice" and the other "bob". Assuming you have "alice" checked out, you would start by adding the "bob" repository as a remote. To make things clear, we'll make local branches called "alice-master" and "bob-master":
git remote add bob [email protected]:bob/example
git fetch bob
git branch alice-master master
git branch bob-master bob/master
You now have a single repository with two unconnected histories. Your next job is to find the last revision that both forks have in common, and note down its commit hash in each history. Set up tags at these commits to refer to them more clearly:
git tag alice-last-shared abc123de
git tag bob-last-shared 987fed76
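If the last shared revision is not obvious from the log, one way to look for it is to compare tree hashes, since the two conversions record the same snapshots under different commit hashes. The helper below is a minimal sketch, not part of git: last_shared_by_tree is a made-up name, and the demo repository with its two unrelated branches is invented for illustration. A matching tree only shows that the two snapshots are identical (the same tree can recur, for example after a revert), so the result is worth checking by eye.

```shell
#!/bin/sh
set -eu

# last_shared_by_tree A B: walk B from newest to oldest and print
# "<commit on A> <commit on B>" for the first commit on B whose tree
# hash also appears on A.
last_shared_by_tree() {
    a_list=$(git log --format='%T %H' "$1")
    git log --format='%T %H' "$2" | while read -r tree b_commit; do
        a_commit=$(printf '%s\n' "$a_list" |
            awk -v t="$tree" '$1 == t && !seen { print $2; seen = 1 }')
        if [ -n "$a_commit" ]; then
            echo "$a_commit $b_commit"
            break
        fi
    done
}

# Throwaway repository with two unrelated histories (hypothetical content).
cd "$(mktemp -d)"
git init -q .
git config user.name demo
git config user.email demo@example.invalid

# commit_as AUTHOR MESSAGE: different authors give different commit hashes
# for otherwise identical snapshots.
commit_as() {
    git add -A
    git commit -q --author="$1 <$1@example.invalid>" -m "$2"
}

echo one > file.txt;    commit_as alice "first"
echo two >> file.txt;   commit_as alice "second"
echo alice > alice.txt; commit_as alice "alice only"
git branch alice-master

git checkout -q --orphan bob-master
git rm -rqf .
echo one > file.txt;  commit_as bob "first"
echo two >> file.txt; commit_as bob "second"
echo bob > bob.txt;   commit_as bob "bob only"

# Tag the matching pair so later commands can refer to it by name.
set -- $(last_shared_by_tree alice-master bob-master)
git tag alice-last-shared "$1"
git tag bob-last-shared "$2"
git log --oneline --no-walk alice-last-shared bob-last-shared
```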
Now, you can tell git to take the entire history after "bob-last-shared", up to "bob-master", and recreate it on top of "alice-last-shared":
git rebase --onto alice-last-shared bob-last-shared bob-master
Since the files in "alice-last-shared" are identical to the ones in "bob-last-shared", this rebase should apply cleanly. You should now have a "bob-master" and an "alice-master" which share a common ancestor, as though they had always been branches of the same repository. You can now proceed with a normal merge:
git switch alice-master
git merge bob-master
At this point, you're likely to get some conflicts. There's not much to do about those except work through them one by one.
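To try the whole sequence safely before touching the real repositories, it can be rehearsed on a throwaway repository. The sketch below assumes made-up file names and authors, simulates the two forks as unrelated local branches instead of remotes, and uses changes that do not overlap, so unlike a real merge it produces no conflicts.

```shell
#!/bin/sh
set -eu

# Throwaway repository; all content and author names are hypothetical.
cd "$(mktemp -d)"
git init -q .
git config user.name demo
git config user.email demo@example.invalid

# commit_as AUTHOR MESSAGE: different authors give different commit hashes
# for otherwise identical snapshots.
commit_as() {
    git add -A
    git commit -q --author="$1 <$1@example.invalid>" -m "$2"
}

# Alice's history: two shared snapshots, then her own change.
echo one > file.txt;    commit_as alice "first"
echo two >> file.txt;   commit_as alice "second"
git tag alice-last-shared
echo alice > alice.txt; commit_as alice "alice only"
git branch alice-master

# Bob's history: the same two snapshots under different hashes, then his change.
git checkout -q --orphan bob-master
git rm -rqf .
echo one > file.txt;  commit_as bob "first"
echo two >> file.txt; commit_as bob "second"
git tag bob-last-shared
echo bob > bob.txt;   commit_as bob "bob only"

# Before the rebase the two branches have no common ancestor at all.
if git merge-base alice-master bob-master >/dev/null; then
    echo "unexpected: histories are already connected" >&2
    exit 1
fi

# Replay everything after bob-last-shared on top of alice-last-shared.
git rebase -q --onto alice-last-shared bob-last-shared bob-master

# The branches now share alice-last-shared as their merge base.
git merge-base alice-master bob-master

# An ordinary merge works from here on.
git switch -q alice-master
git merge -q -m "Merge bob-master into alice-master" bob-master
```

If the history being replayed contains merge commits that should survive, git rebase also accepts --rebase-merges, which recreates them instead of flattening them.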