Merging changes from an "embedded" upstream project

Time:11-09

I have a git repository that contains several files that come from an upstream source. I have a few local modifications to these files, but they are largely the same as the upstream versions, and I would like to be able to stay in sync with upstream releases. I don't need upstream history, but if there is a new release, I'd like to be able to merge that in while still keeping my own changes. As a result, it's not as simple as just copying the upstream files into my repository, because that will result in my changes being lost, and it's a real pain to manually run vimdiff or something similar to ensure my changes get added.

Right now I've come to a solution that looks like this:

  1. Create an orphan branch that is completely empty (call this upstream)
  2. Add the upstream files to this branch (so they're the only thing in it)
  3. Merge upstream into my main branch, passing --allow-unrelated-histories
  4. Apply my changes to the files and commit
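
The four steps above can be sketched roughly as follows. This is a minimal illustration in a throwaway repository, not a definitive recipe: the file names (app.c, lib.h), the macro names, and the branch name main are all hypothetical stand-ins.

```shell
#!/bin/sh
set -eu

# Throwaway demo repository; every path and file name here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch explicitly
git config user.name  "Demo"
git config user.email "demo@example.com"

# Existing project content on main.
echo "int main(void) { return 0; }" > app.c
git add app.c
git commit -q -m "Project files"

# Step 1: create an orphan branch and empty it (no history, no files).
git checkout -q --orphan upstream
git rm -rf -q .

# Step 2: add the pristine upstream file(s) and commit them.
printf '/* lib v1 */\n#define LIB_VERSION 1\n' > lib.h
git add lib.h
git commit -q -m "Import upstream v1"

# Step 3: merge upstream into main, allowing the unrelated histories.
git checkout -q main
git merge -q --allow-unrelated-histories -m "Merge upstream v1" upstream

# Step 4: apply the local modifications on main and commit.
printf '#define LIB_LOCAL_PATCH 1\n' >> lib.h
git commit -q -am "Local changes to lib.h"
```

After this, main contains both the project files and the patched lib.h, while the upstream branch still holds only the pristine copy.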

Now I should be able to bring new upstream releases into the upstream branch and keep merging them, while keeping my changes intact. It seems to work but feels hacky. Is there a more appropriate solution to this problem?

Edit:

Here's a scenario that mimics what I'm doing: there is a header-only C library available for download somewhere. It's not in a Git repository; it's just a bare file that's periodically updated. I'm using that file, but I have some local changes to it. I want to be able to track changes in future downloads while still keeping my changes to the file (with conflict resolution when necessary). I want the file to be part of my repository, so I don't want downloading and patching to be part of the build process. I'd prefer to use Git to do the merging/conflict resolution.

CodePudding user response:

Perhaps it feels hacky because of having a persistent unrelated branch in your repo. Although it's (in my experience) abnormal, it is a pretty good representation of what you're trying to achieve (having a third-party relation without depending on a separate repo). Given that, I don't see an issue with your proposed solution. Some notes though:

  1. You should only need to use --allow-unrelated-histories the first time, since from then on they will be related.
  2. You shouldn't have to "apply your changes" each time you merge. Instead it would be "resolve conflicts" if there are any. The initial merge may be the most complex, but after that it should be simpler, perhaps even automatic.

The assumption is that every time you wish to update the third-party file(s), you switch to the upstream branch, drop in the new files, commit with a message describing the version and/or date of the files, then switch back to your working branch, merge in upstream, and resolve any conflicts. That's pretty clean, IMHO.
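
As a rough sketch of that update cycle (assuming a single hypothetical file lib.h with made-up macros, and, for brevity, a main branch that was created from upstream rather than merged from an unrelated history), a later upstream release might be brought in like this:

```shell
#!/bin/sh
set -eu

# --- Setup: a throwaway repo that already has an import and a local patch. ---
# All names (lib.h, LIB_*) are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/upstream   # start directly on the import branch
git config user.name  "Demo"
git config user.email "demo@example.com"
printf '#define LIB_VERSION 1\n#define LIB_A 1\n#define LIB_B 2\n#define LIB_C 3\n' > lib.h
git add lib.h
git commit -q -m "Import upstream v1"
git checkout -q -b main
echo '#define LIB_LOCAL_PATCH 1' >> lib.h
git commit -q -am "Local changes"

# --- The update cycle described above. ---
# 1. Switch to the import branch and drop in the new release.
git checkout -q upstream
printf '#define LIB_VERSION 2\n#define LIB_A 1\n#define LIB_B 2\n#define LIB_C 3\n' > lib.h
git commit -q -am "Import upstream v2"

# 2. Switch back and merge; no --allow-unrelated-histories is needed here.
git checkout -q main
git merge -q -m "Merge upstream v2" upstream

# The result carries both the upstream change and the local patch.
cat lib.h
```

In this sketch the two edits touch different parts of the file, so the merge completes without conflicts. If they overlapped, git merge would stop and ask for a manual resolution instead.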

Optional tweak: in larger repos, switching to an empty branch can be time-consuming, because Git has to delete every file in your working tree and then write them all back when you return to your working branch. An alternative to an empty orphan branch in the same repo is to keep that branch in a separate local mini repo that mimics the directory structure of the files in question. That repo needs only a single branch, say "main", which plays the role of the upstream branch in your proposed solution. In your main repo you can then set up a secondary remote (perhaps called "upstream") pointing at that local repo, fetch from it, and merge upstream/main into your working branch. This may solve the hackiness problem as well, but it does violate your constraint of not depending on another repo. At least that repo is your own in this case.
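
A minimal sketch of that two-repo variant, assuming hypothetical directory names (vendor, project) and a hypothetical file lib.h, could look like this:

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)

# Hypothetical "vendor" mini repo: holds only the pristine upstream files.
git init -q "$work/vendor"
cd "$work/vendor"
git symbolic-ref HEAD refs/heads/main
git config user.name  "Demo"
git config user.email "demo@example.com"
printf '#define LIB_VERSION 1\n#define LIB_A 1\n#define LIB_B 2\n#define LIB_C 3\n' > lib.h
git add lib.h
git commit -q -m "Import upstream v1"

# Hypothetical main project repo with its own unrelated history.
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/main
git config user.name  "Demo"
git config user.email "demo@example.com"
echo "int main(void) { return 0; }" > app.c
git add app.c
git commit -q -m "Project files"

# Point a secondary remote at the mini repo, fetch, and do the first merge.
git remote add upstream "$work/vendor"
git fetch -q upstream
git merge -q --allow-unrelated-histories -m "Merge upstream v1" upstream/main

# Local modification on top of the vendored file.
echo '#define LIB_LOCAL_PATCH 1' >> lib.h
git commit -q -am "Local changes"

# Later: a new release is committed in the mini repo...
cd "$work/vendor"
printf '#define LIB_VERSION 2\n#define LIB_A 1\n#define LIB_B 2\n#define LIB_C 3\n' > lib.h
git commit -q -am "Import upstream v2"

# ...and pulled into the project with a plain fetch and merge.
cd "$work/project"
git fetch -q upstream
git merge -q -m "Merge upstream v2" upstream/main
```

The working tree of the project repo is never emptied here, since the pristine files only ever exist as a checkout in the mini repo.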

CodePudding user response:

The git rebase command is made for exactly this purpose.

Let's say the branch with your local changes is called mybranch, and the remote branch where the new changes land is called master (or main).
To bring all the changes from master into your branch while keeping your own changes on top, first make sure you are on your branch (mybranch), then rebase it like this:
git rebase master
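
To make that concrete, here is a small self-contained sketch in a throwaway repository. It assumes the upstream files are available as commits on a local master branch; the file name lib.h and its contents are hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway demo repository; file and macro names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name  "Demo"
git config user.email "demo@example.com"
printf '#define LIB_VERSION 1\n#define LIB_A 1\n#define LIB_B 2\n#define LIB_C 3\n' > lib.h
git add lib.h
git commit -q -m "Upstream v1"

# Local changes live on mybranch.
git checkout -q -b mybranch
echo '#define LIB_LOCAL_PATCH 1' >> lib.h
git commit -q -am "Local changes"

# Meanwhile, master receives a new upstream version.
git checkout -q master
printf '#define LIB_VERSION 2\n#define LIB_A 1\n#define LIB_B 2\n#define LIB_C 3\n' > lib.h
git commit -q -am "Upstream v2"

# Replay the local commit on top of the updated master.
git checkout -q mybranch
git rebase -q master

# History is now linear, with the local commit last.
git log --format=%s
```

Unlike a merge, a rebase rewrites the local commits, so it is best suited to branches that have not been shared with anyone else.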

For more info on how the rebase works see this article.
