Correctly track and merge in changes from multiple independent upstream git repositories over time (

Time:08-15

I am working on an open source android application that is derived from, and inherits from, two independent upstream open source android apps over which I have no control. All projects use git for version control.

When either or both upstream apps release new versions, I would like to pull in changes from these apps, resolve conflicts, and release a new version of my downstream app.

How can I do this in a way in which changes will be correctly tracked and there will not be a need to resolve the same conflicts more than once?

To give a concrete example, consider the strings.xml file, which stores the strings used in the app. Clearly, our downstream app will need strings from both upstream projects, plus new strings we add ourselves.

How can I connect these projects through git so that over time the only changes/conflicts I will need to manually resolve are where a given line has been newly changed in both our downstream project and in one of the upstream projects?

Everything I have tried so far has failed to "remember" conflicts I have already resolved, in the convenient way I have come to expect from a single upstream origin/repo.

My understanding is that these problems arise when git merges in "unrelated" histories (e.g. using the --allow-unrelated-histories option, which has been required in all my attempts so far).

So another way to state the problem: is there a way my downstream app can have related histories with these two upstream apps, so that previously resolved conflicts will be remembered and preserved?

Hopefully without having to keep the entire previous history of both projects in my own project repo?

CodePudding user response:

Let's start with the concept of history, as defined by Git. History in Git is nothing more or less than the set of commits in the repository, as found by starting from branch and tag and other such names.

Commits may be directly related with parent/child relationships, indirectly related via grandparent / grandchild / cousin relationships, or unrelated. This is all determined by the commit graph, which is just the transitive closure of the parent/child relationships. (These are stored in the children, in that the children list their parents.)

Add in the fact that all commits have a unique number (their hash IDs), and that no commit can ever be changed: the unique number determines the commit and vice versa, so that if we take a commit out and change anything about it, including stored parent information, what we get is a new and different commit: the original commit remains in the graph undisturbed.
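As a small illustration of the two points above (children record their parents, and "changing" a commit really creates a new one), here is a minimal sketch in a throwaway repository. The file name, commit messages, and the demo identity are all made up for the example.

```shell
set -eu
# Hypothetical identity so the demo does not depend on your git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q .
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -am "second"

old=$(git rev-parse HEAD)
old_parent=$(git rev-parse HEAD^)

# The child commit stores the hash ID of its parent.
git cat-file -p "$old" | grep '^parent'

# "Changing" the commit (here only its message) produces a different commit.
git commit -q --amend -m "second, reworded"
new=$(git rev-parse HEAD)

echo "old: $old"
echo "new: $new"

# The original commit object is still in the repository, undisturbed.
git cat-file -t "$old"
```

The two hash IDs printed differ, and the old one still names a commit object; the branch name has simply moved to the new commit.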

"So another way to state the problem: is there a way my downstream app can have related histories with these two upstream apps, so that previously resolved conflicts will be remembered and preserved?"

Not exactly, but mostly.¹

A repository is one of these collections of commits (plus the names that get you into the graph). So if there are two existing "upstream" repositories U1 and U2 that are not themselves related, they have two independent (disjoint) graphs. (These disjoint graphs may have their own disjoint subgraphs, but we know there are at least two such graphs.) If you fetch both into a single repository, you get a new graph G = U1 ∪ U2 where, for any commits X and Y selected from the U1 and U2 subsets respectively, X ⊀ Y and Y ⊀ X (and of course X ≠ Y). Since the graphs are disjoint, X and Y also share no common ancestor that could serve as a merge base.
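To make the disjoint-graph situation concrete, here is a rough sketch that builds two tiny stand-in upstreams and fetches both into one repository. The names u1, u2, and downstream, the branch name main, and the file contents are assumptions for the demo, not anything from the real projects.

```shell
set -eu
# Hypothetical identity so the demo does not depend on your git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Two independent stand-ins for U1 and U2, one commit each.
for u in u1 u2; do
  git init -q "$u"
  git -C "$u" symbolic-ref HEAD refs/heads/main
  echo "from $u" > "$u/$u.txt"
  git -C "$u" add .
  git -C "$u" commit -q -m "initial commit of $u"
done

# One repository holding the union of both graphs.
git init -q downstream
cd downstream
git remote add u1 "$work/u1"
git remote add u2 "$work/u2"
git fetch -q u1
git fetch -q u2

X=$(git rev-parse u1/main)
Y=$(git rev-parse u2/main)

# git merge-base exits non-zero when the commits share no ancestor.
if git merge-base "$X" "$Y" >/dev/null; then
  related=yes
else
  related=no
fi
echo "X and Y related: $related"
```

With no common ancestor, a plain git merge of Y into X is refused, which is why --allow-unrelated-histories comes up at all.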

You can, however, run one git merge --allow-unrelated-histories on your chosen (X, Y) pair. When this merge is committed, you get a new commit Z ∉ G whose two parents are X and Y. So now commits X and Y are potential merge bases for commits that follow Z.

If you get an update to either U1 or U2, and a new commit in one of those graphs is itself a descendant of X or Y (as appropriate), you can merge that descendant into Z or any of its descendants, and X or Y will be the merge base here.

Repeat for updates to the other repository, and you're as good as you can be.
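The whole sequence can be sketched end to end as follows. This is only an illustration under assumed names (u1, u2, a downstream branch, branch main in each upstream, made-up files), and it needs Git 2.9 or later for --allow-unrelated-histories.

```shell
set -eu
# Hypothetical identity so the demo does not depend on your git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Two independent stand-ins for U1 and U2.
for u in u1 u2; do
  git init -q "$u"
  git -C "$u" symbolic-ref HEAD refs/heads/main
  echo "from $u" > "$u/$u.txt"
  git -C "$u" add .
  git -C "$u" commit -q -m "initial commit of $u"
done

git init -q downstream
cd downstream
git remote add u1 "$work/u1"
git remote add u2 "$work/u2"
git fetch -q u1
git fetch -q u2

X=$(git rev-parse u1/main)
Y=$(git rev-parse u2/main)

# Start the downstream branch at X, then join Y with the one-time merge.
git checkout -q -b downstream "$X"
git merge -q --allow-unrelated-histories -m "Join U1 and U2" "$Y"
Z=$(git rev-parse HEAD)

# Later, U1 publishes a new commit that descends from X.
echo "more from u1" >> "$work/u1/u1.txt"
git -C "$work/u1" commit -q -am "u1 update"
git fetch -q u1
base1=$(git merge-base HEAD u1/main)      # expected to be X
git merge -q -m "Merge U1 update" u1/main # ordinary merge, no special flag

# Repeat for U2: its new commit descends from Y.
echo "more from u2" >> "$work/u2/u2.txt"
git -C "$work/u2" commit -q -am "u2 update"
git fetch -q u2
base2=$(git merge-base HEAD u2/main)      # expected to be Y
git merge -q -m "Merge U2 update" u2/main

git log --graph --oneline
```

Only the first merge needs --allow-unrelated-histories. Every later merge finds a real merge base, so Git compares only what changed since that base on each side.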

"Hopefully without having to keep the entire previous history of both projects in my own project repo?"

While you don't necessarily need to store the entire U1 and U2 graphs, it makes sense to do so, and relatively rarely increases the amount of storage all that much (for some definitions of "relatively rarely" and "all that much"). You will definitely have to store commits X and Y, however you go about choosing them, plus all the commits that connect any new commits backwards to X or Y.


¹ Go not to the Elves for counsel, for they will say both no and yes.
