I have two files committed in a git repository: an original file and a derived file. derived is based on original but has some modifications (i.e., diff original derived produces some small output).
Whenever I modify the original file, I also want to apply the same changes to derived semi-automatically via a script. The git merge-file command allows me to do exactly that.
In my case I want to apply the changes to original that are in the index (i.e., original is modified but not committed yet) to the derived file, so I do something like this:
git cat-file -p HEAD:original >original.base
git merge-file derived original.base original
rm -f original.base
Now derived should have the same changes that were applied to original (and diff original derived should be more-or-less unchanged).
However, there is a chance that a merge conflict occurs, in which case git merge-file returns a non-zero value and leaves the output file (derived) with merge-conflict markers (<<<<<<<, ======= and >>>>>>>). This is where my question comes in.
In case of a merge conflict, instead of having derived marked as 'modified' in the index, I want it to appear as 'both modified', as it does when a merge conflict occurs (less like with git merge, more like with git stash apply).
I have looked through all the available git commands, and there are some, like git read-tree -m, which do almost what I want, but for trees and not for blobs.
So my idea was to use something like git update-index. Reading a bit on how the git index works, like in this question, files in the index have so-called 'stage entries', which are usually 0; e.g., for the above files we would have
$ git ls-files --stage
100644 cd2732a3aeeb97c20b5dc809cc6350fd7fbfb944 0 derived
100644 4b48deed3a433909bfd6b6ab3d4b91348b6af464 0 original
But after a merge conflict (for example after a failed git stash apply) it looks like this:
$ git ls-files --stage
100644 cd2732a3aeeb97c20b5dc809cc6350fd7fbfb944 1 derived
100644 68bc18a75908806fd6c9c816b7370f4797a6be15 2 derived
100644 d4cda09e1616383549548d7212cdcb86b4dda596 3 derived
100644 4b48deed3a433909bfd6b6ab3d4b91348b6af464 0 original
with 1 being the base file, 2 being the locally modified file and 3 the file from the stash. The file in the working directory contains the merge-conflict markers that are also produced by git merge-file. But git update-index does not support setting these 'stage values'.
So what I want to do is the following, in case git merge-file fails:
- Add the file original.base as 1 derived (base)
- the file original as 2 derived ('our' modification)
- and the original derived file as 3 derived ('their' modification) into the index
- (keep the derived file with the merge-conflict markers in the worktree)
The reasons for wanting to do this are:
- It is obvious to the user that there is a merge conflict that needs to be resolved
- git commit will fail until the conflict is resolved
- The user can use git mergetool to resolve the conflict in a GUI application, which is not possible with the normal result of git merge-file
Is there any way to achieve what I want to do?
- Either by adding the files to the index with a non-zero 'stage entry' as described above
- Or with another simpler/better way I couldn't think of to achieve the same result
Thanks.
Answer:
As bk2204 said in a comment, this is the wrong way to go about things. But let's answer the title question. If you do want to create a conflicted index entry, git update-index is indeed the right (and only) tool for this:
- An index entry is, by definition, conflicted / unmerged if and only if it has a nonzero stage number.
- Only git update-index offers the ability to insert an index entry with a nonzero stage number.
"But git update-index does not support setting these 'stage values' ..."
This claim is wrong. However, setting a nonzero stage number is nontrivial. Reading the git update-index documentation closely, we find just one way to do this:
--index-info
    Read index information from stdin.
USING --INDEX-INFO
--index-info is a more powerful mechanism that lets you feed multiple entry definitions from the standard input, and designed specifically for scripts. It can take inputs of three formats:
1. mode SP type SP sha1 TAB path
   This format is to stuff git ls-tree output into the index.
2. mode SP sha1 SP stage TAB path
   This format is to put higher order stages into the index file and matches git ls-files --stage output.
[format 3 snipped; boldface above is mine]
To place a higher stage entry to the index, the path should first be removed by feeding a mode=0 entry for the path, and then feeding necessary input lines in the third format.
Note further that because you have to place a hash ID into the index, you must first make sure that the data you want exists as a blob object. To get that, use git hash-object -w -t blob (though you can leave out the -t blob since that's the default).
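As a minimal sketch of that step, the following writes a file's contents into the object database and checks that the result is a blob. The throwaway repository and the file name sample.txt are assumptions made for illustration; they are not part of the question.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

printf 'hello\n' > sample.txt    # hypothetical file name

# -w stores the object; without -w the hash ID is only printed.
hash=$(git hash-object -w sample.txt)

# Confirm the object now exists and has type "blob".
type=$(git cat-file -t "$hash")
echo "$hash $type"
```

Because the object is only written, not referenced, nothing in the index or worktree changes at this point.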
"So what I want to do is the following, in case git merge-file fails:
- Add the file original.base as 1 derived (base)"
Since original.base already has a hash ID (e.g., 4b48deed3a433909bfd6b6ab3d4b91348b6af464), you can just use that. Let's say that's in $hash1 at this point.
"- the file original as 2 derived ('our' modification)"
This also has an existing hash ID, let's say $hash2.
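One way to look up these two existing hash IDs, sketched under the assumption (taken from the question) that the base is the committed HEAD:original and that the modified original has been staged with git add. The demo repository and file contents are made up for illustration.

```shell
# Throwaway repository with one committed file named "original".
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
printf 'line 1\nline 2\n' > original
git add original
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'base'

# Modify original and stage it, as described in the question.
printf 'line 1\nline 2 changed\n' > original
git add original

hash1=$(git rev-parse HEAD:original)   # blob committed in HEAD (the base)
hash2=$(git rev-parse :original)       # blob currently staged in the index
echo "base=$hash1 staged=$hash2"
```

If original is modified only in the worktree and not staged, :original would still name the old blob; in that case git hash-object -w original would be needed instead.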
"- and the original derived file as 3 derived ('their' modification) into the index"
For this, you'll have to derive the file again (I think; I may have mis-read something in the question) and run git hash-object:
hash3=$(git hash-object -w --stdin < original-derived-data)
You can now remove the file (here original-derived-data).
"- (keep the derived file with the merge-conflict markers in the worktree)"
For this, nothing is required. Now that you have the three hash IDs:
TAB=$'\t' # make sure your shell supports this syntax
git update-index --index-info <<end
0 blob $hash1 0${TAB}derived
100644 blob $hash1 1${TAB}derived
100644 blob $hash2 2${TAB}derived
100644 blob $hash3 3${TAB}derived
end
The "here-doc" (<<end) will expand $ variables, since we didn't quote the end-word. The hash in the 0 blob $hash1 0${TAB}derived line is irrelevant: the mode 0 is what we really want here, to erase the entry if there is one. The remaining three lines create the higher stage number entries, using the existing hash IDs for the existing data ($hash1 and $hash2) and the new hash ID for the newly computed-and-saved data ($hash3).
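Putting the pieces together, here is an end-to-end sketch in a throwaway repository. It is an illustration under several assumptions, not a definitive script: the file contents are invented to force a conflict, the mode is assumed to be 100644, derived is assumed to have no unstaged changes (so :derived names its pre-merge blob), and the input lines use the documented mode SP sha1 SP stage TAB path layout, generated with printf rather than a here-doc.

```shell
# Throwaway repository so the sketch does not touch real work.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Hypothetical contents: derived differs from original on the last line.
printf 'a\nb\nc\n' > original
printf 'a\nb\nc (derived)\n' > derived
git add original derived
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'base'

# Change the same line in original and stage it, to force a conflict.
printf 'a\nb\nc (changed)\n' > original
git add original

hash1=$(git rev-parse HEAD:original)   # stage 1: base
hash2=$(git rev-parse :original)       # stage 2: staged original
hash3=$(git rev-parse :derived)        # stage 3: derived before merging

git cat-file -p HEAD:original > original.base
status=0
git merge-file derived original.base original || status=$?
rm -f original.base

if [ "$status" -ne 0 ]; then
    # Remove the stage-0 entry (mode 0), then add stages 1 to 3.
    {
        printf '0 %s\t%s\n'        "$hash1" derived
        printf '100644 %s 1\t%s\n' "$hash1" derived
        printf '100644 %s 2\t%s\n' "$hash2" derived
        printf '100644 %s 3\t%s\n' "$hash3" derived
    } | git update-index --index-info
fi

git ls-files --stage derived
git status --short derived
```

The sketch treats any nonzero exit status from git merge-file as a conflict; a real script would probably want to distinguish an error exit from a conflict count before touching the index.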
Note that you have, by default, only 14 days to get $hash3 inserted into the index from the time you compute it with git hash-object -w. If your script takes longer than this to run, a git gc might run and delete the object you wrote. Of course, if your script takes 14 days to run, something else is probably very wrong. :-)