Manually adding a file with a merge conflict to the git index by setting non-zero stage entry values

I have two files committed in a git repository, an original file and a derived file. derived is based on original but has some modifications (i.e., diff original derived produces some small output).

Whenever I modify the original file, I also want to apply the same changes to derived semi-automatically via a script. There is a useful git command for that called git merge-file that allows me to do exactly that.

In my case, I want to apply the changes of original in the index (i.e., original is modified but not committed yet) to the derived file, so I do something like this:

git cat-file -p HEAD:original >original.base
git merge-file derived original.base original
rm -f original.base

Now derived should have the same changes that were applied to original (and diff original derived should be more-or-less unchanged).

However, there is a chance that a merge conflict occurs, in which case git-merge-file returns a non-zero value, and leaves the output file (derived) with merge-conflict markers (<<<<<<<, ======= and >>>>>>>). This is where my question comes in.
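
For illustration, here is a minimal sketch of how a script could detect that case from the exit status. The file names (base, ours, theirs, result) and their contents are made up for the example and are not the files from the setup above. It relies on git merge-file exiting with the number of conflicts, so any positive value means conflict markers were left in the output file.

```shell
# Throwaway directory; every file name here is illustrative only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'one\ntwo\nthree\n' > base      # common ancestor
printf 'one\nTWO\nthree\n' > ours      # one side changes line 2
printf 'one\n2\nthree\n'   > theirs    # the other side changes it differently

# git merge-file rewrites its first argument in place, so work on a copy.
cp ours result

# Capture the exit status without tripping "set -e" style error handling.
merge_status=0
git merge-file result base theirs || merge_status=$?

if [ "$merge_status" -eq 0 ]; then
    echo "clean merge"
else
    echo "merge-file exited with status $merge_status; check result for conflict markers"
fi
```

A script would branch on merge_status at this point, for example to decide whether the index needs special handling.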

What I want is this: in case of a merge conflict, instead of derived being marked as 'modified' in the index, it should appear as 'both modified', as it does when a real merge conflict occurs (less like with git merge, more like with git stash apply).

I have looked through all the available git commands, and there are some like git read-tree -m which does almost what I want, but for trees and not for blobs.

So my idea was to use something like git update-index. Reading a bit on how the git index works, like in this question, I learned that each file in the index has a so-called 'stage entry' value, which is usually 0; e.g., for the above files we would have

$ git ls-files --stage
100644 cd2732a3aeeb97c20b5dc809cc6350fd7fbfb944 0       derived
100644 4b48deed3a433909bfd6b6ab3d4b91348b6af464 0       original

But after a merge conflict (for example, after a failed git stash apply) it looks like this:

$ git ls-files --stage
100644 cd2732a3aeeb97c20b5dc809cc6350fd7fbfb944 1       derived
100644 68bc18a75908806fd6c9c816b7370f4797a6be15 2       derived
100644 d4cda09e1616383549548d7212cdcb86b4dda596 3       derived
100644 4b48deed3a433909bfd6b6ab3d4b91348b6af464 0       original

with 1 being the base file, 2 being the locally modified file and 3 the file from the stash. The file in the working directory contains the merge-conflict markers that are also produced by git merge-file. But git update-index does not support setting these 'stage values'.

So what I want to do is the following, in case git merge-file fails:

Add the file original.base as 1 derived (base)
the file original as 2 derived ('our' modification)
and the original derived file as 3 derived ('their' modification) into the index
(keep the derived file with the merge-conflict markers in the worktree)

The reasons for wanting to do this are:

It is obvious to the user that there is a merge conflict that needs to be resolved
git commit will fail until the conflict is resolved
The user can use git mergetool to resolve the conflict in a GUI application, which is not possible with the normal result of git merge-file

Is there any way to achieve what I want to do?

Either by adding the files to the index with a non-zero 'stage entry' as described above
Or with another simpler/better way I couldn't think of to achieve the same result

Thanks.

Answer:

As bk2204 said in a comment, this is the wrong way to go about things. But let's answer the title question. If you do want to create a conflicted index entry, git update-index is indeed the right (and only) tool for this:

An index entry is, by definition, conflicted / unmerged if and only if it has a nonzero stage number.
Only git update-index offers the ability to insert an index entry with a nonzero stage number.

But git update-index does not support setting these 'stage values' ...

This claim is wrong. However, setting a nonzero stage number is nontrivial. Reading the git update-index documentation closely, we find just one way to do this:

--index-info
Read index information from stdin.

USING --INDEX-INFO

--index-info is a more powerful mechanism that lets you feed multiple entry definitions from the standard input, and designed specifically for scripts. It can take inputs of three formats:

mode SP type SP sha1 TAB path

This format is to stuff git ls-tree output into the index.

mode SP sha1 SP stage TAB path

This format is to put higher order stages into the index file and matches git ls-files --stage output.

[format 3 snipped; boldface above is mine]

To place a higher stage entry to the index, the path should first be removed by feeding a mode=0 entry for the path, and then feeding necessary input lines in the third format.

Note further that because you have to place a hash ID into the index, you must first make sure that the data you want exist as a blob object. To get that, use git hash-object -w -t blob (though you can leave out the -t blob since that's the default).
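
As a small illustration of that step, the sketch below writes a blob and then reads it back by ID after the working-tree file is gone. The repository is a throwaway one created with mktemp, and sample.txt is a hypothetical file name chosen for the example.

```shell
# Throwaway repository so the example does not touch a real one.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

printf 'hello\n' > sample.txt          # hypothetical file name

# -w stores the content as a blob in the object database and prints its ID.
blob_id=$(git hash-object -w sample.txt)

# The object now exists independently of the working-tree file.
rm sample.txt
git cat-file -t "$blob_id"             # prints: blob
git cat-file -p "$blob_id"             # prints: hello
```

The ID in blob_id is what goes into the hash column of an --index-info line.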

So what I want to do is the following, in case git merge-file fails:

Add the file original.base as 1 derived (base)

Since original.base already has a hash ID (e.g., 4b48deed3a433909bfd6b6ab3d4b91348b6af464), you can just use that. Let's say that's in $hash1 at this point.

the file original as 2 derived ('our' modification)

This also has an existing hash ID, let's say $hash2.

and the original derived file as 3 derived ('their' modification) into the index

For this, you'll have to derive the file again (I think; I may have misread something in the question) and run git hash-object:

hash3=$(git hash-object -w < original-derived-data)

You can now remove the file (here original-derived-data).

(keep the derived file with the merge-conflict markers in the worktree)

For this, nothing is required. Now that you have the three hash IDs:

TAB=$'\t'  # make sure your shell supports this syntax

git update-index --index-info <<end
0 blob $hash1 0${TAB}derived
100644 blob $hash1 1${TAB}derived
100644 blob $hash2 2${TAB}derived
100644 blob $hash3 3${TAB}derived
end

The "here-doc" (<<end) will expand $ variables, since we didn't quote the end-word. The hash in the 0 blob $hash1 0${TAB}derived line is irrelevant: the mode 0 is what we really want here, to erase the entry if there is one. The remaining three lines create the higher stage number entries, using the existing hash IDs for the existing data ($hash1 and $hash2) and the new hash ID for the newly computed-and-saved data ($hash3).
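
To try the whole sequence safely, here is a sketch that runs it in a throwaway repository. It writes the stage lines exactly in the "mode SP sha1 SP stage TAB path" form quoted from the documentation (no type word), and uses an all-zero hash on the mode-0 line. The variable names (base_id, ours_id, theirs_id, null_id) and the file contents are made up for the example; only the path name derived is taken from the question.

```shell
# Throwaway repository; nothing here touches an existing one.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

# Three illustrative versions of one file, stored as blobs.
base_id=$(printf 'one\ntwo\n' | git hash-object -w --stdin)
ours_id=$(printf 'one\nTWO\n' | git hash-object -w --stdin)
theirs_id=$(printf 'one\n2\n' | git hash-object -w --stdin)

# Begin with an ordinary stage-0 entry, as a normally tracked file would have.
git update-index --add --cacheinfo "100644,$theirs_id,derived"

# An all-zero ID of the right length for the mode-0 "remove" line.
null_id=$(printf '%s\n' "$base_id" | sed 's/./0/g')
tab=$(printf '\t')

# Drop the stage-0 entry, then add stages 1 (base), 2 (ours), 3 (theirs).
git update-index --index-info <<EOF
0 $null_id${tab}derived
100644 $base_id 1${tab}derived
100644 $ours_id 2${tab}derived
100644 $theirs_id 3${tab}derived
EOF

# Inspect the result; only unmerged (nonzero-stage) entries are listed.
git ls-files --unmerged -- derived
```

If the last command lists stages 1, 2 and 3 for derived, git status should report the path as "both modified". In a real script the conflict-marked file would additionally be left in the working tree, which this sketch skips.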

Note that you have, by default, only 14 days to get $hash3 inserted into the index from the time you compute it with git hash-object -w. If your script takes longer than this to run, a git gc might run and delete the object you wrote. Of course, if your script takes 14 days to run, something else is probably very wrong. :-)