How to get the git refs name directly from gitlab server repository?

Time:08-16

I'm trying to create an update server-side hook script in a local gitlab server.

From my update hook script, I need to get the commit msg as well as the ref name for the new_commit.

From a remote machine, if I run the following command I get the correct value:

john@machine:/tmp/test-hooks$ git log --format=%d -n 1 632f8f615d5904020b7bf1111d1eda3e163d9af3
 (origin/main, origin/HEAD)

But, inside the gitlab server, running the same command in the repository, I'm getting a different value:

root@71c6482bcfa9:/var/opt/gitlab/git-data/repositories/@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35.git/custom_hooks# git log --format=%d -n 1 632f8f615d5904020b7bf1111d1eda3e163d9af3
 (refs/keep-around/632f8f615d5904020b7bf1111d1eda3e163d9af3)

How can I get the same (origin/main, origin/HEAD) instead of the (refs/keep-around/632f8f615d5904020b7bf1111d1eda3e163d9af3) directly from the gitlab server project repository?

CodePudding user response:

How can I get the same (origin/main, origin/HEAD) instead of the (refs/keep-around/632f8f615d5904020b7bf1111d1eda3e163d9af3) directly from the gitlab server project repository?

You can't. The reason is very simple: Git's names aren't "global" (this is not a well-defined Git term, so read on to see what I mean).

You have two different repositories involved here. One is your repository, on your laptop. The other is GitLab's repository, on your GitLab server.

A Git repository is mostly made up of two databases, one usually much bigger and one much smaller:

  • The big database holds commits and other objects. The "names" of these objects are hash IDs, such as 632f8f615d5904020b7bf1111d1eda3e163d9af3. These are how Git finds the objects: you must give Git the hash ID. Those are the keys for the database, which is a simple key-value store. But you don't necessarily have to type in a raw hash ID, because ...

  • The small database holds names, such as origin/main or refs/keep-around/632f8f615d5904020b7bf1111d1eda3e163d9af3. That last name is very peculiar; it is probably a special GitLab trick. Most names are either branch names, which start with refs/heads/, or tag names, which start with refs/tags/, or sometimes remote-tracking names, which start with refs/remotes/. [1]
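To look at the names database directly, git for-each-ref prints each full name next to the hash ID it maps to. The following is a minimal sketch in a throwaway repository; the temporary directory, the demo identity, and the tag name v1 are made up for illustration.

```shell
#!/bin/sh
# Sketch: build a throwaway repository and list its names database.
set -e
repo=$(mktemp -d)
git init -q "$repo"
# Force the initial branch name so the result does not depend on local config.
git -C "$repo" symbolic-ref HEAD refs/heads/main
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"
git -C "$repo" tag v1

# Full names only: one branch name and one tag name.
refs=$(git -C "$repo" for-each-ref --format='%(refname)')
printf '%s\n' "$refs"

# The same names, each with the hash ID (the key into the big database).
git -C "$repo" for-each-ref --format='%(objectname) %(refname)'
```

Running the last command inside the GitLab server's repository shows what that repository really has, including any refs/keep-around/ names.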

Now, cloning a repository involves copying the entire big database (or most of it), usually. That's because we need all the commits and their supporting objects, in order to use the repository. (Sometimes we can get away with a limited, or even very-limited, subset of the objects, but that's an optimization to be done later.) But when we clone the repository, we do not copy the names database at all.

Instead, git clone takes each of their branch names and changes that name into a remote-tracking name. This means that if the original repository has refs/heads/main (a branch name), then our clone ends up with refs/remotes/origin/main, which is a remote-tracking name. The git log output (and other outputs) sometimes summarizes this by dropping the refs/remotes/ part. Some commands, such as git branch -a, drop only the refs/ part. Why? Nobody seems to know. All we know for sure is that the real name starts with refs/remotes/ and goes on to list origin/ and then the branch name as seen in the other repository, over there on origin.
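The renaming is easy to watch with two throwaway repositories on one machine. This is only a sketch: the directory names theirs and ours and the demo identity are invented, and a local path stands in for the GitLab server.

```shell
#!/bin/sh
# Sketch: compare the names in an original repository and in its clone.
set -e
work=$(mktemp -d)

# "Their" repository, with a single branch name: refs/heads/main.
git init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/main
git -C "$work/theirs" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"

# "Our" repository: a clone of theirs.
git clone -q "$work/theirs" "$work/ours"

their_names=$(git -C "$work/theirs" for-each-ref --format='%(refname)')
our_names=$(git -C "$work/ours" for-each-ref --format='%(refname)')

echo "names in theirs:"; printf '%s\n' "$their_names"
echo "names in ours:";   printf '%s\n' "$our_names"
```

Their list has no refs/remotes/ names at all, while ours gains refs/remotes/origin/main, plus a main of our own that git clone created for us.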

But that's the basic problem: they have main; we get origin/main. Later, our own Git may create a new branch name in our (laptop) repository and use the name main, but that's a different branch name than their main. Each branch name is local to the repository that contains it.

So, just because we have an origin/main does not mean that they have an origin/main. Moreover, we might have an origin/main because at some point in the past, they used to have a main, but now they don't any more. Our remote-tracking names, in our laptop repository, get stale and out-of-date.

Meanwhile, they have a special (weird / funny) name-space, refs/keep-around/. Our git clone command may be able to see these names, if they choose to list them for us at git clone and git fetch time. But our Git software doesn't put that name into our repository's smaller names database, because it's an unrecognizable name: it doesn't start with refs/heads/ or refs/tags/ and those are the only names our Git looks at when their Git tells us that they have them. For other names, our Git software just says "yeah okay but fuggedaboutit!"
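Here is a rough imitation of that situation, assuming a plain local repository behaves like the server for this purpose (a real GitLab server may choose to hide these names). The refs/keep-around/ name is created by hand with git update-ref; the directory names are invented.

```shell
#!/bin/sh
# Sketch: a name outside refs/heads/ and refs/tags/ is listed but not copied.
set -e
work=$(mktemp -d)

git init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/main
git -C "$work/theirs" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"
hash=$(git -C "$work/theirs" rev-parse HEAD)

# Imitate the server-side name from the question.
git -C "$work/theirs" update-ref "refs/keep-around/$hash" "$hash"

git clone -q "$work/theirs" "$work/ours"

# What their Git lists for us, versus what our Git actually stored.
advertised=$(git -C "$work/ours" ls-remote origin)
stored=$(git -C "$work/ours" for-each-ref --format='%(refname)')

echo "advertised by theirs:"; printf '%s\n' "$advertised"
echo "stored in ours:";       printf '%s\n' "$stored"

# The decorations differ because the two names databases differ.
git -C "$work/theirs" log --format=%d -n 1 "$hash"
git -C "$work/ours" log --format=%d -n 1 "$hash"
```

The same commit hash gets different decorations in the two repositories, which is exactly the difference seen in the question.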

In short, then, you can't get what you want, because it simply doesn't exist. When you're using Git, you must understand this. If you don't, you will have the kind of time you're having.

From my update hook script, I need to get the commit msg as well as the ref name for the new_commit.

When someone runs git push, they:

  • deliver some objects for your Git to put in your objects database, if necessary (perhaps you already have all the objects); then
  • ask (politely) or command (forcefully) that your Git software create or update some name(s) in your names database.

The update hook, if you have one, is run with an argument specifying the name that your Git was asked or commanded to create or update. It also gets:

  • the current hash ID associated with that name, if any; and
  • the new hash ID the other Git asked or commanded yours to associate with the name.

That's all you get. At most one of the existing and new hash IDs may be all-zeros, indicating a new name being created or an existing name being deleted.

You cannot see the other Git repository's names, if any, for this commit. Those are in their names database, not in yours. You're being asked or commanded to update your names database. You can pass judgement now, and either allow or deny the operation to proceed.
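Putting that together, an update hook can read the name and the two hash IDs from its three arguments, and look up the commit message from the new hash ID. This is a sketch only, not a drop-in GitLab hook: the function name describe_update is made up, it assumes the new hash ID names a commit (or something git log can resolve to one), and it uses %s for the subject line where %B would give the full message.

```shell
#!/bin/sh
# Hypothetical update hook sketch. Git runs it as: update <refname> <old> <new>

describe_update() {
    refname=$1   # the name being created, updated, or deleted
    old=$2       # current hash ID for that name (all zeros if none)
    new=$3       # proposed hash ID (all zeros if the name is being deleted)

    case "$new" in
        *[!0]*) ;;   # contains a non-zero digit: a real hash ID
        *) echo "delete $refname"; return 0 ;;
    esac
    case "$old" in
        *[!0]*) action=update ;;
        *) action=create ;;
    esac

    # Subject line of the proposed commit; use --format=%B for the whole message.
    subject=$(git log -1 --format=%s "$new")
    echo "$action $refname: $subject"
}

# Act only when called the way Git calls the update hook (three arguments).
if [ "$#" -eq 3 ]; then
    describe_update "$1" "$2" "$3"
    exit 0   # exit non-zero here instead to reject the update
fi
```

The name arrives as a full name in this repository, such as refs/heads/main; there is no origin/main to find here, because that name only exists in the pusher's repository.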


[1] Git calls these remote-tracking branch names. I like to drop the word branch from this phrase as it's serving no useful purpose and tends to confuse humans: they think these are branch names, but they're not, they are only remote-tracking (or "remote-tracking-branch") names. That is, they are names, of a "tracking-ish" type, specifically the "remote-tracking" type. The verb track is overused in Git as well, but the things that these names track are some other repository's branch names.
