How does git "unhash" its internal objects?

Time:08-18

Recently I read about git internals and found that under the hood git hashes its objects:

$ echo 'test content' | git hash-object -w --stdin

d670460b4b4aece5915caf5c68d12f560a9fe3e4

How does it "unhash" these objects to get their content back?

$ git cat-file -p d670460b4b4aece5915caf5c68d12f560a9fe3e4

test content

CodePudding user response:

How does it "unhash" these objects to get their content back?

It finds the object in Git's object database; it does not reverse the hash. If you look in the .git/objects directory, you should find a directory called "d6" containing a file called 70460b4b4aece5915caf5c68d12f560a9fe3e4. That file holds the content of the object, compressed with zlib and prefixed with a short header giving its type and size.

There's no magic going on, no conjuring information out of nothing. In particular, if you run that same git cat-file command in a repo which doesn't have that object, it will fail with something like:

fatal: Not a valid object name d670460b4b4aece5915caf5c68d12f560a9fe3e4

CodePudding user response:

Git does not unhash its objects. It uses the hash as a lookup key, just like a hash table.
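To make the one-way direction concrete, here is a minimal Python sketch of how the ID in the question can be reproduced. It assumes a repository using the default SHA-1 object format; git_blob_id is a name made up for this illustration, not a Git API.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    # Git hashes a small header ("blob <size>\0") followed by the raw content.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# echo 'test content' appends a newline, so the blob is 13 bytes long.
print(git_blob_id(b"test content\n"))
# prints d670460b4b4aece5915caf5c68d12f560a9fe3e4
```

The function only goes from content to ID. Nothing in it can be run backwards, which is why Git has to store the content somewhere and look it up by that ID.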

git cat-file -p d670460b4b4aece5915caf5c68d12f560a9fe3e4

Git uses d670460b4b4aece5915caf5c68d12f560a9fe3e4 to look up the content. It can be in one of two places: a file under .git/objects/ (a "loose object") or a packfile.

In the case above, Git would look for .git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4. If that file exists, Git decompresses it and voilà, there's your content.

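You can see this yourself by decompressing the file with openssl zlib -d < .git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4 (this needs an OpenSSL build with zlib support). The output starts with the header "blob 13" and a NUL byte, followed by the content.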
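If openssl is not at hand, the same lookup can be sketched in a few lines of Python. This is only an illustration of the loose-object case, assuming a SHA-1 repository; read_loose_object is a hypothetical helper name, and packed objects are not handled.

```python
import os
import zlib

def read_loose_object(git_dir: str, sha_hex: str):
    # The first two hex digits name the directory, the remaining 38 the file.
    path = os.path.join(git_dir, "objects", sha_hex[:2], sha_hex[2:])
    with open(path, "rb") as f:
        raw = zlib.decompress(f.read())
    # The decompressed data is "<type> <size>" + NUL byte + content.
    header, _, content = raw.partition(b"\0")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content
```

In a repo where the object from the question is still loose, read_loose_object(".git", "d670460b4b4aece5915caf5c68d12f560a9fe3e4") would return the type "blob" and the 13 bytes of "test content" plus its trailing newline. If the file is missing, open() raises FileNotFoundError, which is the loose-object equivalent of Git not finding the name.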


Periodically, Git will pack the loose objects into "packfiles". These are binary files that each hold many objects, which is more efficient than individual files. The details get a bit complicated, but again the SHA-1 hash is used to look up the content, this time through the pack's index. In the output of git verify-pack -v, your example might look something like this (object ID, type, size, size in the packfile, offset in the packfile).

d670460b4b4aece5915caf5c68d12f560a9fe3e4 blob   61 60 285843732

Git uses this information to find the content in the packfile; exactly how that works is complicated. You can read about the gory details if you like.
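For a taste of those details, here is a rough Python sketch of just the lookup step: finding an object's offset in a pack index (.idx) file. It assumes a version 2 index and 20-byte SHA-1 names, and find_pack_offset is a hypothetical name. It does not read the object itself, which may be stored as a delta against another object.

```python
import struct

def find_pack_offset(idx_bytes: bytes, sha_hex: str):
    """Return the object's byte offset in the packfile, or None if absent."""
    magic, version = idx_bytes[:4], struct.unpack(">I", idx_bytes[4:8])[0]
    if magic != b"\xfftOc" or version != 2:
        raise ValueError("this sketch only handles version 2 .idx files")
    sha = bytes.fromhex(sha_hex)
    # Fanout table: entry b counts the objects whose first byte is <= b.
    fanout = struct.unpack(">256I", idx_bytes[8:8 + 1024])
    total = fanout[255]
    lo = fanout[sha[0] - 1] if sha[0] > 0 else 0
    hi = fanout[sha[0]]
    names = 8 + 1024                # sorted 20-byte object names
    offsets = names + 24 * total    # names (20 bytes each) + CRC32s (4 each)
    # Binary search among the names that share the first byte.
    while lo < hi:
        mid = (lo + hi) // 2
        name = idx_bytes[names + 20 * mid:names + 20 * mid + 20]
        if name < sha:
            lo = mid + 1
        elif name > sha:
            hi = mid
        else:
            (off,) = struct.unpack_from(">I", idx_bytes, offsets + 4 * mid)
            if off & 0x80000000:
                # High bit set: the value indexes a table of 8-byte offsets.
                big = offsets + 4 * total + 8 * (off & 0x7FFFFFFF)
                (off,) = struct.unpack_from(">Q", idx_bytes, big)
            return off
    return None
```

So even in the packed case the hash is never reversed. It is a sorted key that is binary-searched to find where the bytes live.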
