I have a bundle that contains a Git packfile.
Based on what I have found so far, I initialised an empty Git repo and copied that packfile into the .git/objects folder.
After that, I executed git unpack-objects to extract the content of the packfile.
Now the question is how can I get the total number of commits that are in that packfile?
I tried using a tool, git-sizer, and got the following output:
.git % git-sizer -v
Processing blobs: 8
Processing trees: 3
Processing commits: 1
Matching commits to trees: 1
Processing annotated tags: 0
Processing references: 1
| Name | Value | Level of concern |
| ---------------------------- | --------- | ------------------------------ |
| Overall repository size | | |
| * Commits | | |
| * Count | 1 | |
| * Total size | 177 B | |
| * Trees | | |
| * Count | 3 | |
| * Total size | 567 B | |
| * Total tree entries | 11 | |
| * Blobs | | |
| * Count | 8 | |
| * Total size | 69.3 MiB | |
| * Annotated tags | | |
| * Count | 0 | |
| * References | | |
| * Count | 1 | |
| * Branches | 1 | |
| | | |
| Biggest objects | | |
| * Commits | | |
| * Maximum size [1] | 177 B | |
| * Maximum parents [1] | 0 | |
| * Trees | | |
| * Maximum entries [2] | 5 | |
| * Blobs | | |
| * Maximum size [3] | 52.5 MiB | ***** |
| | | |
| History structure | | |
| * Maximum history depth | 1 | |
| * Maximum tag depth | 0 | |
| | | |
| Biggest checkouts | | |
| * Number of directories [2] | 3 | |
| * Maximum path depth [2] | 3 | |
| * Maximum path length [2] | 65 B | |
| * Number of files [2] | 9 | |
| * Total size of files [2] | 69.3 MiB | |
| * Number of symlinks | 0 | |
| * Number of submodules | 0 | |
[1] 73d9a3662c9e52c39e8efbfa40a48e39f143d72e (refs/heads/master)
[2] 7a244cc36e07929f2714296021c7605daaf28542 (refs/heads/master^{tree})
[3] 6cbe51484efd47cd119ba9d54bc87061cc140b63 (refs/heads/master:objects/pack/pack-c1117410d7ff71062c25e2e4f3dd86ebffca897b.pack)
The one commit it shows is the one that I made myself, but there should be many more. Is there any way to get the total number of commits?
CodePudding user response:
A Git pack file contains objects, but objects by themselves are not useful.
A Git repository consists of two databases (plus ancillary files, plus other options, but the two databases are the key here):
One database holds all the objects. These may be in a single pack file, in multiple pack files, and/or present as "loose" objects. They're not useful without the other database!
The other database contains names. These names reference objects. For some name-types, the referenced objects may be any of the four types, but for branch and remote-tracking names, the referenced objects must be commits.
A reference to a commit makes that commit "reachable". A reachable commit can be found.
A commit itself contains two things:
Directly, the commit contains metadata. This metadata refers to a single tree object, and has a list of parent object references (typically one entry long, but it can be zero entries long, or more than one entry long).
Indirectly, via the tree object, the commit contains files: that's the snapshot that goes with the commit.
Because a commit can and usually does refer to some other previous ("parent") commit, a reachable commit makes the referred-to commit also reachable. That is, if the name master locates commit a123456..., and commit a123456 says that its parent is b789abc..., then at least two commits are reachable.
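To make the reachability idea concrete, here is a minimal sketch that counts the commits reachable from all names (refs) in a repository. The function name count_reachable_commits is made up for illustration; git rev-list --count --all is the real command doing the work. In a repository where only one commit is reachable from a name, this prints 1 no matter how many other commit objects are sitting in the object database.

```shell
# Count the commits reachable from every ref in the repository at $1
# (default: current directory). count_reachable_commits is a hypothetical
# helper name; `git rev-list --count --all` does the actual walk.
count_reachable_commits() {
    git -C "${1:-.}" rev-list --count --all
}
```

Called as count_reachable_commits /path/to/repo, it starts from each name, follows parent links, and counts each commit once.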
Now the question is how can I get the total number of commits that are in that packfile?
The number of commits in the packfile isn't really interesting: only the number of reachable commits is interesting. There is therefore no easy way to count the specific commit objects. However, if you really want to do that, you can unpack the objects with git unpack-objects. There are some constraints here: git unpack-objects skips any object that already exists in the target repository, so nothing is unpacked if the pack file already sits inside that repository. You'd therefore want to do this by unpacking into a new, empty repository, feeding it the pack file from some location outside that repository, to turn every object that's in the pack into a loose object. It's now easy enough to find every object (find .git/objects -type f -print and deal with the funky name organization, where the first two hex digits of the object ID form the directory name and the remaining digits form the file name) and then use git cat-file -t or similar to find the type of each such object (for efficiency, use git cat-file --batch-check). That will allow you to separate the overall count into individual object-type counts.
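The per-type counting can be sketched as follows. Note that this swaps the find step for git cat-file's --batch-all-objects option, which lists every object in the repository, loose or packed, so it is a variation on the steps above rather than a literal transcription. The function names tally_types and count_object_types are made up for illustration, and the sketch assumes the default --batch-check output format of "<object-id> <type> <size>" per line.

```shell
# Tally object types from `git cat-file --batch-check` output on stdin.
# Each input line is assumed to use the default format:
#   <object-id> <type> <size>
tally_types() {
    awk '{ n[$2]++ } END { for (t in n) print t, n[t] }' | sort
}

# Hypothetical wrapper: list every object in the repository at $1
# (default: current directory), loose or packed, and tally by type.
count_object_types() {
    git -C "${1:-.}" cat-file --batch-check --batch-all-objects | tally_types
}
```

Running count_object_types on the scratch repository prints one line per object type with its count; the commit line is the number you are after, reachable or not.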
You could also use git fsck --unreachable here, which won't require unpacking the objects first. All the commits will be unreachable after copying the pack file into this temporary, empty Git repository.
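A rough sketch of the fsck route, under a few assumptions: the function names are made up for illustration, the count relies on git fsck --unreachable printing one "unreachable commit <id>" line per unreachable commit, and the pack file needs its index (.idx) next to it for Git to see the objects. If only the .pack file was copied, git index-pack can generate the index. Any commit that some name does reach is not included in this count.

```shell
# Count "unreachable commit <id>" lines in `git fsck --unreachable`
# output read from stdin. grep -c exits non-zero when the count is 0,
# so `|| true` keeps the function's exit status clean.
count_unreachable_commit_lines() {
    grep -c '^unreachable commit ' || true
}

# Hypothetical wrapper: run fsck on the repository at $1
# (default: current directory) and count its unreachable commits.
unreachable_commits() {
    git -C "${1:-.}" fsck --unreachable --no-progress |
        count_unreachable_commit_lines
}
```

Called as unreachable_commits /path/to/scratch-repo, it prints a single number.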