I have only one commit in a GIT branch. How to delete that one commit?

Time:10-22

I want to delete a commit from a branch that contains only one commit.

I have tried to delete that one commit using,

git reset --soft HEAD~1
git push origin  dev --force

I was able to delete all the commits. But unable to delete the last commit.

CodePudding user response:

First, check out your branch: git checkout yourbranch

Second, run git log --oneline, which lists the commits in the format [HASH] commit message. Copy the hash of the commit just before (below, in the list) the one you want to delete.

Third, run git reset --hard [copied hash]; your branch will be reset to that earlier commit.

Now, if you want, you can force-push the branch to the server.

CodePudding user response:

Get the hash of your commit using git log.

Then switch to another branch, most likely the main one: git checkout main

Put your commit there temporarily:

git cherry-pick [commit-hash]

Now you can simply delete your branch:

# delete your branch locally (use -D instead if Git refuses because the branch is not fully merged)
git branch -d [branch-name]

# delete your branch remotely
git push origin --delete [branch-name]

Now you can undo the cherry-picked commit on main, keeping its changes staged:

git reset --soft HEAD~1

and recreate your branch.

CodePudding user response:

[Using git reset,] I was able to delete all the commits. But unable to delete the last commit.

You literally can't delete any commits. You weren't actually deleting them, you were just making them hard to find. The reason you couldn't do that for the last commit is simple: a branch name always selects some commit. There's no such thing as an empty branch, in Git.

The fact is that in Git, branch names don't really matter. They make it possible for us (humans and Git both) to find commits, but it's the commits that matter. To understand this properly, we have to look at how Git "sees" commits.

A Git repository is, at its heart, a collection of databases:

  • There's one database, usually the largest by far, that stores commits and other supporting objects. This is what git clone copies: a clone is a copy of the big database.

    (This database is implemented as a database. It is a simple key-value store in which the keys are hash IDs or object IDs, those big ugly random-looking things that git log prints as commit numbers. There's more than one type of internal object, but commits are the ones you see all the time.)

  • There are some smaller ones that hold names and such. These are currently kind of cheesy, crappy implementations, not proper databases, but they mostly work like simple key-value stores where the name is the key and the value is one of those big ugly hash IDs.

What this means is that each name—in this case, each branch name—just holds one hash ID, for one commit.
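To see the "one name, one hash ID" idea directly, here is a minimal sketch in a throwaway repository. The scratch directory, the identity, and the commit message are placeholders, not anything from the question.

```shell
# Scratch repository; path, identity, and commit message are placeholders.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/dev        # call the initial branch "dev"
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "A"

# The branch name resolves to exactly one commit hash ID.
branch_hash=$(git rev-parse refs/heads/dev)
head_hash=$(git rev-parse HEAD)

# List every branch name together with the single hash ID it holds.
git for-each-ref --format='%(refname) -> %(objectname)' refs/heads
```

Because HEAD is attached to dev here, both lookups give the same hash ID.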

It's the commits that form the actual ... well, the usual word here is branches, but we're trying to define branch, so let's avoid that word and just say "the stuff we care about". We know, based on the above, that each commit is numbered with a hash ID. These hash IDs are unique to the commit: no two commits can have the same hash ID [1]. The rest of what we need to know is this:

  • Each commit stores two things: data—a full snapshot of every file—and metadata.

  • In the metadata, Git stores the hash ID of some previous commit (or commits, plural, for merge commits, or for at least one special case, no previous commit).

This makes the commits form backwards-looking chains. That is, suppose we have a tiny repository with just three commits in it. These three commits have random-looking hash IDs, but to draw them we'll just use single uppercase letters: A for the first commit we made, B for the second, and C for the third.

Commit C, inside this Git repository, stores the actual hash ID (whatever it is) of commit B, which existed at the time we made C. So we say that C points to B:

    B <-C

But commit B stores, inside itself, the actual hash ID of earlier commit A. So B points to A:

A <-B <-C

Commit A, being the very first commit ever, doesn't store any earlier hash ID. (Git calls this a root commit.) It just stands alone.
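The backwards pointers can be inspected with git cat-file. This is a small sketch, assuming a scratch repository with three empty commits whose messages A, B, and C stand in for the letters used above.

```shell
# Scratch repository; path, identity, and commit messages are placeholders.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/dev
git config user.name "Example User"
git config user.email "user@example.com"
for msg in A B C; do
    git commit -q --allow-empty -m "$msg"
done

# Print the raw contents of commit C: tree, parent, author, committer, message.
git cat-file -p HEAD

# The "parent" line of C holds the hash ID of B.
c_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
b_hash=$(git rev-parse HEAD~1)

# The root commit A has no "parent" line at all, so this comes back empty.
a_parent=$(git cat-file -p HEAD~2 | sed -n 's/^parent //p')
echo "C points to: $c_parent"
echo "A points to: ${a_parent:-nothing (root commit)}"
```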

To use these three commits, we need to be able to find their hash IDs. So Git creates, for us, a branch name like main or dev:

A--B--C   <-- dev (HEAD)

Here I've gotten lazy about drawing the internal arrows between commits: that's OK, though, because all parts of every commit are read-only, frozen for all time, including the backwards pointing arrows. Since they can't change, we know they point backwards. The hash ID of some future commit is unknown and unpredictable [2], while that of a past commit is set in stone, so it's OK to carve into stone the hash ID of an old commit.

If we force the name dev backwards one step, here's what happens:

     C
    /
A--B   <-- dev (HEAD)

Note that commit C isn't deleted. It's just that we find commits by having Git turn a name, such as dev, into a hash ID. The hash IDs stored in names can be changed, and now dev finds B, not C. B points backwards to A, so we only see commits A and B.

Doing this one more time produces:

  B--C
 /
A   <-- dev (HEAD)

Again, no commits have gone anywhere, but now we only see commit A.
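A quick way to convince yourself that moving the name deletes nothing is to remember the hash ID of C, move dev back, and then ask Git whether the object still exists. This is a sketch in a scratch repository, and the variable names are made up for illustration.

```shell
# Scratch repository; path, identity, and commit messages are placeholders.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/dev
git config user.name "Example User"
git config user.email "user@example.com"
for msg in A B C; do
    git commit -q --allow-empty -m "$msg"
done

c_hash=$(git rev-parse HEAD)          # remember where C lives

git reset -q --hard HEAD~1            # force dev back one step, to B
visible=$(git rev-list --count dev)   # commits reachable from dev: A and B

# C is no longer reachable from dev, but the object is still in the database.
if git cat-file -e "$c_hash"; then still_there=yes; else still_there=no; fi

git reset -q --hard "$c_hash"         # point dev at C again
restored=$(git rev-list --count dev)  # all three commits are visible again
echo "visible=$visible still_there=$still_there restored=$restored"
```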

We can't make it so that we stop seeing A entirely: if we force dev to point to commit B again, B reappears, but A is still there, and if we force dev to point to commit C again, all three commits are back, and A is still there. Or, we could make some new commit D, using any of the existing three as its parent. Using A as its parent gives us:

  B--C
 /
A
 \
  D   <-- dev (HEAD)

The root commit, in other words, seems to be pretty special. And in fact it is, but it's not so special that we can't do anything about it. There is a flag, --orphan, that we can give to git checkout or git switch that puts us in a special mode.


[1] Git guarantees this uniqueness only stochastically. That's why the hash IDs are as big and ugly as they are: with only a 1-in-2^160 chance of accidental collision between two Git hash IDs, they're "guaranteed" to be unique. The pigeonhole principle tells us that this approach must fail someday, but the size of the hash puts that day far enough into the future to avoid needing to care about it. Or at least, that's the hope here: but that's another topic entirely.

[2] The actual hash ID is the output of a cryptographic hash function run over all the (meta)data in the commit. This includes unpredictable inputs, such as the date-and-time-stamp that will go on the next commit. Since the hash itself is sensitive to every input bit, and we don't know what the input bits will be, we don't know what the future hash ID will be either. (This is also why we can't change any of the data in the commit: doing so would ruin the hash. Git verifies that the hash ID we use to retrieve the data matches the hash ID obtained when hashing the retrieved data, and thus automatically detects any error.)


Creating a root commit

Let's consider for a moment the case of a new, totally-empty repository:

[no commits at all]

With no commits at all, how can our initial branch name—main or master or whatever it might be—point to some existing commit?

The answer is: It can't. And the rule is: a branch name must point to some commit. So the branch name can't meet its rule.

Git's solution to this problem is simple: don't create the branch name. This means that we are on some branch, main or master, that does not exist. Git calls this, variously, an orphan branch or an unborn branch.

When we are in this state, running git commit will—if it succeeds—write out a root commit and create the branch:

A   <-- main (HEAD)

Now we're on branch main, which now exists, and now we have our root commit A.
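The unborn state can be observed in a fresh repository. In this sketch (scratch directory and identity are placeholders), HEAD already names main before the first commit, yet no branch called main exists until git commit creates it.

```shell
# Fresh scratch repository with no commits at all; path and identity are placeholders.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main   # be "on" main, which does not exist yet
git config user.name "Example User"
git config user.email "user@example.com"

current=$(git symbolic-ref --short HEAD)   # HEAD names main already

# Does the branch name itself exist? Before the first commit it does not.
if git show-ref --verify --quiet refs/heads/main; then before=exists; else before=unborn; fi

git commit -q --allow-empty -m "A"          # writes the root commit, creates main

if git show-ref --verify --quiet refs/heads/main; then after=exists; else after=unborn; fi

# A root commit lists no parents: this prints only its own hash ID.
root_line=$(git rev-list --parents -n 1 HEAD)
echo "on $current: before first commit: $before, after: $after"
```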

Suppose we've made three commits, and then made a dev branch (which pointed to C too) and then forced dev all the way back to A:

  B--C   <-- main
 /
A   <-- dev (HEAD)

If we'd now like to create a new root commit, we need to create this same "on a branch that does not exist yet" state. We need an unborn, or orphan, branch:

git checkout --orphan newbranch

Now we can work in the usual way and make a new commit. The new commit will be a new root commit. The existing three commits continue to exist:

  B--C   <-- main
 /
A   <-- dev

but we have another new commit, D, that is on our new branch:

D   <-- newbranch (HEAD)

and newbranch is (still) our current branch.
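Put together, the sequence might look like the sketch below. It assumes a scratch repository with empty commits standing in for A through D, then counts the root commits and checks that the two histories share no ancestor.

```shell
# Scratch repository; path, identity, and commit messages are placeholders.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
for msg in A B C; do
    git commit -q --allow-empty -m "$msg"
done
git branch dev HEAD~2                    # dev names A, main names C

git checkout -q --orphan newbranch       # on newbranch, which does not exist yet
git commit -q --allow-empty -m "D"       # D becomes a second root commit

# Count the commits that have no parent, across all branches: A and D.
roots=$(git rev-list --count --max-parents=0 --all)

# The two histories have no commit in common, so there is no merge base.
if git merge-base main newbranch >/dev/null; then related=yes; else related=no; fi
echo "roots=$roots related=$related"
```

The --allow-empty flag is only there because this sketch has no files; in a real repository the index still holds the files from the commit you were on, so you would commit (or first remove) those instead.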

You can't delete commits, but you can abandon them

Let's take our repository-so-far:

  B--C   <-- main
 /
A   <-- dev

D   <-- newbranch (HEAD)

and force the names dev and main to point to commit D, like this:

git branch -f dev HEAD
git branch -f main HEAD

Now we have:

A--B--C

D   <-- dev, main, newbranch (HEAD)

All the names find commit D. We can now switch to dev or main and delete the name newbranch, if we like: it's not needed for anything any more, as the other two names find commit D.

What about the three A-B-C commits? The answer to that question is: They're still in the repository, but unless you know or can find their hash IDs, you can't even see them. They are abandoned.

Git will—eventually, someday, maybe—garbage collect (git gc) abandoned commits. The details here depend on a lot of factors. Some hosting sites, like GitHub, are very bad at erasing abandoned commits; others may be better at it. On your own laptop, you can force Git to speed up the usual garbage collection, but by default, abandoned commits will stick around for at least 30 days in case you'd like to get them back.

The mechanism that hangs on to "deleted" commits is called the reflog, and git reflog will show you the saved hash IDs. (This is yet another database, or series of databases, in the repository. You shouldn't rely too much on the exact implementation of any of these name-to-ID databases, as the Git core group are working on new ways to handle them now. The old ways worked well enough for a long time—about two decades now—but the strain is showing in places.)
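As a rough illustration of getting abandoned commits back, the sketch below moves dev away from C and then uses the branch's reflog to find C again. The branch name rescued is made up for the example, and the scratch repository uses default settings, under which reflogs are enabled for a normal (non-bare) repository.

```shell
# Scratch repository; path, identity, and commit messages are placeholders.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/dev
git config user.name "Example User"
git config user.email "user@example.com"
for msg in A B C; do
    git commit -q --allow-empty -m "$msg"
done
c_hash=$(git rev-parse HEAD)

git reset -q --hard HEAD~2               # dev now names A; B and C are abandoned

git reflog show dev                      # the saved hash IDs for dev, newest first

# dev@{1} is "where dev pointed one move ago", which is commit C.
abandoned=$(git rev-parse 'dev@{1}')

# Give the abandoned commits a name again (hypothetical branch name "rescued").
git branch rescued "$abandoned"
rescued_count=$(git rev-list --count rescued)
echo "rescued $rescued_count commits"
```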

Conclusion

You can't "eject" the last commit from a branch, because a branch name—which is how we find the commits that form the branch (or "DAGlet", depending on what you mean by the word branch in the first place)—must point to some commit. So no branch is ever truly empty.

Usually when we view a branch, we'd like to view some selected subset of that branch: the commits that we can find, starting from the one found by a branch name, and working backwards until ... some point. By choosing the cutoff point carefully, we can pretend we have an empty branch:

...--G--H   <-- main, dev

If we list the commits that are "on" main or dev, it's the same list. If we ask for, say, commits that are on dev, but not on main—which we get with git log main..dev; note the two dots here—then we'll see an empty list. Once we git checkout dev and add new commits:

...--G--H   <-- main
         \
          I--J   <-- dev (HEAD)

then main..dev will select just those two commits, J and I, in Git's usual backwards order.
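The two-dot selection can be tried out as follows. This is a sketch in a scratch repository where empty commits named G, H, I, and J stand in for the ones drawn above.

```shell
# Scratch repository; path, identity, and commit messages are placeholders.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
for msg in G H; do
    git commit -q --allow-empty -m "$msg"
done

git branch dev                                  # dev and main both name H
before=$(git rev-list --count main..dev)        # on dev but not on main: none

git checkout -q dev
for msg in I J; do
    git commit -q --allow-empty -m "$msg"
done

# Subjects of the commits on dev but not on main, in Git's backwards order.
after=$(git log --format=%s main..dev)
echo "before: $before commits"
echo "after:"
echo "$after"
```

So a branch that has just been created looks "empty" only relative to the branch it started from.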
