I am currently learning GitHub Actions. My goal is to build a deployment pipeline for a small private web project (a website). Independently of that, I noticed something I don't understand.
It's best explained with an example. But first, my setup:
- remote repository "Project_X" is on GitHub
- a GH Action opens an SSH connection to the remote server on every push and runs git pull there. Done!
This works fine so far.
Now I tested what would happen if my last push contained an error and I wanted to undo it, so that the page keeps running while I work on the bug fix.
So I ran locally: git reset --hard hash_from_prev_commit
Locally, the commit was reset. With git push -f the remote repository was also updated. But on the remote server it was not reset. GitHub Action output:
out: /www/htdocs/w019db06
out: On branch master
out: Your branch is up-to-date with 'origin/master'.
out: nothing to commit, working directory clean
err: From github.com:Project-X/git-workflow-test
err: ffd263c...e5a9ec8 master -> origin/master (forced update)
out: Already up-to-date.
==============================================
✅ Successfully executed commands to all host.
Why is this and what do I have to do so that the change takes effect on the remote server?
CodePudding user response:
TL;DR
You need to get the third repository to run git fetch followed by git reset --hard, as in:
git fetch origin
git reset --hard origin/master
if you're willing to hard-code the name master here.
Long
Let me summarize here by saying that there are three repositories. Let's give them names: Alfred, Barbara, and Catwoman. Repository A (Alfred) is on your laptop. Repository B (Barbara) is on GitHub. Repository C (Catwoman) is on your server, whatever that may be.
For simplicity, actions taken on A, by some Git process run because you ran some git whatever command on your laptop, are "Git A"; those on B (on GitHub) are "Git B", and those that occur on C are "Git C". This should help keep straight what happens on each machine, in each repository.
This line, plus the next one that I won't quote yet:
ffd263c...e5a9ec8 master -> origin/master (forced update)
is printed by Git C, because Git C is running git pull. But git pull means: run git fetch, then run a second Git command.
Background (a bit long, skim if you like)
Now, before we proceed, we need to make a few more notes:
1. Git is all about commits. It's not about branch names or files: the branch names help some Git repository find commits, and the commits contain files, but we're really concerned with the commits.
2. Commits are shared: if repositories A, B, and C are cross-connected to each other, they eventually all wind up with the same commits.
3. Commits are numbered, with big ugly hash IDs. Normally I like to use single uppercase letters to stand in for these, and I can do that here as long as I start deeper in the alphabet (A, B, and C are the repositories on the various machines, so I want to avoid those letters).
4. Each commit contains, as part of its metadata, the raw hash ID of some previous commit or commits. Most commits have just one previous hash ID, which forms the commits into backwards-looking chains:
... <-F <-G <-H
where H stands in for the hash ID of the latest commit in the chain. Commit H, being the latest, is the end of the chain: there are no more commits after this point.
5. As noted in item 1, a Git repository will use branch names to help it find commits. But the branch names are not shared: each Git repository has its own branch names.
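To make item 4 concrete, here is a small sketch that builds a two-commit chain in a throwaway directory and prints the raw commit object. The directory name chain and the demo identity are made up for the illustration; the interesting part is the parent line in the output.

```shell
set -e
# Throwaway identity so commits work on a machine with no Git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q chain
cd chain
git commit -q --allow-empty -m "G"
git commit -q --allow-empty -m "H"

# Dump the raw commit object for H; its "parent" line holds the hash ID of G.
git cat-file -p HEAD

parent_in_metadata=$(git cat-file -p HEAD | sed -n 's/^parent //p')
previous_commit=$(git rev-parse HEAD~1)
echo "parent recorded in H: $parent_in_metadata"
echo "hash ID of G:         $previous_commit"
```

The two hash IDs printed at the end should be identical: that stored parent hash is the backwards-pointing arrow in the drawing.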
Now, being a bit lazy, I tend to draw the commits like this:
...--F--G--H <-- master (HEAD)
Here we see how the current branch name, in this case master, points to the last commit in the chain. The process of making a new commit is done, in Git, by checking out some commit (H), using some branch name (master), which makes that branch the current branch and that particular commit the current commit. Then we act on the files in our working tree, use git add to "stage" them, and run git commit, and Git builds a new commit I that points back to the current commit:
...--F--G--H   <-- master (HEAD)
            \
             I
As the last step of git commit, Git writes I's actual hash ID (some big ugly hexadecimal number) into the name master:
...--F--G--H
            \
             I   <-- master (HEAD)
and now the chain of commits ends not at H but at I.
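You can watch the name master move, as a rough sketch in a scratch repository. The file name page.html and the demo identity are invented for the example.

```shell
set -e
# Throwaway identity so commits work on a machine with no Git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master

echo "first version" > page.html
git add page.html
git commit -q -m "H"
master_before=$(git rev-parse master)     # hash ID of H

echo "second version" > page.html
git add page.html
git commit -q -m "I"
master_after=$(git rev-parse master)      # hash ID of I

echo "master before: $master_before"
echo "master after:  $master_after"
echo "parent of new: $(git rev-parse master~1)"
```

The last line should print the same hash ID as "master before": the new commit I points back to H, and the name master now holds I's hash ID.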
When we use git push or git fetch, our Git calls up some other Git and either sends them our new commits (git push) or gets any new commits from them (git fetch). This is how the commits get shared. But after that, things get a little weirder: if we're using git fetch, our Git updates, not a branch name, but rather a remote-tracking name. For instance, if we got a new commit J from their Git, we would now have:
...--F--G--H--I   <-- master (HEAD)
               \
                J   <-- origin/master
We can now add commit J to our branch (our master, which is independent of their master) using various Git operations, to get:
...--F--G--H--I--J   <-- master (HEAD), origin/master
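The fetch-then-catch-up sequence above can be sketched with three local repositories. The names hub.git, ours and theirs are hypothetical stand-ins, and git merge --ff-only is just one of the "various Git operations" that would do here.

```shell
set -e
# Throwaway identity so commits work on a machine with no Git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# Hypothetical layout: hub.git is the shared repository, ours and theirs are two clones.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "I" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "I"
git clone -q --bare seed hub.git
git clone -q hub.git ours
git clone -q hub.git theirs

# "They" create commit J and push it to the hub.
echo "J" >> theirs/file.txt
git -C theirs commit -q -am "J"
git -C theirs push -q origin master

cd ours
master_before=$(git rev-parse master)
git fetch -q origin
master_after_fetch=$(git rev-parse master)            # still I: fetch did not touch our branch
tracking_after_fetch=$(git rev-parse origin/master)   # now J

git merge -q --ff-only origin/master                  # add J to our master
master_after_merge=$(git rev-parse master)
echo "before: $master_before, after fetch: $master_after_fetch, after merge: $master_after_merge"
```

After the fetch only origin/master has moved; our own master moves only when we run the second command.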
For various reasons, git push is different: after we send our new commit to some other repository, we ask them (politely) if they will please, if it's OK, set their branch name to some particular commit. For instance, if they have only commits up through H, we can send them commit I, or commits I-J, that add on to their H, and then ask them to set their master to point to the last of the commits we sent.
The problem
Let's review what kicks all this off. You, on your laptop (on Git A), are going to run:
git reset --hard HEAD~
or equivalent. This moves your branch name master back one commit. But we don't get there all at once.
Let's say you started with:
...--G--H <-- master (HEAD), origin/master
You then added commit I in your repository A:
...--G--H   <-- origin/master
         \
          I   <-- master (HEAD)
You then ran git push origin master. That sent new commit I to repo B, and then asked them to make their master point to I, which they did:
...--G--H--I <-- master (HEAD), origin/master
You then had repo B tickle repo C, which ran git pull origin master or just git pull (both do the same thing). Repo C called up repo B and got what, to repo C, was new: commit I:
...--G--H   <-- master (HEAD)
         \
          I   <-- origin/master
The second command that your git pull ran on C was git merge. This discovered that there was no need for a true merge, so it just added commit I to its master:
...--G--H--I <-- master (HEAD), origin/master
So now, repos A and C match (down to both having an origin/master name pointing to commit I, as copied from repo B and renamed). Repo B has its master pointing to commit I.
And now we reset commit I away on repo A. (Whew, finally.) Let's draw that:
...--G--H   <-- master (HEAD)
         \
          I   <-- origin/master
Repo A still remembers that repo B is using commit I as repo B's master. That's true!
You might now run git push origin master from A and get an error about it not being a fast-forward:
! rejected ... (non-fast-forward)
That's because your Git talked with their Git, found that you had no commits to send, and ended with a polite request that they set their master to point to H. That's a polite request that they forget commit I, and they say no.
So you resort to git push -f or equivalent. This changes the final step of git push from a polite request, please set your master to point to H, to a forceful command: Set your master to point to H! If they obey, this happens in repo B:
...--G--H   <-- master (HEAD)
         \
          I   ???
Note that repo B has no name for commit I any more. You can still access it by raw hash ID (GitHub allows you to do this with their APIs and web interface), but you have to know the hash ID.
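Here is a sketch of the polite request versus the forceful command, using a local bare repository in place of GitHub. The names hub.git and laptop are made up, and a default bare repository accepts forced pushes, which a hosted repository may be configured not to do.

```shell
set -e
# Throwaway identity so commits work on a machine with no Git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# Hypothetical names: hub.git stands in for repo B (GitHub), laptop for repo A.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "H" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "H"
git clone -q --bare seed hub.git
git clone -q hub.git laptop

cd laptop
echo "I" >> file.txt
git commit -q -am "I"
git push -q origin master
commit_h=$(git rev-parse HEAD~1)
commit_i=$(git rev-parse HEAD)

git reset -q --hard HEAD~     # take commit I back on the laptop

# Plain push: a polite request to move the hub's master back to H. Expected to be refused.
if git push origin master; then plain_push=accepted; else plain_push=rejected; fi

# Forced push: a command, which this bare repository obeys.
git push -q -f origin master
hub_master=$(git --git-dir=../hub.git rev-parse master)

# Commit I has no name on the hub any more, but the object is still reachable by hash ID.
orphan_type=$(git --git-dir=../hub.git cat-file -t "$commit_i")
echo "plain push: $plain_push, hub master: $hub_master, I is still a: $orphan_type"
```

The plain push prints the non-fast-forward rejection on stderr; after the forced push the hub's master is H again, while I lingers as an unnamed commit.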
Now, this git push onto GitHub triggers repo C to do a git pull, which, again, is just git fetch followed by a second Git command, in this case git merge. Repo C currently has this:
...--G--H--I <-- master (HEAD), origin/master
They run git fetch origin and see that, in repo B (on GitHub), master selects commit H. This means that they can make their origin/master point to commit H, but only by doing a forced update: it's a "non-fast-forward", in Git jargon.
So, they do that. Now they have this:
...--G--H   <-- origin/master
         \
          I   <-- master (HEAD)
That is, repo C still has commit I. It's right there at the end of their master.
Their Git now runs its second step for git pull, which is to run git merge origin/master (more or less; it actually works off the raw hash ID at this point, internally). That tells them to add commit H into their chain that ends at commit I. Commit H is already in this chain, so they print:
Already up-to-date.
and then do nothing.
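If you want to reproduce the whole sequence without touching GitHub or your server, something like the following should do it. The repository names hub.git, laptop and server are stand-ins for B, A and C, and --no-rebase is only there to pin git pull to its merge behavior regardless of local configuration.

```shell
set -e
# Throwaway identity so commits work on a machine with no Git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# Hypothetical stand-ins: hub.git = repo B, laptop = repo A, server = repo C.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "H" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "H"
git clone -q --bare seed hub.git
git clone -q hub.git laptop
git clone -q hub.git server

# The laptop publishes commit I, and the server deploys it with git pull (a fast-forward).
echo "I" >> laptop/file.txt
git -C laptop commit -q -am "I"
git -C laptop push -q origin master
git -C server pull -q --no-rebase
commit_i=$(git -C laptop rev-parse HEAD)
commit_h=$(git -C laptop rev-parse HEAD~1)

# The laptop takes I back and force-pushes.
git -C laptop reset -q --hard HEAD~
git -C laptop push -q -f origin master

# The server pulls again: the fetch step is a forced update, the merge step has nothing to do.
git -C server pull --no-rebase

server_master=$(git -C server rev-parse master)
server_tracking=$(git -C server rev-parse origin/master)
echo "server master:        $server_master"
echo "server origin/master: $server_tracking"
```

The server's origin/master ends up at H while its master stays at I, which matches the Action output quoted in the question.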
What's wrong is now obvious (well, as clear as things get in Git)
The problem here is that git merge, the default second command of git pull, is just wrong. You don't want to merge new commits in, on repo C. You want to switch to the latest commit on some branch on repo B. That's recorded in origin/master, in the case of repo B's master.
The git pull command can be told to run a different second command: instead of git merge, you can have it use git rebase. But that's equally wrong, or maybe even "wronger" (if that's a thing). You don't want either of those.
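For comparison, here is a sketch of the same scenario with the fetch plus reset --hard pair from the TL;DR as the server-side step. Again hub.git, laptop and server are invented names for B, A and C. Keep in mind that reset --hard throws away any uncommitted changes in the server's working tree.

```shell
set -e
# Throwaway identity so commits work on a machine with no Git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# Hypothetical stand-ins: hub.git = repo B, laptop = repo A, server = repo C.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "H" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "H"
git clone -q --bare seed hub.git
git clone -q hub.git laptop
git clone -q hub.git server

# Publish commit I and deploy it, then take it back with a forced push.
echo "I" >> laptop/file.txt
git -C laptop commit -q -am "I"
git -C laptop push -q origin master
git -C server pull -q --no-rebase
commit_h=$(git -C laptop rev-parse HEAD~1)
git -C laptop reset -q --hard HEAD~
git -C laptop push -q -f origin master

# What the deployment step could run on the server instead of git pull.
# ('@{upstream}' could replace origin/master to avoid hard-coding the branch name.)
git -C server fetch -q origin
git -C server reset -q --hard origin/master

server_master=$(git -C server rev-parse master)
echo "server master: $server_master"
echo "server file:   $(cat server/file.txt)"
```

This time the server's master and its checked-out file both go back to commit H.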
Git isn't a deployment tool
The ultimate problem here is that you're treating Git as a deployment tool, and it just isn't one. It can be used as one, much as a screwdriver can be used as a chisel, or a drill press can be used to tighten screws. It's just an abuse of the tool, to some extent. You need to be sure you know exactly what you're doing.