To generalize, suppose I have a project with the following directory. How would I ultimately remove file2.txt after pushing and doing a pull request?
app/someFolder
- file1.txt
- file2.txt
- file3.txt
Suppose my commits are these
Commit 1
file1.txt
Hello World
file2.txt
Cool, Superb
file3.txt
December 2
git add .
git commit -m "commit 1"
git push --set-upstream origin someBranchOnRemote
Commit 2
file1.txt
Hello World
Boss Bass
git add .
git commit -m "commit 2"
git push
Commit 3
file3.txt
December 3
git add .
git commit -m "commit 3"
git push
So if I were to do a pull request the files would look like this
file1.txt
Hello World
Boss Bass
file2.txt
Cool, Superb
file3.txt
December 3
Now how would I update the pull request so I can have file2.txt not be included? Suppose the hashes are hash1, hash2 and hash3. The final output I want in the pull request would be
file1.txt
Hello World
Boss Bass
file3.txt
December 3
CodePudding user response:
TL;DR: you want git rebase -i followed by git push --force or git push --force-with-lease. But read the following.
First, a side note: Git itself does not have "pull requests"; those are features of certain hosting sites such as GitHub and Bitbucket. They tend to work similarly on each hosting site, but each site has its own quirks and behaviors here. You may have to adapt this answer for whichever hosting site you're using.
With that out of the way, a PR is a request you make to someone that they merge (or "fetch and merge" = "pull") some commit(s) you've made. In Git, you don't really merge a branch: you actually merge commits. The commits that you will merge, when you run git merge, are those from some chain of commits, as ended by the last commit in that chain.
That is: commits form chains. Each commit in a chain remembers the raw hash ID of its predecessor commit. We say that a commit points to its parent commit, and we can draw that like this:
... <-E <-F <-G <-H
A branch name then simply provides the raw hash ID of the last commit in the chain, from which Git will find all the previous commits:
...--E--F--G--H <-- branch
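The parent links and the role of the branch name can be checked directly. This is a minimal sketch in a throwaway repository; the branch name, file name, and user identity are made up for illustration and are not from the question.

```shell
set -eu
# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git checkout -q -b branch

# Build a short chain of three commits.
for n in 1 2 3; do
    echo "version $n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

tip=$(git rev-parse branch)                        # hash ID stored under the branch name
parent_by_name=$(git rev-parse branch~1)           # one step back from the tip
parent_recorded=$(git log -1 --format=%P branch)   # parent hash ID stored inside the tip commit

echo "branch points to:      $tip"
echo "tip's recorded parent: $parent_recorded"
git log --oneline branch                           # walks backwards from the tip
```

The name only stores the tip; everything earlier is found by following the parent hash IDs recorded in each commit.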
When you go to make a pull request, you:
- begin by forking and/or cloning some repository, so that you get all the commits that someone else has;
- create a new branch name, so that you have a name pointing to the last commit that's also one of their commits;
- make new commits so that your branch name advances.
For instance, let's say that their commits go up through (and then stop at) the commit I was drawing above as E. (By the way, I only stopped drawing arrows between commits out of laziness: commits always point backwards, so any time you see a connecting "line", it's really a backwards-pointing "arrow".)
That is, they have, in their repository, some sequence of commits:
...--D--E <-- somebranch
You now have, in your repository:
...--D--E <-- origin/somebranch
You create a new branch name pointing to commit E:
...--D--E <-- my-fancy-new-feature, origin/somebranch
Now you make new commits while "on" this new branch:
...--D--E <-- origin/somebranch
         \
          F <-- my-fancy-new-feature (HEAD)
This is your "hash 1", or "commit 1", that affects three files. Commit F has all the files in it, as all commits always have a full snapshot of every file, but the files in commit F are all the same as all the files in commit E, except for the three that you changed. (Git cleverly de-duplicates identical files, so that this doesn't take very much space, either.)
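Both claims (full snapshot, shared storage for unchanged files) can be observed in a scratch repository. The sketch below is a simplification: the second commit changes only one file rather than three, and the file contents and user identity are invented for the example.

```shell
set -eu
# Throwaway repository; contents and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git checkout -q -b somebranch

# Stand-in for commit E: three files.
echo "Hello World" > file1.txt
echo "Cool" > file2.txt
echo "December 2" > file3.txt
git add .
git commit -q -m "E"

# Stand-in for commit F: only file1.txt changes.
echo "Boss Bass" >> file1.txt
git add file1.txt
git commit -q -m "F"

# Both commits list all three files ...
git ls-tree --name-only HEAD~1
git ls-tree --name-only HEAD

# ... and the unchanged file is the very same blob object in both.
blob_in_e=$(git rev-parse HEAD~1:file2.txt)
blob_in_f=$(git rev-parse HEAD:file2.txt)
echo "file2.txt in E: $blob_in_e"
echo "file2.txt in F: $blob_in_f"
```

Because the blob hash ID is identical, the second commit reuses the stored file instead of saving another copy.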
Now that commit F exists, you make another new commit G:
...--D--E <-- origin/somebranch
         \
          F--G <-- my-fancy-new-feature (HEAD)
This is your "commit 2", which changes only file1.txt. Commit G still has every file; it's just that its copy of file2.txt matches that of commit F, its copy of file3.txt matches that of commit F, and all its other files match those of commits F and E.
Finally, you add commit H:
...--D--E <-- origin/somebranch
         \
          F--G--H <-- my-fancy-new-feature (HEAD)
In commit H you've replaced file3.txt with a modified file; file1.txt and file2.txt match the copies in commit G, and so on.
That brings us to your question again:
... how would I update the pull request so I can have file2.txt not be included?
Git works on the basis of commits, not files, and your PR says please merge commit H. To change this, you must either:
- somehow change commit H, or
- change the PR so that it lists some other commit hash ID, not H.
It's literally impossible to change anything about any commit, ever, so the first idea is right out.
Whether it's possible to change the PR so that it lists some other commit, depends on the hosting site. If the hosting site is particularly obnoxious, you might have to close this PR, and open a new one later. But GitHub at least will let you update the PR quite simply.
Your first task, though, is to come up with new commits. You don't want file2.txt changed, but it was different in commit F (vs commit E), so commit F itself is bad in some way. This means you need a new replacement for commit F. Let's call this F' to indicate that it's a lot like F, but it will have a different raw hash ID.
To get commit F', we want to "copy" commit F without quite committing yet. We'll start by checking out commit E. We could create another new branch name, but we could also use Git's "detached HEAD" mode, like this:
...--E <-- HEAD, origin/somebranch
      \
       F--G--H <-- my-fancy-new-feature
Now we'll run, say, git cherry-pick -n and give Git commit F's hash ID, or something equivalent: my-fancy-new-feature~2 for instance. Git will copy the effect of F but not commit anything yet (we'll have some work in progress that we can commit), and now we have a chance to undo the change to file2.txt, with, e.g., git restore:
git restore -SW --source=origin/somebranch file2.txt
A quick git status and git diff --cached will show that we've now retained the updated versions of file1.txt and file3.txt, but gone back to the original file2.txt from commit E as found by the name origin/somebranch.
We can now run git commit to make F':
       F' <-- HEAD
      /
...--E <-- origin/somebranch
      \
       F--G--H <-- my-fancy-new-feature
Commit G affects file1.txt only, so we can just copy it wholesale, with git cherry-pick, which will not only figure out what it changed and apply it, but also make a new commit, re-using the original commit's message:
       F'-G' <-- HEAD
      /
...--E <-- origin/somebranch
      \
       F--G--H <-- my-fancy-new-feature
You might wonder why we copy G to G', rather than just using G itself. The answer is simple: nothing about commit G can ever change. The arrow coming out of G, pointing to F, is part of G. It can't change! Commit G will forever point back to commit F, never to commit F'. So we have to copy G.
Also, commit G has the wrong copy of file2.txt in it, of course, which would also force us to copy it; any single reason to copy a commit is enough to force the copy. Note that when we do "copy" G with cherry-pick, Git compares the snapshot in G to that in F to see what changed. Since file2.txt in this pair of commits did not change, Git won't change file2.txt in G' vs F'. So G' will have the same file2.txt as F', and F' has the same file2.txt as E.
Now, for the same reasons, we need to copy H, which we can do with one more git cherry-pick command. The result is:
       F'-G'-H' <-- HEAD
      /
...--E <-- origin/somebranch
      \
       F--G--H <-- my-fancy-new-feature
Now that we have the right commits, all (all?!) we have to do is to get the name my-fancy-new-feature to point to H' instead of H. We can do that in various ways, such as git checkout -B my-fancy-new-feature or git switch -C my-fancy-new-feature. The final result here will be:
       F'-G'-H' <-- my-fancy-new-feature (HEAD)
      /
...--E <-- origin/somebranch
      \
       F--G--H ???
What happens to the F-G-H chain, that Git used to find by looking at the name my-fancy-new-feature? The answer is: nothing happens to it. It's still there. It's just that now, it goes unused. These aren't the commits you're looking for, so we just make sure that these aren't the commits we find.
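Put together, the manual sequence might look like the sketch below. It runs in a throwaway repository where a local branch named somebranch stands in for origin/somebranch, and the starting contents of the three files in commit E are assumptions made up for the demonstration. It needs a Git recent enough to have git restore and git switch.

```shell
set -eu
# Throwaway repository; "somebranch" is a local stand-in for origin/somebranch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git checkout -q -b somebranch

# Commit E (assumed starting contents).
printf 'Hello\n' > file1.txt
printf 'Original\n' > file2.txt
printf 'December 1\n' > file3.txt
git add .
git commit -q -m "E"

# Commits F, G, H, mirroring commits 1-3 from the question.
git checkout -q -b my-fancy-new-feature
printf 'Hello World\n' > file1.txt
printf 'Cool, Superb\n' > file2.txt
printf 'December 2\n' > file3.txt
git commit -q -a -m "commit 1"
printf 'Boss Bass\n' >> file1.txt
git commit -q -a -m "commit 2"
printf 'December 3\n' > file3.txt
git commit -q -a -m "commit 3"
old_tip=$(git rev-parse my-fancy-new-feature)

git checkout -q --detach somebranch              # detached HEAD at E
git cherry-pick -n my-fancy-new-feature~2        # apply F's changes, do not commit
git restore -SW --source=somebranch file2.txt    # put back E's file2.txt
git commit -q -m "commit 1"                      # this is F'
git cherry-pick my-fancy-new-feature~1 my-fancy-new-feature   # G' and H'
git switch -q -C my-fancy-new-feature            # move the name to H'
```

The old tip recorded in old_tip is still a valid commit afterwards; it is simply no longer reachable from the branch name.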
We now have the right commits, locally, in this repository. Now we have to get them to the hosting site, and get the hosting site to update the pull request. To do that on GitHub, we just push the new commits to GitHub, telling the Git over on GitHub to replace the F-G-H commits in its repository with our new F'-G'-H' chain.
Git in general is greedy for commits, so if we just run a regular git push origin my-fancy-new-feature, they (the Git over on GitHub, operating on your repository over there) will reject our attempt to do this. They will say, in effect, No! If I do that I'll lose the F-G-H chain! (As with our own repository, the commits won't be gone, they just won't be findable by the name my-fancy-new-feature any more. But that's enough for them to reject the request.) You'll likely get a suggestion that you pull (i.e., fetch and merge) the commits from GitHub: they don't realize that they got them from you in the first place, and that you're telling them these are the new and improved replacements so you should ditch the old ones in favor of these new-and-improved ones.
To make them realize that, you need some kind of forced-push (not Star Wars style "force", but just the regular English-language meaning). Git has several kinds and you can use any of them here, but --force-with-lease has a safety feature (that shouldn't matter here: if it does, something has gone not-according-to-plan, and the safety feature detects that) and is generally the way to go.
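The rejection and the forced push can be reproduced locally. In this sketch a bare repository in a temporary directory plays the part of the hosting site; the paths, file contents, and commit messages are assumptions for illustration, and a real hosting site may print different messages.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/hosting.git"      # stands in for the hosting site
git init -q "$work/local"
cd "$work/local"
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin "$work/hosting.git"
git checkout -q -b my-fancy-new-feature

echo "first draft" > file.txt
git add file.txt
git commit -q -m "commit 1"
git push -q --set-upstream origin my-fancy-new-feature

# Replace the pushed commit with a new one: same branch name, new hash ID.
echo "second draft" > file.txt
git commit -q -a --amend -m "commit 1, fixed"

# A plain push is refused, because it would drop a commit the remote has.
if git push -q origin my-fancy-new-feature 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# --force-with-lease goes through as long as the remote branch is still
# where our origin/my-fancy-new-feature remembers it being.
git push -q --force-with-lease origin my-fancy-new-feature
```

If someone else had pushed to the branch in the meantime, the lease check would fail and the push would be refused, which is the safety feature mentioned above.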
Making this easy(ish)
The sequence above has lots of Git commands in it, many of them tricky (I didn't show the full commands for multiple reasons). We can reduce that to a smaller number of much-less-tricky commands using git rebase -i. There's still one big bit of trickiness though.
Running:
git switch my-fancy-new-feature
git rebase -i origin/somebranch
is how we start. The rebase operates on the current branch, so we begin by checking out my-fancy-new-feature (you can use git checkout or git switch here, or do nothing if you're already on it).
What rebase does is:
- list out commits to copy (hash IDs);
- use Git's detached HEAD mode to begin copying; and
- start cherry-picking.
Once it's all done, it fixes up the detached HEAD by moving the branch name to the last of the copied commits (H' in our case). So that automates a lot of the hard work.
Rebase in general is what we use when we have some commits that we mostly like, but there is something about those commits that we don't like. Since nothing about any existing commit can change, rebase works by copying the commits. The new copies can be changed along the way, before we commit them.
The interactive rebase in particular gives us more opportunities for change. A plain rebase just copies everything without giving us a chance to fix stuff, which is useful for moving commits, that is, for taking a chain like this:
          A--B--C <-- topic
         /
...--o--o--o--o <-- mainline
and copying it to:
          A--B--C ???
         /
...--o--o--o--o <-- mainline
               \
                A'-B'-C' <-- topic
so that the commits now come at the end of the mainline, instead of sprouting from an earlier point. That's not what we want here: we want to change some of the files in one of the commits.
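For comparison, here is what that plain (non-interactive) move looks like in a scratch repository. The branch names follow the drawing; the file names and contents are invented for the example.

```shell
set -eu
# Throwaway repository; file names and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git checkout -q -b mainline
echo base > base.txt
git add base.txt
git commit -q -m "o1"

# topic sprouts from the current mainline commit and gains A, B, C.
git checkout -q -b topic
for c in A B C; do
    echo "$c" > "$c.txt"
    git add "$c.txt"
    git commit -q -m "$c"
done
old_topic=$(git rev-parse topic)

# mainline moves ahead, so topic now starts from an earlier point.
git checkout -q mainline
echo more > more.txt
git add more.txt
git commit -q -m "o2"

# Copy A-B-C so that the copies come after the tip of mainline.
git checkout -q topic
git rebase -q mainline
new_topic=$(git rev-parse topic)
git log --oneline --graph mainline topic
```

The three commits are copied unchanged apart from their parents, so their hash IDs differ while their contents and messages stay the same.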
So, interactive rebase, instead of just planning out all the cherry-picks and then starting them, writes out an instruction sheet. This instruction sheet lists the cherry-picks, using the word pick for each one:
pick hash1 subject
pick hash2 subject
pick hash3 subject
Then, once the instruction sheet is written, git rebase -i opens an editor on the instruction sheet so that we can change the commands.
In our case, we don't want to just pick commit #1 as-is. We want a chance to change it. So we will change pick to edit. We do want to pick #2 and #3 as is, so we'll leave those alone. Then we write out the instruction sheet and exit the editor,[1] to return to the cherry-pick action.
Because we changed the first pick to edit, Git will cherry-pick the first commit but then stop to let us fix it up. There's one thing that's particularly tricky here: Git has actually made a temporary commit at this point, so when we do fix it up, we have to run git commit --amend.[2] We can now do our git restore as I described earlier, then run git commit --amend:
git restore -SW --source origin/somebranch file2.txt
git commit --amend
(note: --source=origin/somebranch and --source origin/somebranch work the same way here, so you can use either one).
Once we're done fixing up that edit-able commit, we tell Git to resume the rebase:
git rebase --continue
This will finish off all the remaining cherry-picks, then re-arrange the branch name and re-attach our HEAD to our branch, and now we have what we wanted:
       F'-G'-H' <-- my-fancy-new-feature (HEAD)
      /
...--E <-- origin/somebranch
      \
       F--G--H ???
We're now ready to run:
git push --force-with-lease origin my-fancy-new-feature
and if we're talking GitHub, the "update the PR without closing and re-opening" is now done.
We used a total of five or six Git commands, about half of what we needed earlier, and we didn't have to do anything tricky except for the interactive rebase "edit" step. Everything else is pretty straightforward here.
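The whole interactive sequence can be rehearsed in a scratch repository before touching the real branch. The sketch below is non-interactive only so that it can run unattended: it sets GIT_SEQUENCE_EDITOR to a sed command that does what you would do by hand in the editor (change the first pick to edit). The local branch somebranch stands in for origin/somebranch, and commit E's file contents are assumptions.

```shell
set -eu
# Throwaway repository; "somebranch" is a local stand-in for origin/somebranch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git checkout -q -b somebranch

# Commit E (assumed starting contents).
printf 'Hello\n' > file1.txt
printf 'Original\n' > file2.txt
printf 'December 1\n' > file3.txt
git add .
git commit -q -m "E"

# Commits 1-3 from the question.
git checkout -q -b my-fancy-new-feature
printf 'Hello World\n' > file1.txt
printf 'Cool, Superb\n' > file2.txt
printf 'December 2\n' > file3.txt
git commit -q -a -m "commit 1"
printf 'Boss Bass\n' >> file1.txt
git commit -q -a -m "commit 2"
printf 'December 3\n' > file3.txt
git commit -q -a -m "commit 3"

# Scripted replacement for editing the instruction sheet by hand:
# turn the first "pick" into "edit".
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -i somebranch

# Git has stopped after applying commit 1: fix it, amend, and resume.
git restore -SW --source=somebranch file2.txt
git commit -q --amend --no-edit
GIT_EDITOR=true git rebase --continue

git log --oneline somebranch..my-fancy-new-feature
```

On a real branch you would let the editor open normally, and follow up with the git push --force-with-lease shown above.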
[1] Some editors don't exit: you start them up early and then they hang around forever. Examples include many cases of Emacs, Sublime, and Atom. If you're using one of these editors, you have to make them interact well with Git; that's a matter for that particular editor, but most of these editors these days have a --wait flag that arranges all of that to work right.
[2] The --amend flag seems to change a commit, contradicting the claim above that we can't change any commit. The dirty secret here is that --amend doesn't change a commit. Instead, it just makes yet another commit. So when we use edit followed by git commit --amend, we generate extra, rather pointless, "trash" commits. But commits in Git are so small and cheap that it's better to do this than to avoid it. Git will eventually clean up after itself, though this generally takes more than a month. The cleanup / janitorial stuff that Git does is a bit slow, but sweeping up a month's worth of trash in 5 minutes, rather than cleaning up every bit of trash right away, is actually a highly practical tradeoff.
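That --amend leaves the original commit in place can be checked in a scratch repository. The file name and contents here are made up for the demonstration.

```shell
set -eu
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo draft > note.txt
git add note.txt
git commit -q -m "draft"
before=$(git rev-parse HEAD)

# "Change" the commit: new content, new message.
echo final > note.txt
git commit -q -a --amend -m "final"
after=$(git rev-parse HEAD)

echo "before amend: $before"
echo "after amend:  $after"

# The original commit object is still in the repository, content intact.
git cat-file -t "$before"
git show "$before:note.txt"
```

The branch now points at the new commit, while the old one lingers unreferenced until Git's cleanup eventually removes it.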
CodePudding user response:
According to your requirements, I would just delete the file, make a new commit out of it and push it to your branch. That way there would be a 4th commit in the PR removing the file and the final result would be as you say.
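A sketch of that approach, assuming file2.txt was first added on the feature branch; the base branch name, the README placeholder, and the commit message are made up for the example.

```shell
set -eu
# Throwaway repository; branch and file names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git checkout -q -b somebranch
echo placeholder > README
git add README
git commit -q -m "base"

# Commits 1-3 from the question, with all three files new on this branch.
git checkout -q -b my-feature
printf 'Hello World\n' > file1.txt
printf 'Cool, Superb\n' > file2.txt
printf 'December 2\n' > file3.txt
git add .
git commit -q -m "commit 1"
printf 'Boss Bass\n' >> file1.txt
git commit -q -a -m "commit 2"
printf 'December 3\n' > file3.txt
git commit -q -a -m "commit 3"

# Fourth commit: remove the file; a normal (non-forced) push would then do.
git rm -q file2.txt
git commit -q -m "remove file2.txt"

# Files that differ between the base and the branch tip.
git diff --name-only somebranch my-feature
```

No force-push is needed this way, but file2.txt stays visible in the branch's earlier commits unless the hosting site squashes them on merge.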