Convenient way to push to a pull request opened by someone else

Time:10-06

I'm a maintainer of a project on GitHub, and we continuously get PRs. Usually I use the commands shown by GitHub to get and test the PR:

git checkout -b USER-master master
git pull https://github.com/USER/REPO.git master

However, when I want to push a commit I need to type:

git push https://github.com/USER/REPO.git USER-master:master

I was about to create an alias which can be used like

git pr https://github.com/USER/REPO.git master

It would create a new branch (as GitHub suggests) and set that branch to track the source branch of the PR. That would let me simply call git push.

To set the tracking branch I've tried:

git branch -u https://github.com/USER/REPO.git/master

And got:

error: the requested upstream branch 'https://github.com/USER/REPO.git/master' does not exist
hint: 
hint: If you are planning on basing your work on an upstream
hint: branch that already exists at the remote, you may need to
hint: run "git fetch" to retrieve it.
hint: 
hint: If you are planning to push out a new local branch that
hint: will track its remote counterpart, you may want to use
hint: "git push -u" to set the upstream config as you push.

I assume that the many slashes in the URL confuse Git, since the branch name is also separated from the remote by a /.

It could work if I added a new remote with

git remote add USER-REPO https://github.com/USER/REPO.git

but I'd like to avoid adding so many remotes.

Any ideas on the git branch -u?

Answer:

TL;DR

You'll probably want to set up more than one remote. Once you do, though, things get messy. When and whether to set the upstream of a local branch you made from a PR is up to you; you might not want to do that at all. How fancy you want to make any of your scripts is also up to you. You may end up with a so-called triangular workflow.

Long

This is a bit complicated, so let's start with this: the -u option to git push sets an upstream. Each branch name in your own repository can have one (1) upstream, or no upstream at all; those are your only options.

Git's terminology (and GitHub's for that matter) gets confusing here. The upstream I'm talking about above is a branch setting; git branch --set-upstream-to sets this setting; git branch --unset-upstream removes this setting. In all cases, this setting applies to a branch name in your repository, e.g., in a repository on your laptop. This is a local repository, not one on GitHub.

When we use GitHub, we get things called pull requests. These are not part of Git itself! Git has a command, git request-pull, but it just generates an email message. It doesn't even send the email message, it just prints one out, suitable for you to put into some email-generating software. So GitHub pull requests are specific to GitHub.

When we do get pull requests, though, we often want our own Git, on our laptop, working with our repository, to connect to GitHub. We may store a copy of our repository on GitHub: https://github.com/me/my.git or whatever. We tend to call this origin, as that's the first standard remote.

We already have three or four bits of terminology:

  • branch name: this is a name for a commit. Each repository, wherever it is stored, has its own branch names. A branch name in your repository on your laptop has one special feature, which is that when you use it with git checkout or git switch, you wind up on the branch. What this means is that as you make new commits, Git stores, into this branch-name, the new commit hash ID. The new commit links backwards, in the usual way that commits do, to what was the latest commit, before you just made a new one.

  • upstream: this is a string associated with a branch name in our (laptop) repository. It's actually a two-part string—we'll see more on this later—but we usually see it as a simple string, like origin/main for main, origin/master for master, and origin/feature/tall for feature/tall.

  • remote: This is simply a name for a URL. The word origin, which appears in our upstream settings for our branches, is the first standard remote. The URL stored under origin is typically that for our copy of some Git repository (the one we own or maintain directly), which is typically on some hosting site these days (GitHub, GitLab, Bitbucket, whatever).

  • pull request (or, for GitLab, merge request): a hosted-site method by which someone can ask us, as the owner or maintainer of a repository, to put new commits into our hosted repository. The details here vary depending on the hosting site. Because you're using GitHub, I will use GitHub in the examples here.

Now we get to add more terminology. Because people contributing to open-source software tend not to have push access to the maintainer's repository, they will use GitHub forks. A GitHub fork is a server-side git clone operation that secretly[1] saves a lot of disk space on GitHub. Meanwhile, using a fork gives you, as the user of the fork, a feature that you can't get otherwise. Note that the pronoun you here actually refers to some other person; let's say it's a "he" and call him Fred. You are the maintainer, so Fred creates a fork of your GitHub-side repository (the one you call origin) and Fred then uses his fork to do his work.

Eventually, Fred pushes some new commit(s) to Fred's GitHub repository, ssh://git@github.com/fred/repo.git or whatever. You now want to get these commits, so that you can inspect them closely.[2]

You need your laptop Git to be able to refer to Fred's fork. This isn't origin—that's your fork—so you need to use another URL. You can, as you have been doing, type in the raw URL each time: https://github.com/fred/repo.git or whatever.

Don't do that! It's a pain. You'll make typos. Well, maybe do it if it's a one-time thing, but if you find that you have to refer to Fred's fork a lot, create another remote.

Now, GitHub encourages us, for some reason,[3] to use the name upstream as the second standard remote name here. That word is already a bit of terminology. You can use it, but I suggest that you don't, as it just gets confusing. Since we're talking about Fred here, let's use the name fred for this remote:

git remote add fred https://github.com/fred/repo.git

You can now use fred anywhere you were using https://github.com/fred/repo.git. That may be encouragement enough—but this also solves your issue.
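As a minimal sketch of what the named remote buys you, the script below fetches from and pushes to Fred's repository by name. It assumes two throwaway local repositories under a temporary directory as stand-ins for the GitHub ones; the paths and the branch name fred-master are placeholders, not part of the original setup.

```shell
set -e
work=$(mktemp -d)                      # throwaway sandbox; stands in for GitHub
git init -q --bare "$work/fred.git"    # placeholder for https://github.com/fred/repo.git

# Seed Fred's repository with one commit on master.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git -c user.name=Fred -c user.email=fred@example.com commit -q --allow-empty -m "Fred's work"
git push -q "$work/fred.git" master

# Your laptop repository: add the remote once, then use the short name.
git init -q "$work/me"
cd "$work/me"
git remote add fred "$work/fred.git"
git fetch -q fred                                  # instead of: git pull <long URL> master
git checkout -q -b fred-master fred/master
git -c user.name=Me -c user.email=me@example.com commit -q --allow-empty -m "maintainer fixup"
git push -q fred fred-master:master                # instead of: git push <long URL> ...

pushed=$(git -C "$work/fred.git" rev-parse refs/heads/master)
local_tip=$(git rev-parse fred-master)
echo "fred's master is now at $pushed"
```

On GitHub the push only succeeds if Fred left "allow edits from maintainers" enabled on the PR; the local sandbox has no such restriction.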


[1] This isn't really secret, but you don't have to know about it.

[2] You could use GitHub's inspection options, but sometimes they're inadequate. See any of Linus Torvalds's complaints, for instance. Sometimes they are good enough, and in those cases, using them really simplifies your job, so consider using them. But here we won't do that.

[3] The reason is mostly that the "us" GitHub is trying to get through to are the amateurs who are making forks and contributing just one thing back. Unfortunately, by using the word upstream here, they're doing us a big disservice.


Using multiple remotes allows setting upstreams

The reason that:

git branch -u https://github.com/USER/REPO.git/master

failed is that the upstream of a branch is made up of two parts:

  • First, there's the remote part. This has to be an actual remote.
  • After that there's a branch part. This is the name of the branch as seen on that remote Git.

The two parts are separated by a slash. Putting these two parts together gives us something that looks like a remote-tracking name. Since we tend to copy their branches (Fred's feature/tall) as our remote-tracking names of the same name (fred/feature/tall), we'll be able to do:

git branch --set-upstream-to=fred/feature/tall fred-pr-123

for instance.
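To see the two parts directly, here is a small sketch that sets an upstream and then reads back what Git stored. The local directories stand in for Fred's fork and your laptop clone, and the branch name fred-pr-123 is only an illustrative choice.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/fred"               # stand-in for Fred's fork
cd "$work/fred"
git symbolic-ref HEAD refs/heads/feature/tall
git -c user.name=Fred -c user.email=fred@example.com commit -q --allow-empty -m "tall feature"

git init -q "$work/me"                 # stand-in for your laptop repository
cd "$work/me"
git remote add fred "$work/fred"
git fetch -q fred                      # creates the remote-tracking name fred/feature/tall
git checkout -q --no-track -b fred-pr-123 fred/feature/tall

# The upstream comes first, then the local branch whose setting is changed.
git branch -q --set-upstream-to=fred/feature/tall fred-pr-123

up_remote=$(git config branch.fred-pr-123.remote)    # the remote part
up_merge=$(git config branch.fred-pr-123.merge)      # the branch part, as named on fred
up_short=$(git rev-parse --abbrev-ref 'fred-pr-123@{upstream}')
echo "remote=$up_remote merge=$up_merge shown-as=$up_short"
```

Note that the remote-tracking name must already exist, which is why the git fetch comes before the git branch call.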

A bit of craziness in upstreams

Here there is a kink in the system, though. Fortunately you mostly don't need to care about it. Unfortunately, GitHub's naming scheme for Pull Requests may make you care about it, depending on how much you care to care about it. (Ahem.) Anyway: the branch's name as seen on the upstream isn't required to match the remote-tracking name you use.

What does all this mean? Well, we have to go to the mechanism here. When you run:

git fetch fred

your Git will read your repository's configuration to find the remote.fred.fetch settings. These control what gets fetched, and the standard setting is:

+refs/heads/*:refs/remotes/fred/*

This is a refspec. It starts with a leading plus sign (+), which means force: your remote-tracking names will be forcibly updated, so that if Fred rebased one of his branches, you'll pick up the rebase, for instance. Then it lists refs/heads/* as the source part of the refspec, telling your Git "get all of his branches". It lists refs/remotes/fred/* as the destination part of the refspec, telling your Git "change those branch names into my remote-tracking names". This adds fred/ in front of each name, and also puts them in refs/remotes/, so that they are remote-tracking names and are associated with your remote fred and not any other remote. This way you can add barney, wilma, and betty as three more remotes, if that's appropriate, and you'll keep all their branches distinct in your remote-tracking names.
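You can check the standard setting yourself without contacting any server. This sketch uses a scratch repository in a temporary directory; git remote add only writes configuration, so the URL is never fetched here.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/me"
cd "$work/me"

# Adding a remote records its URL and the default fetch refspec.
git remote add fred https://github.com/fred/repo.git

refspec=$(git config --get-all remote.fred.fetch)
echo "$refspec"    # prints: +refs/heads/*:refs/remotes/fred/*
```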

So that's the normal setup. But the remote.fred.fetch refspec(s) do not have to be normal. We can, if we like, write:

+refs/heads/master:refs/remotes/fred/scooby
+refs/heads/hairy:refs/remotes/fred/shaggy

This is a "two-branch remote" (analogous to a single-branch remote, but listing two branches): we copy Fred's master to our remote-tracking name fred/scooby, and his hairy to our remote-tracking name fred/shaggy. Why? Just for illustration, really. The upstream setting to refer to Fred's hairy is fred/shaggy, but it's made up of two parts: fred and hairy.

If you use git branch --set-upstream-to, you don't have to think about all this mapping forwards and backwards. If you use git config or git config --edit to reach directly into your .git/config file to fiddle with upstream settings, you will see all this craziness. If you keep the fetch refspec simple, the craziness disappears, though.
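The following sketch reproduces the scooby/shaggy mapping in a sandbox, to show what ends up in the configuration. The directories are local stand-ins for Fred's fork and your repository, and the local branch name work is made up for the example.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/fred"               # stand-in for Fred's fork
cd "$work/fred"
git symbolic-ref HEAD refs/heads/master
git -c user.name=Fred -c user.email=fred@example.com commit -q --allow-empty -m "base"
git branch hairy

git init -q "$work/me"
cd "$work/me"
git remote add fred "$work/fred"
# Replace the default refspec with the two renaming refspecs from the text.
git config --replace-all remote.fred.fetch '+refs/heads/master:refs/remotes/fred/scooby'
git config --add remote.fred.fetch '+refs/heads/hairy:refs/remotes/fred/shaggy'
git fetch -q fred

git checkout -q --no-track -b work fred/shaggy
git branch -q --set-upstream-to=fred/shaggy work

stored_remote=$(git config branch.work.remote)   # remote part
stored_merge=$(git config branch.work.merge)     # Fred's own name for the branch
shown=$(git rev-parse --abbrev-ref 'work@{upstream}')
echo "stored: $stored_remote + $stored_merge, shown as: $shown"
```

The stored branch part is Fred's name (hairy), while the upstream is displayed under your remote-tracking name (fred/shaggy); git branch --set-upstream-to does the translation for you.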

Why I covered the craziness at all

If you stick with branch@{upstream} to refer to the upstream of the given branch, and use git branch --set-upstream-to to set the upstream, you hide any craziness. If you didn't make any craziness, there isn't any craziness to hide. But here's the hitch: when someone sends you a pull request via GitHub, this appears in your GitHub repository under a ref built from the PR number:

refs/pull/123/head

This is not a branch name. It's not a remote-tracking name. It's a name GitHub made up. If you want to get this commit from your GitHub fork, rather than from Fred's, you can do that:

git fetch origin refs/pull/123/head:refs/heads/pr-123

This will create, in your local repository, a branch named pr-123 (refs/heads/pr-123) that names the commit that Fred wants you to put in. That commit might have the name refs/heads/feature/tall in Fred's fork, but it has the name refs/pull/123/head in your fork.
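Here is a sandboxed sketch of that fetch. A local bare repository plays the role of your GitHub repository, and the refs/pull/123/head ref is faked with an ordinary push; that is an assumption of the demo only, since GitHub creates this ref itself and rejects pushes to it.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"  # stand-in for your repository on GitHub

# Seed master, then fake the ref GitHub would create for PR 123.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git -c user.name=Me -c user.email=me@example.com commit -q --allow-empty -m "base"
git push -q "$work/origin.git" master
git -c user.name=Fred -c user.email=fred@example.com commit -q --allow-empty -m "Fred's change"
git push -q "$work/origin.git" HEAD:refs/pull/123/head

# Your laptop repository.
git init -q "$work/me"
cd "$work/me"
git remote add origin "$work/origin.git"
git fetch -q origin
git fetch -q origin refs/pull/123/head:refs/heads/pr-123

pr_subject=$(git log -1 --format=%s pr-123)
echo "pr-123 points at: $pr_subject"
```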

Some people like to automatically create pr-number branches in their own repositories based on PRs made to their GitHub fork, and you can do that easily by adding another fetch refspec:

+refs/pull/*/head:refs/heads/pr-*

Your Git will take your GitHub fork's refs/pull/*/head, match the * part up to the PR number, and copy that to your own repository as refs/heads/pr-whatever.

(I don't like to do this automatically myself as it can create a lot of branches. Note that old versions of Git can't handle pr-* here and require instead pr/* or similar. Either way you'll probably want to set fetch.prune to true too, if you haven't already done that.)
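A sketch of the automatic variant, again in a sandbox: the extra refspec is added with git config --add, and a plain git fetch then creates one local branch per PR ref. The local bare repository and the two faked PR refs (123 and 124) are assumptions of the demo, and the pr-* pattern needs a Git recent enough to accept it.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"  # stand-in for your repository on GitHub

git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git -c user.name=Me -c user.email=me@example.com commit -q --allow-empty -m "base"
git push -q "$work/origin.git" master
for n in 123 124; do                   # fake two PR refs; GitHub makes these itself
    git -c user.name=Fred -c user.email=fred@example.com commit -q --allow-empty -m "PR $n"
    git push -q "$work/origin.git" "HEAD:refs/pull/$n/head"
done

git init -q "$work/me"
cd "$work/me"
git remote add origin "$work/origin.git"
git config --add remote.origin.fetch '+refs/pull/*/head:refs/heads/pr-*'
git config fetch.prune true            # drop pr-* branches whose PR ref disappears
git fetch -q origin

prs=$(git for-each-ref --format='%(refname:short)' 'refs/heads/pr-*')
echo "$prs"
```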

If you do do this, you're letting some of the craziness show through. You literally can't push to the refs/pull namespace, so it isn't a big problem, but you'll see weirdness. Because of this dual mapping trick, this also means you can't name your own fork's branches pr-*: try it, and the craziness shines blindingly bright. (The mappings stop being one-to-one onto / bijective / invertible, but they need to be.)

Multiple remotes and triangular workflows

If you choose to fetch from Fred's fork directly, and you have multiple remotes now (fred and origin), you will:

  • fetch from Fred's fork;
  • maybe do something to adjust the commits; and
  • push to your fork.

This is, by definition, what Git calls a triangular workflow. To make it easier, Git allows you to set separate fetch and push remotes. You can do this overall, or per-branch.
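As a rough sketch of the per-branch variant, the script below fetches from fred and lets a bare git push go to origin. The repositories are local stand-ins, and using the same branch name on both sides is a choice made for the demo.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"  # stand-in for your repository on GitHub
git init -q "$work/fred"               # stand-in for Fred's fork
cd "$work/fred"
git symbolic-ref HEAD refs/heads/feature/tall
git -c user.name=Fred -c user.email=fred@example.com commit -q --allow-empty -m "tall feature"

git init -q "$work/me"
cd "$work/me"
git remote add origin "$work/origin.git"
git remote add fred "$work/fred"
git fetch -q fred
git checkout -q --track -b feature/tall fred/feature/tall   # upstream: fred/feature/tall

# Per-branch push remote; "git config remote.pushDefault origin" would
# set the same thing for every branch at once.
git config branch.feature/tall.pushRemote origin
git push -q                            # no arguments: goes to origin, not fred

fetch_side=$(git rev-parse --abbrev-ref 'feature/tall@{upstream}')
origin_tip=$(git -C "$work/origin.git" rev-parse refs/heads/feature/tall)
local_tip=$(git rev-parse feature/tall)
echo "fetches from $fetch_side, pushed $origin_tip to origin"
```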

Note that you may lose some features. For instance, if there's both an origin/feature/tall and a fred/feature/tall and you have no feature/tall branch in your own repository yet, and you run:

git checkout feature/tall

you might expect the DWIM feature to kick in here: your Git will notice that you don't have a feature/tall so it will create it from origin/feature/tall, with the upstream pre-set. But this doesn't work when there are two matches, and now there are two matches.

Modern Git has a new workaround, the preferred remote setting in checkout.defaultRemote. If your Git is older, you can upgrade, or you can use git checkout -t or git switch -t:

git checkout -t origin/feature/tall

will create feature/tall from origin/feature/tall by stripping the remote part from the upstream. The upstream is set from the name you listed.
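The ambiguity and the checkout.defaultRemote workaround can be tried out in a sandbox like this one. Both remotes are local stand-in repositories, and the script assumes a Git new enough to know the checkout.defaultRemote setting.

```shell
set -e
work=$(mktemp -d)
for who in origin fred; do             # two stand-in remotes, each with feature/tall
    git init -q "$work/$who"
    git -C "$work/$who" symbolic-ref HEAD refs/heads/feature/tall
    git -C "$work/$who" -c user.name="$who" -c user.email="$who@example.com" \
        commit -q --allow-empty -m "tall, from $who"
done

git init -q "$work/me"
cd "$work/me"
git remote add origin "$work/origin"
git remote add fred "$work/fred"
git fetch -q --all

# Two candidates, so a bare "git checkout feature/tall" cannot guess.
if ! git checkout -q feature/tall 2>/dev/null; then
    echo "ambiguous: both origin/feature/tall and fred/feature/tall match"
fi

git config checkout.defaultRemote origin   # prefer origin when several remotes match
git checkout -q feature/tall
picked=$(git rev-parse --abbrev-ref 'feature/tall@{upstream}')
echo "feature/tall now tracks $picked"
```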

You may also find that the default push.default setting, simple, becomes constraining. The current setting may be more suitable for you. This is all a matter of taste anyway, though.
