I've been trying to learn more about git recently. I wanted to know what a remote is and found answers to that easily enough. Then I ran git remote -v and it listed two entries with identical URLs, one followed by (fetch) and the other by (push):
origin http://someurl_git/application (fetch)
origin http://someurl_git/application (push)
I'm simply wondering how it got in that state? From what I read in another thread it's not normal to have two identical urls. I'd like to try and trace back and figure out how it got that way. And what issues this could cause.
Then I ran:
git remote show origin
And it listed this:
* remote origin
Fetch: origin http://someurl_git/application
Push: origin http://someurl_git/application
HEAD branch: master
Remote branches:
BranchA tracked
develop tracked
BranchB tracked
Local branches configured for 'git pull':
develop merges with remote develop
master merges with remote master
Local refs configured for 'git push':
develop pushes to develop (local out of date)
master pushes to master (up to date)
My questions are:
Why is there a Fetch and a Push above with the same url?
What is the correlation between 'refs' and 'branches'? Does ref just mean the remote branch that my push will affect? Whereas branch in the local context just means... 'the local branch a pull will affect' ?
And what does " (local out of date)" and ("up to date") mean? Does that just mean any changes in my local repo that are not reflected in the remote repo? But why does it say "up to date" for pushes to master? I would think it'd say the same "remote/local out of date" also.
I feel like I'm missing some terminology but if that's all that's ok, I just want to avoid any issues as more developers are involved now.
Thanks
CodePudding user response:
1. Why is there a Fetch and a Push above with the same url?
Because in some setups you need to fetch from one source but push to another. When that is not the case (the default), both entries show the same URL. That is normal.
2.a What is the correlation between 'refs' and 'branches'? Does ref just mean the remote branch that my push will affect? Whereas branch in the local context just means... 'the local branch a pull will affect' ?
Refs are named pointers to commits. The two kinds you will meet most often are branches and tags; remote-tracking names such as origin/master are refs as well.
3. And what does " (local out of date)" and ("up to date") mean? Does that just mean any changes in my local repo that are not reflected in the remote repo? But why does it say "up to date" for pushes to master? I would think it'd say the same "remote/local out of date" also.
"(local out of date)" means that the remote branch has new commits still unknown to your local branch, whereas "(up to date)" means both point at the same commit.
CodePudding user response:
Romain Valeri's answer covers your three questions directly, but it sounds like you need some more background information.
... I wanted to know what a remote is ...
I like to say that it's mainly a short name for a URL. That glosses over a lot of details, though. It is a name (how long is up to you; make it short so that it's easy to type) and it should store at least one URL. It can store two URLs quite easily, though: one is the "fetch" URL, and the other is the "push" URL. You set these with git remote set-url or using git config. Each remote's URL is stored under a unique key. If the remote is named origin, as the first one normally is, this key is:
remote.origin.url
for the fetch URL, and:
remote.origin.pushurl
for the push URL. If the push URL is not set (that is, if there is no remote.origin.pushurl key), the push URL is automatically the same as the fetch URL.
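As a rough sketch of that fallback, the following sets up a throwaway repository and asks Git for both URLs before and after a separate push URL is configured. The example.com URLs and the temporary directory are assumptions made up for illustration; nothing is contacted over the network.

```shell
#!/bin/sh
# Sketch: the push URL falls back to the fetch URL until pushurl is set.
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"

# Hypothetical URL, for illustration only.
git -C "$tmp/repo" remote add origin https://example.com/fetch/application.git

fetch_url=$(git -C "$tmp/repo" remote get-url origin)
push_before=$(git -C "$tmp/repo" remote get-url --push origin)

# Setting a separate push URL creates the remote.origin.pushurl key.
git -C "$tmp/repo" remote set-url --push origin https://example.com/push/application.git
push_after=$(git -C "$tmp/repo" remote get-url --push origin)
pushurl_key=$(git -C "$tmp/repo" config --get remote.origin.pushurl)

printf 'fetch URL:            %s\n' "$fetch_url"
printf 'push URL (before):    %s\n' "$push_before"
printf 'push URL (after):     %s\n' "$push_after"
printf 'remote.origin.pushurl %s\n' "$pushurl_key"

rm -rf "$tmp"
```

Before the set-url --push call, the fetch and push URLs should be identical, which is the situation shown in the question.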
Configuration keys go in files like .git/config and $HOME/.gitconfig or $HOME/.config/git/config or similar (the precise location of configuration files is system-dependent). They have this dot-separated form when we use the git config command to query or set them, and for most other uses, but inside the configuration files, they are stored in a modified INI file format. The modified format that Git uses shows up like this:
[remote "origin"]
url = ssh://[email protected]/user/repo.git
fetch = refs/heads/*:refs/remotes/origin/*
for instance. This pair of entries under the [remote "origin"] section header defines remote.origin.url and remote.origin.fetch.
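To see the two notations side by side, here is a small sketch that sets both keys with git config in a scratch repository and then prints the resulting file. The URL is a made-up example.com placeholder, and the leading + in the fetch value is what git clone normally writes; both are assumptions for the demonstration.

```shell
#!/bin/sh
# Sketch: dotted keys on the command line become an INI-style section on disk.
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"

# Hypothetical URL, for illustration only.
git -C "$tmp/repo" config remote.origin.url https://example.com/user/repo.git
git -C "$tmp/repo" config remote.origin.fetch '+refs/heads/*:refs/remotes/origin/*'

# The file now holds a [remote "origin"] section with url and fetch entries.
config_text=$(cat "$tmp/repo/.git/config")
printf '%s\n' "$config_text"

# Reading the value back uses the dotted form again.
url=$(git -C "$tmp/repo" config --get remote.origin.url)
printf 'remote.origin.url = %s\n' "$url"

rm -rf "$tmp"
```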
More about config files
Some key-value pairs in Git are single-valued. For instance, the user.name and user.email settings work this way. When you run git commit to make a new commit, Git runs the internal equivalent of git config --get user.name to get your name.
Suppose your global configuration file says:
[user]
name = Fred Flintstone
email = [email protected]
but your local (in-Git-repository) .git/config file says:
[user]
name = Barney Rubble
(with no setting for user.email). The result of running:
git config --get user.name
will be Barney Rubble, because the local setting, which is set here, overrides the global setting. The result of running:
git config --get user.email
will be [email protected] because there's no local setting, so the global setting is the only one available.
Other key-value pairs can be multi-valued. In this case, each setting is visible, but the Git program that's using these settings must run the internal equivalent of git config --get-all rather than git config --get. With the above configuration files, running:
git config --get-all user.name
would spill out two lines:
Fred Flintstone
Barney Rubble
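The Fred and Barney setup can be reproduced safely with a sketch like the one below. It points HOME at a temporary directory so that your real global configuration is left alone; that isolation, the scratch repository, and the fred@example.com address are all assumptions added for the demonstration.

```shell
#!/bin/sh
# Sketch: --get returns the last (most local) value, --get-all returns every value.
set -eu
tmp=$(mktemp -d)

# Isolate from real settings: temporary HOME, no system-wide config.
export HOME="$tmp/home"
export GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL
mkdir -p "$HOME"

git init -q "$tmp/repo"

# "Global" settings land in the temporary $HOME/.gitconfig.
git config --global user.name "Fred Flintstone"
git config --global user.email "fred@example.com"   # hypothetical address

# Local setting, stored in the repository's .git/config.
git -C "$tmp/repo" config user.name "Barney Rubble"

name=$(git -C "$tmp/repo" config --get user.name)
email=$(git -C "$tmp/repo" config --get user.email)
all_names=$(git -C "$tmp/repo" config --get-all user.name)

printf -- '--get user.name:      %s\n' "$name"
printf -- '--get user.email:     %s\n' "$email"
printf -- '--get-all user.name:\n%s\n' "$all_names"

rm -rf "$tmp"
```

The --get-all output lists the global value first and the local value last, which is why --get, taking the last one, reports the local name.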
Without looking at each individual program, there is no way to know whether it will use git config --get or git config --get-all (or the internal equivalents). So you never know, if you set a configuration item more than once, which commands might use all the settings, and which ones will use only the last setting, unless they tell you in their documentation. Sometimes you can find out by experiment. You also cannot tell, without documentation, what settings might be used now or in the future. For instance, you can run:
git config --global french.city paris
right now. Nothing in Git uses this setting today so the fact that you've added:
[french]
city = paris
to your global Git configuration affects no Git program today. Should some Git program start using that setting next year, however, well, you've set it, so it will do whatever that program means for it to do in 2023.
The fact that these settings are free-form makes it ridiculously easy to introduce typos and never know you have done so. Always check settings carefully!
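A quick way to convince yourself of this is to write a deliberately misspelled key into a scratch file. The sketch below uses git config --file on a temporary path, so no real configuration is touched; the misspelling user.nmae and the file location are invented for the example.

```shell
#!/bin/sh
# Sketch: git config accepts a misspelled key without complaint.
set -eu
tmp=$(mktemp -d)
cfg="$tmp/config"   # scratch config file, assumed for this demo

# Typo on purpose: "nmae" instead of "name". Git stores it anyway.
git config --file "$cfg" user.nmae "Fred Flintstone"

# Listing the file is one way to spot the mistake.
listed=$(git config --file "$cfg" --list)
printf '%s\n' "$listed"

# The intended key is still unset; --get exits non-zero for a missing key.
if git config --file "$cfg" --get user.name >/dev/null; then
    name_is_set=yes
else
    name_is_set=no
fi
printf 'user.name set: %s\n' "$name_is_set"

rm -rf "$tmp"
```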
The git remote program
The git remote program is, like a lot of Git commands, rather complicated. It started out simple, but it grew over time. It can:
- show, add, rename, or delete some remote;
- show or set the URL(s) for a remote;
- show or set the fetch line(s) for a remote, which are multi-valued;
- run git fetch to a single remote or multiple remotes;
- "prune" a remote (this is similar to git fetch -p but slightly different);
- set or delete the "head" of any given remote; or
- "show" information about a particular remote.
Most of these operations do not call up the remote in question, i.e., they don't reach out over the network to tickle another set of Git software to look at some other Git repository. The exceptions to this rule are:
- update (fetch), which has to call up the remote to find new commits or other new objects that may exist there;
- prune, which is kind of like fetching;
- set-head with the --auto option; and
- show.
The last one, git remote show, optionally calls up the remote. The default is that it should do so, if given the name of a remote; the -n flag prevents this.
By calling up the remote, the git remote show command can find out which branch names exist on the remote. (git ls-remote calls up the remote as well; try it sometime, but note that it may produce a lot of output and you might want to pipe it through a pager.) This helps git remote show figure out what git fetch or git push would do, if you ran git fetch or git push right now.
When used with -n, the git remote show command can't actually achieve that. This changes the output somewhat, since git remote show knows that it doesn't know what git fetch or git push would do if run with that remote name.
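The difference between asking the remote and not asking it can be tried out without any network at all. In this sketch a local bare repository stands in for the remote; the directory names, the branch name main, and the placeholder identity are assumptions for the demonstration only.

```shell
#!/bin/sh
# Sketch: git ls-remote queries the remote; git remote show -n does not.
set -eu
tmp=$(mktemp -d)

# A local bare repository plays the role of the remote.
git init -q --bare "$tmp/remote.git"

# A working repository with one commit on a branch named main.
git init -q "$tmp/work"
git -C "$tmp/work" symbolic-ref HEAD refs/heads/main
git -C "$tmp/work" -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "first commit"
git -C "$tmp/work" remote add origin "$tmp/remote.git"
git -C "$tmp/work" push -q origin main

# Calls up the remote and lists the refs it has right now.
remote_refs=$(git -C "$tmp/work" ls-remote origin)
printf '%s\n' "$remote_refs"

# Works purely from local configuration; parts of the report are marked
# as not queried because the remote was never contacted.
offline=$(git -C "$tmp/work" remote show -n origin)
printf '%s\n' "$offline"

rm -rf "$tmp"
```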
Atomicity
There is, however, something I consider a fairly big problem with git remote show without -n. Suppose you have cloned a highly active repository (such as the Git repository for Git or for the Linux kernel). You're curious what git fetch origin might bring over, if you run it now, so you run:
git remote show origin
Your Git calls up another Git over at GitHub or kernel.org or wherever, has a fairly long conversation with that other software, and eventually spills out some statistics. This takes many seconds, and then you read through those same statistics also for many seconds, or maybe you come back after a lunch break to read them, or something.
You then decide ah, I want _____ (fill in some blank here), so you run git fetch origin ______ ... and you get something very different from what git remote show showed you, because in that long delay between the time you looked to see what you would get, and the time you actually ran git fetch to get things, everything changed.
This doesn't mean that the act of looking first is necessarily bad. It works fine for highly-inactive repositories, for instance. It's just not guaranteed to reflect the future reality. If you want to know what will happen, you have to do the thing, whatever it is, and then see what did happen. In the case of Git, that means running git fetch.
This sort of thing is one of many reasons I dislike git pull. With git pull, you run git fetch and then without looking to see what did happen you make the assumption that something you expected to happen did happen, and you run a second Git command, git merge or git rebase. That second command works with what did happen. It does not work with what you expected unless you got lucky and got exactly what you expected.
If you know a lot about the repository from which you're going to fetch-and-<second-command>, that's reasonable enough. But if you don't know that much, instead of using git remote show and then git pull, I personally prefer to run git fetch, then see what I got, and only then run a second command (or not). But this does vary, case-by-case, from one Git repository to another (and also based on how long it's been since I poked around with it).
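The fetch, look, then decide sequence might look roughly like this. The sketch builds two scratch clones around a local bare repository so that there is something new to fetch; the names alice and bob, the branch name main, the helper commit_in, and the placeholder identity are all hypothetical.

```shell
#!/bin/sh
# Sketch: fetch first, inspect what arrived, and only then merge.
set -eu
tmp=$(mktemp -d)

# Hypothetical helper: make an empty commit in repository $1 with message $2.
commit_in() {
    git -C "$1" -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$2"
}

# A local bare repository stands in for the shared remote.
git init -q --bare "$tmp/remote.git"
git -C "$tmp/remote.git" symbolic-ref HEAD refs/heads/main

# "alice" publishes the first commit.
git init -q "$tmp/alice"
git -C "$tmp/alice" symbolic-ref HEAD refs/heads/main
commit_in "$tmp/alice" "first commit"
git -C "$tmp/alice" remote add origin "$tmp/remote.git"
git -C "$tmp/alice" push -q origin main

# "bob" clones, then alice publishes more work.
git clone -q "$tmp/remote.git" "$tmp/bob"
commit_in "$tmp/alice" "second commit"
git -C "$tmp/alice" push -q origin main

# Step 1: fetch. This only updates bob's origin/main, not his own main.
git -C "$tmp/bob" fetch -q origin

# Step 2: look at what actually arrived.
incoming=$(git -C "$tmp/bob" rev-list --count HEAD..origin/main)
git -C "$tmp/bob" log --oneline HEAD..origin/main

# Step 3: only now choose the second command; here, a fast-forward merge.
git -C "$tmp/bob" merge -q --ff-only origin/main
remaining=$(git -C "$tmp/bob" rev-list --count HEAD..origin/main)

printf 'incoming before merge: %s, after merge: %s\n' "$incoming" "$remaining"
rm -rf "$tmp"
```

In a real repository, steps 1 and 2 are the useful part: once git fetch has finished, HEAD..origin/<branch> describes what you actually received, not what some earlier query predicted.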
The general idea here is atomicity. Some operations, like git fetch or git push, are "atomic": they cannot be broken into sub-operations, at least not without radically changing things. Other operations, like git pull, can. Using git remote show remote is, in effect, an attempt to go below the level of atomicity actually offered by Git. So I avoid that, and just run git fetch, which is the level at which Git really is atomic. (See also https://en.wikipedia.org/wiki/ACID)