Why are there duplicate C/C++ compilers (e.g. g++, gcc)?

Time:07-20

According to this answer, gcc and g++ have some functional differences. As confirmed by this script, the commands point to the exact same binary, effectively making them duplicates. Why is that so?

$ uname
Darwin
$ md5 `which cc c++ gcc g++ clang clang++`
fac4668657765c8dfe89d8995acfb5a2  /usr/bin/cc
fac4668657765c8dfe89d8995acfb5a2  /usr/bin/c++
fac4668657765c8dfe89d8995acfb5a2  /usr/bin/gcc
fac4668657765c8dfe89d8995acfb5a2  /usr/bin/g++
fac4668657765c8dfe89d8995acfb5a2  /usr/bin/clang
fac4668657765c8dfe89d8995acfb5a2  /usr/bin/clang++

CodePudding user response:

The executable can determine the name it was called with by inspecting the first (or zeroth) command line argument passed to it. By convention this is the name of the executable, and it is passed by whatever program invokes the compiler (typically a shell).

Although it is the same executable, it can then take different actions based on whether that value is gcc or g++.

Also, the files you are seeing are unlikely to be duplicate files. They are most likely just (soft or hard) links to the same file.


As for clang/clang++ and gcc/g++ appearing to be the same even though they are completely different compilers: that is an Apple quirk. They link gcc and g++ to clang and clang++ for some reason, so in reality all of them refer to Apple clang, which is also different from upstream clang. It often causes confusion (at least for me).

CodePudding user response:

the commands point to the exact same binary, effectively making them duplicates

Yes, they're intended to be the same, and they are in fact a single file on disk, since they all share the same inode number:

$ which cc c++ gcc g++ clang clang++ | xargs ls -li
1152921500312779808 -rwxr-xr-x  76 root  wheel  167120 May 10 04:30 /usr/bin/c++
1152921500312779808 -rwxr-xr-x  76 root  wheel  167120 May 10 04:30 /usr/bin/cc
1152921500312779808 -rwxr-xr-x  76 root  wheel  167120 May 10 04:30 /usr/bin/clang
1152921500312779808 -rwxr-xr-x  76 root  wheel  167120 May 10 04:30 /usr/bin/clang++
1152921500312779808 -rwxr-xr-x  76 root  wheel  167120 May 10 04:30 /usr/bin/g++
1152921500312779808 -rwxr-xr-x  76 root  wheel  167120 May 10 04:30 /usr/bin/gcc

On a typical *nix system, symlinks are usually used for this instead of hard links:

$ ls -al /usr/bin | grep 'vim'
lrwxr-xr-x     1 root   wheel         3 May 10 04:30 ex -> vim
lrwxr-xr-x     1 root   wheel         3 May 10 04:30 rview -> vim
lrwxr-xr-x     1 root   wheel         3 May 10 04:30 rvim -> vim
lrwxr-xr-x     1 root   wheel         3 May 10 04:30 vi -> vim
lrwxr-xr-x     1 root   wheel         3 May 10 04:30 view -> vim
-rwxr-xr-x     1 root   wheel   5056496 May 10 04:30 vim
lrwxr-xr-x     1 root   wheel         3 May 10 04:30 vimdiff -> vim
-rwxr-xr-x     1 root   wheel      2154 May 10 04:30 vimtutor

That said, in every case the program can easily determine which command was run, whether that name is the real executable file, a hard link or a symlink, by checking argv[0] in a typical C or C++ program, or $0 in bash. This is extremely common. One notable use is BusyBox, where almost all POSIX utilities live in a single busybox binary, and anything you run, such as ls, mv, test or rm, eventually runs busybox.


But why are gcc and clang the same binary? It's another weird thing in macOS: Apple stopped shipping gcc more than a decade ago due to licensing issues. They effectively lie by presenting clang as gcc, and the only way you can tell is by running gcc --version
