Solved: How linker allow multiple definitions of a function template in different object files but o

Time:03-22

I know how to use the inline keyword to avoid 'multiple definition' errors when using C++ templates. What I am curious about is how the linker tells the two cases apart: how does it know that one specialization is a full specialization that violates the ODR and must be reported as an error, while another is an implicit specialization that it can handle correctly?

From the nm output, we can see duplicated definitions in main.o and other.o for both the int version and the char version of max(), but the linker only reports a 'multiple definition' error for the char version and lets the int version link successfully. How does the linker differentiate them?

// tmplhdr.hpp
#include <iostream>

// this function is instantiated in both main.o and other.o,
// but causes no 'multiple definition' error from the linker
template<typename T>
T max(T a, T b)
{
    std::cout << "match generic\n";
    return (b<a)?a:b;
}

// 'multiple definition' link error if inline is removed
template<>
inline char max(char a, char b)
{
    std::cout << "match full specialization\n";
    return (b<a)?a:b;
}

// main.cpp
#include "tmplhdr.hpp"

extern int mymax(int, int);

int main()
{
    std::cout << max(1,2) << std::endl;
    std::cout << mymax(10,20) << std::endl;
    std::cout << max('a','b') << std::endl;
    return 0;
}

// other.cpp
#include "tmplhdr.hpp"

int mymax(int a, int b)
{
    return max(a, b);
}

The test output on Ubuntu is reasonable, but the output on Cygwin is rather strange and confusing.

==== Test on Cygwin ====

The g++ linker only reported that 'char max(char, char)' is duplicated.

$ g++ -o main.exe main.cpp other.cpp
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld: 
/tmp/ccYivs3O.o:other.cpp:(.text$_Z3maxIcET_S0_S0_[_Z3maxIcET_S0_S0_]+0x0): 
multiple definition of `char max<char>(char, char)'; 
/tmp/cc7HJqbS.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status

I dumped my .o object files but did not find many clues (maybe because I am not very familiar with the object file format).

$ nm main.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <-- implicit specialization
                 U mymax(int, int)
$ nm other.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
000000000000009b t _GLOBAL__sub_I__Z5mymaxii
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <-- implicit specialization
0000000000000000 T mymax(int, int)

==== Test on Ubuntu ====

This is what I got on Ubuntu with g++-9 after removing inline from tmplhdr.hpp:

tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ g++ -o main main.o other.o
/usr/bin/ld: other.o: in function `char max<char>(char, char)':
other.cpp:(.text+0x0): multiple definition of `char max<char>(char, char)'; main.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status

The char version of max() is marked with T, which does not allow multiple definitions, while the int version of max() is marked with W, which does. However, this makes me curious: why does nm give different marks on Cygwin than on Ubuntu, and why can the linker on Cygwin handle two T definitions correctly?

tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm main.o | grep max | c++filt
0000000000000133 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
                 U mymax(int, int)
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm other.o | grep max | c++filt
00000000000000d7 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
000000000000003e T mymax(int, int)

CodePudding user response:

"However, this makes me curious: why does nm give different marks on Cygwin than on Ubuntu, and why can the linker on Cygwin handle two T definitions correctly?"

You need to understand that the nm output does not give you the full picture.

nm is part of binutils, and uses libbfd. The way this works is that various object file formats are parsed into libbfd-internal representation, and then tools like nm print that internal representation in human-readable format.

Some things get "lost in translation". This is the reason you should almost never use tools such as objdump to look at ELF files (at least not at the symbol table of the ELF files).

As you correctly deduced, the reason multiple max<int>() symbols are allowed on Linux is that the compiler emits them as a W (weakly defined) symbol.
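
To make the T versus W distinction concrete, here is a minimal sketch, assuming GCC or Clang targeting ELF (for example on Ubuntu). The file name and the functions strong_answer, weak_answer, larger, and weak_demo_check are made up for illustration, and __attribute__((weak)) is a compiler extension rather than standard C++.

```cpp
// weak_demo.cpp (hypothetical file name)
// Assumption: GCC or Clang on an ELF target; __attribute__((weak)) is an extension.
#include <cassert>

// Ordinary (strong) definition: nm lists it as 'T'. A second strong
// definition in another object file is a "multiple definition" link error.
int strong_answer() { return 1; }

// Weak definition: nm lists it as 'W'. If another object file defines the
// same symbol weakly, the linker keeps one copy and discards the rest.
__attribute__((weak)) int weak_answer() { return 2; }

// Function template: each translation unit that uses larger<int> gets its
// own implicit instantiation, which the compiler emits as a weak symbol.
template <typename T>
T larger(T a, T b) { return (b < a) ? a : b; }

// Forces an implicit instantiation of larger<int> in this translation unit.
int use_larger() { return larger(3, 4); }

// Small sanity check that can be called from main().
void weak_demo_check() {
    assert(strong_answer() == 1);
    assert(weak_answer() == 2);
    assert(use_larger() == 4);
}
```

Compiling this with `g++ -c weak_demo.cpp` (without optimization, so the instantiation is not inlined away) and running `nm -C weak_demo.o` should show T next to strong_answer() and W next to both weak_answer() and int larger<int>(int, int).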

The same is true for Windows, except that Windows uses the older COFF format, which doesn't have weak symbols. Instead, the symbol is emitted into a special .linkonce.$name section, and the linker knows that it can select any such section into the link, but should do so only once (i.e. it knows to discard all duplicates of that section in any other object file).
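
This duplicate-discarding treatment applies to implicit instantiations and to inline functions. A full specialization that is not declared inline is an ordinary function definition, so it gets a regular strong symbol in every object file that includes the header. The sketch below shows two common ways to keep such a specialization in a project without a link error. It is only an illustration: the name pick, the bool specialization, and pick_check are hypothetical, and the functions return strings instead of printing so the result is easy to check.

```cpp
// tmplhdr_sketch.hpp (hypothetical name, modeled on tmplhdr.hpp from the question)
#include <cassert>
#include <string>

// Generic version: implicit instantiations may appear in many object files,
// and the linker keeps just one of them.
template <typename T>
std::string pick(T, T) { return "generic"; }

// Option 1: mark the full specialization inline, so it is treated like the
// implicit instantiations and duplicates across object files are merged.
template <>
inline std::string pick<char>(char, char) { return "full specialization"; }

// Option 2: only declare the full specialization in the header...
template <>
std::string pick<bool>(bool, bool);

// ...and define it in exactly one .cpp file. It is shown in the same file
// here only to keep the sketch self-contained.
template <>
std::string pick<bool>(bool, bool) { return "defined once"; }

// Small sanity check that can be called from main().
void pick_check() {
    assert(pick(1, 2) == "generic");
    assert(pick('a', 'b') == "full specialization");
    assert(pick(true, false) == "defined once");
}
```

With option 2, the declaration in the header must be visible before the first call that would otherwise trigger an implicit instantiation for that type.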
