I've been looking into building cross-toolchains and have a general question about the compilation and workings of gcc.
The question is about this excerpt from the official gcc documentation:
In order to build GCC, the C standard library and headers must be present for all target variants for which target libraries will be built (and not only the variant of the host C compiler).
Why is the target's standard library required to build the (cross) compiler itself? Shouldn't the (cross) compiler running on the host only require the host's standard library to be built and then be able to compile the target's standard library?
I also found this in crosstool-NG's "How a toolchain is constructed" documentation:
the final compiler needs the C library, to know how to use it, but: building the C library requires a compiler
This is consistent with what's stated above, but I don't get why the final compiler needs to be built against a prebuilt target C library just to know how to use it later on. What does the compiler running on the host need to know about the target C library? Isn't it the linker's job to link target programs against the target's standard library at link time?
Answer:
Because that's the only way to ensure a working compiler for the target platform is created. There's no point in creating a non-working compiler, distributing it to the target platform, and finding out then that it's useless.
In general, a non-shared-object executable file is only successfully created if there are no unresolved symbols.
Per the GCC 11.2 "Overall Options" documentation:
Compilation can involve up to four stages: preprocessing, compilation proper, assembly and linking, always in that order. GCC is capable of preprocessing and compiling several files either into several assembler input files, or into one assembler input file; then each assembler input file produces an object file, and linking combines all the object files (those newly compiled, and those specified as input) into an executable file.
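The four stages quoted above map onto separate gcc invocations. Here is a minimal sketch in Python that only builds the command lines and does not run them. The helper name stage_commands and the compiler name arm-linux-gnueabihf-gcc are just examples chosen for illustration; -E, -S, -c and -o are the standard GCC options for stopping after each stage.

```python
from pathlib import PurePosixPath


def stage_commands(cc, source):
    """Return one argv list per GCC stage for a single C source file."""
    stem = PurePosixPath(source).stem
    return [
        [cc, "-E", source, "-o", f"{stem}.i"],       # preprocessing
        [cc, "-S", f"{stem}.i", "-o", f"{stem}.s"],  # compilation proper
        [cc, "-c", f"{stem}.s", "-o", f"{stem}.o"],  # assembly
        [cc, f"{stem}.o", "-o", stem],               # linking
    ]


if __name__ == "__main__":
    # "arm-linux-gnueabihf-gcc" is only an example cross-compiler name.
    for argv in stage_commands("arm-linux-gnueabihf-gcc", "hello.c"):
        print(" ".join(argv))
```

If hello.c includes <stdio.h>, the first command already needs the target's C headers, and the last one needs the target's startup files and C library.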
So the final step is linking. The GNU linker 'ld' man page states:
Normally the linker will generate an error message for each reported unresolved symbol but the option --warn-unresolved-symbols can change this to a warning.
and
--error-unresolved-symbols This restores the linker's default behaviour of generating errors when it is reporting unresolved symbols.
So, by default linking fails when there are unresolved symbols.
Why?
Because if there are unresolved symbols, the resulting executable file won't work when it's run.
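And the only way to ensure there are no unresolved symbols is to have all the necessary libraries from the target platform available when cross-compiling, so that all symbols can be resolved when the target libraries built alongside the new compiler, such as the shared libgcc and libstdc++, are being linked. The compiler executables themselves run on the host and are linked against the host's libraries.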
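To make the unresolved-symbol argument concrete, here is a toy model of symbol resolution in Python. It is a sketch under simplifying assumptions, not how ld actually works: each object or library is just a dict of defined and undefined symbol names, and the names used (runtime_obj, target_libc, helper) are hypothetical.

```python
def unresolved_symbols(objects, libraries):
    """Return the symbols that are referenced by objects but defined nowhere."""
    defined = set()
    for unit in objects + libraries:
        defined |= unit["defined"]
    referenced = set()
    for unit in objects:
        referenced |= unit["undefined"]
    return sorted(referenced - defined)


# Hypothetical object from a library built for the target; it calls into libc.
runtime_obj = {"defined": {"helper"}, "undefined": {"abort", "memcpy"}}
# Hypothetical stand-in for the target's C library.
target_libc = {"defined": {"abort", "memcpy", "printf"}, "undefined": set()}

if __name__ == "__main__":
    print(unresolved_symbols([runtime_obj], []))             # no target libc
    print(unresolved_symbols([runtime_obj], [target_libc]))  # with target libc
```

Without the target C library the list is non-empty, which is the situation in which a real linker reports errors by default.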
Answer:
I don't know the full answer, but the compiler certainly needs the target's C standard library in order to make correct decisions about how to link programs against it.
C library implementations differ from one target to another, and in order to link against one correctly, the compiler must know how its interface was implemented.
The target C library may be implemented in any language, usually assembly or C itself, but you still need information about it.
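One example of such information is the path of the dynamic linker, which differs between C libraries and is passed to ld with its -dynamic-linker option. The Python sketch below is only an illustration: the table covers x86_64 Linux with glibc and musl, and the names DYNAMIC_LINKER and dynamic_linker_args are made up for this example.

```python
# Dynamic linker paths for x86_64 Linux; other targets use other paths.
DYNAMIC_LINKER = {
    "glibc": "/lib64/ld-linux-x86-64.so.2",
    "musl": "/lib/ld-musl-x86_64.so.1",
}


def dynamic_linker_args(libc):
    """Return the ld arguments that select the dynamic linker for a libc."""
    return ["-dynamic-linker", DYNAMIC_LINKER[libc]]


if __name__ == "__main__":
    for libc in ("glibc", "musl"):
        print(libc, " ".join(dynamic_linker_args(libc)))
```

A program linked with the wrong path would build but fail to start on the target, so this is something that has to be known for the specific C library.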
There may also be other reasons I did not think of.