What calling convention does printf() in C use?

Time:05-14

So I've been practicing writing simple subroutines in FASMW using the CDECL and STDCALL calling conventions, and it got me wondering which one the printf function in C uses.

Also, a definition of the function in x86 32 bit Assembly would be great. If that's not too much to ask.

CodePudding user response:

Per C 2018 7.1.4 paragraph 2 and 7.21.6.3 paragraph 1, printf must work if a program declares it as int printf(const char * restrict format, ...);, and therefore printf must work with the default calling convention of the C implementation.
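As a small illustration of that point, here is a sketch that declares printf by hand rather than including <stdio.h>. The helper name say_hello is made up for this example; the only assumption is a hosted C99-or-later implementation.

```c
/* Declaring printf by hand, using the prototype from C 2018 7.21.6.3,
   instead of including <stdio.h>. The compiler therefore has to call it
   with the implementation's default convention for variadic functions. */
int printf(const char * restrict format, ...);

/* say_hello is a hypothetical helper for this sketch. It returns what
   printf returns: the number of characters written, or a negative
   value if an output error occurred. */
int say_hello(void)
{
    return printf("hello\n");
}
```

Calling say_hello() from main should print hello followed by a newline, with no header involved.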

It is possible that a C implementation could provide multiple implementations of printf, so that #include <stdio.h> provides an alternate declaration of a printf function or macro that designates a second implementation of printf. However, the primary implementation described above must be provided.

"Also, a definition of the function in x86 32 bit Assembly would be great."

It is unlikely you will find a high-quality modern implementation of printf in assembly language, except as obtained by compiling implementations in C and/or other languages. It is actually unlikely you will find a definition in a single routine; printf is a complicated routine with many subparts and typically has its implementation dispersed across multiple routines and source files. This answer and this one have links to some implementations. The former includes a link to the GNU C Library implementation of vprintf (or the entry point for it), a core part of printf.
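To show the kind of layering described above, here is a minimal sketch of a printf-style entry point that does nothing but forward its arguments to a v* routine. The name my_snprintf is hypothetical; vsnprintf is the standard function, and real libraries structure this in their own ways.

```c
#include <stdarg.h>
#include <stdio.h>

/* my_snprintf is a hypothetical wrapper for this sketch: the variadic
   entry point only collects its arguments into a va_list and hands
   them to the standard vsnprintf, which does the actual formatting. */
int my_snprintf(char *buf, size_t size, const char *format, ...)
{
    va_list args;
    int n;

    va_start(args, format);
    n = vsnprintf(buf, size, format, args);
    va_end(args);
    return n;   /* length the full output needs, not counting the 0 byte */
}
```

The wrapper itself is only a few lines; the complexity lives in the routine it forwards to.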

CodePudding user response:

On 32-bit x86, real-world C libraries use CDECL (caller-pops) for printf, because a callee-pops convention like STDCALL is highly inconvenient for variadic functions, and would make it impossible for printf to work correctly in some cases where ISO C requires it to.


ISO C says it's well-defined behaviour to pass extra args, like printf("%d\n", 1, 2, 3);.

printf must safely ignore them, and behave like printf("%d\n", 1);. This rules out callee-pops conventions like STDCALL. (STDCALL would be inconvenient anyway for any variadic function, because ret imm16, which adds to [er]SP after popping into [er]IP, takes only an immediate operand, not a register. So you'd have to pop the return address, copy it over the highest 4 or 8 bytes of args, and ret from there, even if you could accurately calculate where.)

Mainstream calling conventions don't separately pass the number of args (or their size on the stack in bytes) to variadic functions, or use any kind of sentinel value, so there's no way an implementation of printf could find out how many args were actually passed. The args have to match the format string for as many args as the format string references, but there's no requirement not to pass args beyond that.
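A toy variadic function makes this concrete. In the sketch below, sum_ints is a hypothetical function (not part of any library) that reads one int for every %d in its format string. Like printf, it has no way to learn how many arguments were really passed, so anything beyond what the format references is simply never looked at.

```c
#include <stdarg.h>

/* sum_ints is a hypothetical toy for this sketch: it adds up one int
   argument for each "%d" found in the format string. The format
   string is its only information about the argument list. */
int sum_ints(const char *format, ...)
{
    va_list args;
    const char *p;
    int total = 0;

    va_start(args, format);
    for (p = format; *p != '\0'; p++) {
        if (p[0] == '%' && p[1] == 'd') {
            total += va_arg(args, int);  /* consume exactly one int */
            p++;                         /* skip over the 'd' */
        }
    }
    va_end(args);
    return total;
}
```

Calling sum_ints("%d", 1, 2, 3) returns 1: the 2 and 3 are passed but never read. With a caller-pops convention that is harmless, because the caller, which knows it passed three ints, is the one that cleans them up.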

That's why Windows C ABIs / implementations always use a convention like CDECL for variadic functions, even if they default to STDCALL for functions with fixed numbers of args. (32-bit FASTCALL is also callee-pops; Windows x64 is not, and current MS documentation sometimes calls it x64 fastcall or "a fastcall convention".)


"Also, a definition of the function in x86 32 bit Assembly would be great."

"I'm doing this for educational purposes."

Since you're doing this to learn about asm, not necessarily to create a C implementation: printf is a pretty complicated API to implement, and probably not a good choice for pure assembly projects. (Or arguably for any modern design that doesn't have to actually be ISO C; parsing a text format string and going through a variable-length list of arguments has major downsides for simplicity. C++ people argue that a separate function call for each object you want to output is much better for type safety and stuff.)

It's usually easier to deal with individual type -> string functions, like a print_int (decimal) vs. print_int_hex vs. print_double (very complicated on its own actually) vs. print_c_string (0 terminated) vs. print_buffer (pointer, length).
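As a rough idea of how small one of these single-purpose functions can be, here is a C sketch of the conversion step behind something like print_int for unsigned values. The name format_uint and its interface are assumptions made for this example; the loop is the same divide-by-10 loop you would write in asm.

```c
#include <stddef.h>

/* format_uint is a hypothetical helper for this sketch: it writes the
   decimal digits of value into buf (without a terminating 0) and
   returns how many digits it wrote. buf must have room for every
   digit, e.g. 10 bytes when unsigned is 32 bits wide. */
size_t format_uint(char *buf, unsigned value)
{
    char tmp[3 * sizeof value];   /* more than enough decimal digits */
    size_t n = 0;
    size_t i;

    do {                          /* digits come out least significant first */
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    for (i = 0; i < n; i++)       /* reverse them into the caller's buffer */
        buf[i] = tmp[n - 1 - i];
    return n;
}
```

The (buf, length) pair it produces is exactly what a print_buffer (pointer, length) style function would take.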

As a toy project, don't aim too high with your I/O formatting functions.

Provide some simple usable ones at first, that are easy to call from asm. Irvine32 with its WriteDec (unsigned) vs. WriteInt (signed) vs. WriteString is one decent example of a set of output functions for toy programs. Irvine32 notably uses a custom calling convention where all registers are call-preserved (training wheels mode), and the arg is in EAX or EDX (this is very good; stack args are dumb especially for functions that only take one)

Another very similar example is the MARS system-calls for that MIPS simulator. Some of them are poorly designed (or intentionally inconvenient for students?), like its read-string not returning the length in the return-value register, just leaving the characters in the pointed-to buffer with a terminating 0 byte (as a C string). So if you want to know how many you read, you have to loop over them looking for the first 0, i.e. strlen.

These toy APIs don't have cursor movement, input without echo, or any of the other things that make real terminal and keyboard handling way more complicated. Nor do they have any way to specify formatting like printf's %020d to pad with leading zeros out to 20 digits long.
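For comparison, this is what that kind of formatting request looks like through C stdio, where padding an int with leading zeros to a width of 20 is spelled %020d. The wrapper name format_padded is made up for this sketch; snprintf is the standard function.

```c
#include <stdio.h>

/* format_padded is a hypothetical helper for this sketch: it formats
   value as decimal, padded with leading zeros to a minimum width of
   20 characters, and returns what snprintf returns. */
int format_padded(char *buf, size_t size, int value)
{
    return snprintf(buf, size, "%020d", value);
}
```

Given a large enough buffer and the value 42, this produces 18 zeros followed by 42. A toy API with only WriteDec-style calls has no equivalent; you would write the padding loop yourself.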

If you want to write your own input/output functions, you can think about whether you want them to be able to mix easily with code that directly uses lower-level functions, or whether they do their own buffering like C stdio, and should be treated as an opaque I/O layer so programs shouldn't use them and lower-level OS system calls at the same time.

Depending on how sophisticated you want them to be, your functions could take args to specify width limits, or you could handle that on a case-by-case basis, customized for the project. (After all, if you wanted maintainability and easy code reuse, you wouldn't choose asm in the first place. So just implement the I/O details at the place that's doing it, instead of building a flexible mechanism for callers to request any kind of formatting.)
