#include <stdio.h>

int add(int a, int b)
{
    if (a > b)
        return a * b;
}

int main(void)
{
    printf("%d", add(3, 7));
    return 0;
}
Output:
3
In the above code, I am calling the function inside the printf call. In the function, the if condition is not true, so its body won't execute. Then why am I getting 3 as output? I tried changing the first parameter to some other value, but it prints the same value when the if condition is not satisfied.
CodePudding user response:
What happens here is called undefined behaviour.
When a <= b, you don't return any value (and your compiler probably told you so). But if you use the return value of the function anyway, even though the function didn't return anything, that value is garbage. In your case it is 3, but with another compiler or with other compiler flags it could be something else.
If your compiler didn't warn you, add the corresponding compiler flags. If your compiler is gcc or clang, use the -Wall compiler flag.
CodePudding user response:
Jabberwocky is right: this is undefined behavior. You should turn your compiler warnings on and listen to them.
However, I think it can still be interesting to see what the compiler was thinking. And we have a tool to do just that: Godbolt Compiler Explorer.
We can plug your C program into Godbolt and see what assembly instructions it outputs. Here's the direct Godbolt link, and here's the assembly that it produces.
add:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     DWORD PTR [rbp-8], esi
        mov     eax, DWORD PTR [rbp-4]
        cmp     eax, DWORD PTR [rbp-8]
        jle     .L2
        mov     eax, DWORD PTR [rbp-4]
        imul    eax, DWORD PTR [rbp-8]
        jmp     .L1
.L2:
.L1:
        pop     rbp
        ret
.LC0:
        .string "%d"
main:
        push    rbp
        mov     rbp, rsp
        mov     esi, 7
        mov     edi, 3
        call    add
        mov     esi, eax
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 0
        call    printf
        mov     eax, 0
        pop     rbp
        ret
Again, to be perfectly clear, what you've done is undefined behavior. With different compiler flags or a different compiler version or even just a compiler that happens to feel like doing things differently on a particular day, you will get different behavior. What I'm studying here is the assembly output by gcc 12.2 on Godbolt with optimizations disabled, and I am not representing this as standard or well-defined behavior.
The compiler here is using the System V AMD64 calling convention, common on Linux machines. In System V, the first two integer or pointer arguments are passed in the rdi and rsi registers, and integer values are returned in rax. Since everything we work with here is either an int or a char*, this is good enough for us. Note that the compiler only uses edi, esi, and eax, the lower 32-bit halves of these registers (an int is 32 bits on this platform, and the address of the string literal happens to fit in 32 bits as well), so I'll start using edi, esi, and eax from this point on.
Our main function works fine. It does everything we'd expect. Our two function calls are here.
        mov     esi, 7
        mov     edi, 3
        call    add
        mov     esi, eax
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 0
        call    printf
To call add, we put 3 in the edi register and 7 in the esi register and then we make the call. We get the return value back from add in eax, and we move it to esi (since it will be the second argument to printf). We put the address of the static memory containing "%d" in edi (the first argument), and then we call printf. This is all normal. main knows that add was declared to return an integer, so it has the right to assume that, after calling add, there will be something useful in eax.
Now let's look at add.
add:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     DWORD PTR [rbp-8], esi
        mov     eax, DWORD PTR [rbp-4]
        cmp     eax, DWORD PTR [rbp-8]
        jle     .L2
        mov     eax, DWORD PTR [rbp-4]
        imul    eax, DWORD PTR [rbp-8]
        jmp     .L1
.L2:
.L1:
        pop     rbp
        ret
The rbp and rsp shenanigans are standard function call fare and aren't specific to add. First, we store our two arguments on the call stack as local variables. Now here's where the undefined behavior comes in. Remember that I said eax holds the return value of our function. Whatever happens to be in eax when the function returns is the returned value.
We want to compare a and b. To do that, we need a to be in a register (most x86 instructions, cmp included, can't take two memory operands, so one side of the comparison has to be in a register). So we load a into eax. Then we compare the value in eax to the value of b on the call stack. If a > b, then the jle does nothing. We go down to the next two lines, which are the inside of your if statement. They correctly set eax to a * b, and then we jump to the end of the function and return that value.
However, if a <= b, then the jle instruction jumps to the end of the function without doing anything else to eax. Since the last thing in eax happened to be a (because we happened to use eax as our comparison register in cmp), that's what gets returned from our function.
But this really is just an accident. It's what the compiler happened to have put in that register previously. If I turn optimizations up (with -O3), then gcc inlines the whole function call and ends up printing out 0 rather than a. I don't know exactly what sequence of optimizations led to this result, but since they started out by hinging on undefined behavior, the compiler is free to make whatever assumptions it chooses.