I'm planning to participate to some of the Capture the flags (CTF) challenges, in the near future. For that reason, I've decided to study assembly. As of now I'm focusing on the usage of the CPU registers. Following some examples that I have found on internet, I tried to debug a very simple "Hello World" program written in C, to see how the CPU registers are used. My environment is Linux and GCC version 11. I compiled my code with the -g flag, in order to include debug symbols.
Following is my very simple C source code:
#include <iostream>
int main (int argc, char** argv)
{
char message_c_str[] = "Hello World from C!";
printf("%s\n", message_c_str);
return 0;
}
Studying the disassembly of the main function, I understand that the string containing the message gets stored inside the RAX (and RDX registers?), before calling the printf function:
└─$ objdump -M intel -D main| grep -A20 main.:
0000000000001159 <main>:
1159: 55 push rbp
115a: 48 89 e5 mov rbp,rsp
115d: 48 83 ec 30 sub rsp,0x30
1161: 89 7d dc mov DWORD PTR [rbp-0x24],edi
1164: 48 89 75 d0 mov QWORD PTR [rbp-0x30],rsi
1168: 48 b8 48 65 6c 6c 6f movabs rax,0x6f57206f6c6c6548
116f: 20 57 6f
1172: 48 ba 72 6c 64 20 66 movabs rdx,0x6d6f726620646c72
1179: 72 6f 6d
117c: 48 89 45 e0 mov QWORD PTR [rbp-0x20],rax
1180: 48 89 55 e8 mov QWORD PTR [rbp-0x18],rdx
1184: c7 45 f0 20 43 21 00 mov DWORD PTR [rbp-0x10],0x214320
118b: 48 8d 45 e0 lea rax,[rbp-0x20]
118f: 48 89 c7 mov rdi,rax
1192: e8 b9 fe ff ff call 1050 <puts@plt>
1197: b8 00 00 00 00 mov eax,0x0
119c: c9 leave
119d: c3 ret
I thought to start a debug session and try to change the RAX on the fly, just for the sake of seeing if I was able to change the string content before printing it on the command line. Unfortunately, even though it seems that I can change the RAX value, the program still prints the hard coded message. So, I'm not sure why I cannot change it. Am I missing to run any gdb command after updating the value of RAX?
Following is my debug session with the issue:
┌──(alexis㉿kali)-[~/Desktop/Hacking/hello_world]
└─$ gdb -q main
Reading symbols from main...
(gdb) break main
Breakpoint 1 at 0x1168: file /home/alexis/Desktop/Hacking/hello_world/main.cpp, line 5.
(gdb) run
Starting program: /home/alexis/Desktop/Hacking/hello_world/main
Breakpoint 1, main (argc=1, argv=0x7fffffffdf58) at
/home/alexis/Desktop/Hacking/hello_world/main.cpp:5
5 char message_c_str[] = "Hello World from C!";
(gdb) info register rax
rax 0x555555555159 93824992235865
(gdb) next
6 printf("%s\n", message_c_str);
(gdb) info register rax
rax 0x6f57206f6c6c6548 8022916924116329800
(gdb) set $rax=0x6361636361
(gdb) info register rax
rax 0x6361636361 426835665761
(gdb) next
Hello World from C!
8 return 0;
(gdb)
You can see that the code still prints "Hello World from C!", even if the RAX register changed. Why?
CodePudding user response:
The string is only temporarily in rax
rdx
. In the following lines it is placed on the stack and the address goes to rdi
, that's used by puts
.
What's important here to understand is that on line of source code is translated to multiple lines of assembly. When you change the rax
on line printf("%s\n", message_c_str);
the string it's already pushed on the stack and rax
only keeps an old value as it wasn't overwritten by anything. It is no longer the string that's being printed.
To accomplish your goal you would have to change the string on the stack or change it in rax
before it's being pushed onto it (so before your next
command).
Also be aware that next
advances one source code line, if you want to move assembly instruction use nexti
- with that you have more control what's get's executed.
CodePudding user response:
You're using next
(whole block of asm corresponding to a C source line), not nexti
or stepi
(aka ni
or si
) to step by asm instruction.
And you made a debug build so GCC doesn't keep anything in registers across C statements. The points where execution stops with next
are the ones where the compiler-generated instructions are about to load or LEA a new RAX, so its current value is dead and doesn't matter.
(And it's only using RAX at all because it's a debug build with GCC; otherwise things like lea rax,[rbp-0x20]
/ mov rdi,rax
would LEA straight into RDI, instead of uselessly using RAX as a temporary. Return value from writing an unused parameter when falling off the end of a non-void function Or for mov-immediate to memory, there's no mov r/m64, imm64, only to register, so those moves to RAX and RDX do make sense.)
If you wanted to have it print something different, you could si
until after movabs rax,0x6f57206f6c6c6548
but before mov QWORD PTR [rbp-0x20],rax
, and at that point change the initializer for part of the string data. (Which is in RAX at that point.) e.g. introducing a 0x00
byte will terminate the C string.
Or right before the call puts
, you could set $rdi = $rdi 5
to be like puts(message_c_str 5)
.
layout reg
or layout asm
(use layout next
/ prev
to fix the display if its broken) are helpful for seeing where execution is. See other GDB asm tips at the bottom of https://stackoverflow.com/tags/x86/info