I am wondering whether char arrays get duplicated after fork() is called in C. For example, is the output of the program below:
The message in child is secret message
The message in parent is secret message
or
The message in child is secret message
The message in parent is empty message
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
    char msg[50] = "empty message";
    pid_t pid = fork();
    if (pid > 0) {
        wait(0);
        printf("The message in parent is %s\n", msg);
    } else {
        strcpy(msg, "secret message");
        printf("The message in child is %s\n", msg);
    }
    return 0;
}
CodePudding user response:
I want to answer you with a fun experiment. Here's a similar but simpler version of your code:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
    int x = 10;
    int *px = &x;
    pid_t pid = fork();
    if (pid > 0) {
        wait(0);
        printf("x in parent is %d\n", *px);
    } else {
        *px = 20;
        printf("x in child is %d\n", *px);
    }
    return 0;
}
Here I used an integer, modified by the child process, to keep the assembly simple. If we compile and look at the generated assembly (I used an ARM compiler, which I am more comfortable with), we can see that the value 10 is stored on the stack at sp+16, while the address of x is stored at sp+24:
mov w0, 10 // x = 10
str w0, [sp, 16]
add x0, sp, 16 // px = &x
str x0, [sp, 24]
Then, in the child branch, we modify the value of x by retrieving its address from the stack and writing a new value at that address:
ldr x0, [sp, 24] // *px = 20;
mov w1, 20
str w1, [x0]
ldr x0, [sp, 24] // Load value of x and print it with message
ldr w0, [x0]
mov w1, w0
adrp x0, .LC1
add x0, x0, :lo12:.LC1
bl printf
So the value at sp+16 is now changed, right? Then how is it possible that when the parent code is executed:
ldr x0, [sp, 24] // Load value of x and print it with message
ldr w0, [x0]
mov w1, w0
adrp x0, .LC0
add x0, x0, :lo12:.LC0
bl printf
b .L3
We are loading from the very same address, but we still get a value of 10?
x in child is 20
x in parent is 10
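The answer, as pointed out in the comments, is that each process has its own address space, which contains the stack, the heap, the mapped pages, and so on. On fork(), the child receives a copy of the parent's address space, so the virtual addresses are identical in both processes, which is why we can load x from the very same address in both cases. The memory behind those addresses is no longer shared, though (systems typically implement this with copy-on-write): a write in the child changes only the child's copy. The same holds for your char array, so your program prints the second output, and the parent still sees "empty message".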