Do char arrays get duplicated in C after fork()?

Time:10-24

I am wondering whether char arrays get duplicated after fork() is called in C. For example, does the program below print:

The message in child is secret message
The message in parent is secret message

or

The message in child is secret message
The message in parent is empty message

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
    char msg[50] = "empty message";
    pid_t pid = fork();
    if(pid > 0) {
        wait(0);
        printf("The message in parent is %s\n", msg);
    } else {
        strcpy(msg, "secret message");
        printf("The message in child is %s\n", msg);
    }

    return 0;
}

CodePudding user response:

I want to answer you with a fun experiment. Here's a similar but easier version of your code:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
    int x = 10;
    int *px = &x;
    pid_t pid = fork();
    if(pid > 0) {
        wait(0);
        printf("x in parent is %d\n", *px);
    } else {
        *px = 20;
        printf("x in child is %d\n", *px);
    }

    return 0;
}

I used an integer instead of a char array so that the generated assembly is simpler. If we compile and stop at the generated assembly (I used an ARM compiler, which I am more comfortable with), we can see that the value 10 is stored on the stack at sp+16, while the address of x is stored at sp+24:

        mov     w0, 10          // x = 10
        str     w0, [sp, 16]    
        add     x0, sp, 16      // px = &x
        str     x0, [sp, 24]

Then, in the child branch, we modify the value of x by retrieving its address from the stack and writing a new value at that address.

        ldr     x0, [sp, 24]   // *px = 20;
        mov     w1, 20
        str     w1, [x0]

        ldr     x0, [sp, 24]   // Load value of x and print it with message
        ldr     w0, [x0]
        mov     w1, w0
        adrp    x0, .LC1
        add     x0, x0, :lo12:.LC1
        bl      printf

So the value at sp+16 (the location that px points to) has now changed, right? Then how is it possible that when the parent code is executed:

        ldr     x0, [sp, 24]  // Load value of x and print it with message
        ldr     w0, [x0]
        mov     w1, w0
        adrp    x0, .LC0
        add     x0, x0, :lo12:.LC0
        bl      printf
        b       .L3

We are loading from the very same address, but we still get a value of 10?

x in child is 20
x in parent is 10

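The answer, as pointed out in the comments, is that each process has its own address space, which contains the stack, the heap, the mapped pages and so on. fork() gives the child a copy of the parent's address space, so immediately after the fork both processes hold the same contents at the same virtual addresses, which is why we can load x from the very same address in both cases. The two address spaces are separate, though: the child's store only changes the child's copy, and the parent still sees 10. The same applies to your char array, so your program prints "The message in child is secret message" followed by "The message in parent is empty message".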
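
If you want to check this without reading assembly, here is a minimal sketch, assuming a POSIX system. The helper compare_views() and the struct child_report are hypothetical names I made up for illustration; they are not part of the code above. The child sends the address and value of its x back through a pipe, and the parent compares them with its own.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record of what the child observed, sent back over a pipe. */
struct child_report {
    uintptr_t addr;   /* virtual address of x as seen by the child */
    int value;        /* value of x in the child after its write */
};

/*
 * Hypothetical helper: forks, lets the child set x to 20, and compares the
 * child's view with the parent's. Returns 0 on success, -1 on failure.
 */
int compare_views(int *same_address, int *parent_value, int *child_value)
{
    assert(same_address && parent_value && child_value);

    int x = 10;
    int fds[2];
    if (pipe(fds) < 0) {
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: the write goes to the child's private copy of x. */
        close(fds[0]);
        x = 20;
        struct child_report rep = { (uintptr_t)&x, x };
        ssize_t n = write(fds[1], &rep, sizeof rep);
        _exit(n == (ssize_t)sizeof rep ? 0 : 1);
    }

    /* Parent: read the child's report, then reap the child. */
    close(fds[1]);
    struct child_report rep;
    ssize_t n = read(fds[0], &rep, sizeof rep);
    close(fds[0]);
    if (waitpid(pid, NULL, 0) < 0 || n != (ssize_t)sizeof rep) {
        return -1;
    }

    *same_address = (rep.addr == (uintptr_t)&x);
    *parent_value = x;          /* the parent's own copy of x */
    *child_value = rep.value;   /* what the child saw after its write */
    return 0;
}
```

Calling compare_views() from main and printing the three results should show that the address matches in both processes while the values differ, which is the same behaviour as in the assembly walkthrough.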
