Linux `getcontext` returns incorrect stack state

Time:10-11

I'm trying to use Linux's getcontext syscall to get the current context, particularly the current stack. I understand that there can be different contexts, especially where signal handlers are concerned. However, in a simple program with main and a few functions, what does the ucontext_t's stack point to?

Here is a simple example I ran:

#define _GNU_SOURCE
#include <inttypes.h>
#include <sys/types.h>
#include <unistd.h>
#include <ucontext.h>

// Needed by print_map() and for printf()
#include <fcntl.h>
#include <stdio.h>


void print_ucstack(stack_t *ustk) {
  // ss_size is a size_t, so print it with %zx
  printf("ss_sp = %p, ss_size = 0x%zx, ss_flags = %x\n", ustk->ss_sp, ustk->ss_size, ustk->ss_flags);
}

int print_context() {

  ucontext_t ctx;
  if(getcontext(&ctx) != 0)
    return -1;

  greg_t *regs = ctx.uc_mcontext.gregs;
  printf("Actual stack pointer = %llx\n", (unsigned long long)regs[REG_RSP]);
  print_ucstack(&ctx.uc_stack);

  return 0;
}

void print_map() {
  char filebuf[4096];
  ssize_t nbytes;
  int fd = open("/proc/self/maps", O_RDONLY);
  if(fd == -1)
    return;
  // Stop on EOF (0) or on a read error (-1)
  while((nbytes = read(fd, filebuf, sizeof(filebuf))) > 0)
    printf("%.*s", (int)nbytes, filebuf);
  close(fd);
}

int main() {
  print_map();
  print_context();
}

In the above example, I print /proc/self/maps to get the correct process layout. Also, I print the real stack pointer from uc_mcontext. Finally, I print the stack pointer from uc_stack.ss_sp.

Here is an example output:

558e93a85000-558e93a86000 r--p 00000000 103:06 1449251                   /tmp/stackrelocate/stackrelocate
558e93a86000-558e93a87000 r-xp 00001000 103:06 1449251                   /tmp/stackrelocate/stackrelocate
558e93a87000-558e93a88000 r--p 00002000 103:06 1449251                   /tmp/stackrelocate/stackrelocate
558e93a88000-558e93a89000 r--p 00002000 103:06 1449251                   /tmp/stackrelocate/stackrelocate
558e93a89000-558e93a8a000 rw-p 00003000 103:06 1449251                   /tmp/stackrelocate/stackrelocate
7fcb74663000-7fcb74685000 r--p 00000000 103:06 1311717                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fcb74685000-7fcb747fd000 r-xp 00022000 103:06 1311717                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fcb747fd000-7fcb7484b000 r--p 0019a000 103:06 1311717                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fcb7484b000-7fcb7484f000 r--p 001e7000 103:06 1311717                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fcb7484f000-7fcb74851000 rw-p 001eb000 103:06 1311717                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7fcb74851000-7fcb74857000 rw-p 00000000 00:00 0 
7fcb7487c000-7fcb7487d000 r--p 00000000 103:06 1311305                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7fcb7487d000-7fcb748a0000 r-xp 00001000 103:06 1311305                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7fcb748a0000-7fcb748a8000 r--p 00024000 103:06 1311305                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7fcb748a9000-7fcb748aa000 r--p 0002c000 103:06 1311305                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7fcb748aa000-7fcb748ab000 rw-p 0002d000 103:06 1311305                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7fcb748ab000-7fcb748ac000 rw-p 00000000 00:00 0 
7ffdb8771000-7ffdb8792000 rw-p 00000000 00:00 0                          [stack]
7ffdb87fb000-7ffdb87fe000 r--p 00000000 00:00 0                          [vvar]
7ffdb87fe000-7ffdb87ff000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]
Actual stack pointer = 7ffdb8790260
ss_sp = 0x7fcb746682d0, ss_size = 0x7ffdb87904b0, ss_flags = 74855000

As you can see, the actual stack pointer is within the [stack] VMA, whereas uc_stack.ss_sp points into a read-only mapping of libc.so. The size and flags values are also nonsensical.

What is happening here?

  • Why is the uc_stack member not reflecting the correct values?
  • What do the values in uc_stack correspond to?
  • How can I get the correct limits (base, size) of the stack region with syscalls, without reading /proc/self/maps?

Answer:

  • Why is the uc_stack member not reflecting the correct values?

Because getcontext does not initialize uc_stack. It only initializes the bare minimum state required for setcontext later. If I'm reading the various implementations correctly (getcontext has to be coded in assembly language for each CPU architecture and ABI supported by the C library) they save only the registers defined by the ABI as call-preserved, plus the signal mask.

uc_stack exists for you to initialize, in between a call to getcontext and a call to makecontext.
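As a minimal sketch of that sequence, the caller fills in uc_stack and uc_link itself after getcontext and before makecontext. The names run_on_private_stack and coroutine_body and the 64 KiB stack size are illustrative assumptions, not taken from the question.

```c
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t caller_ctx, coroutine_ctx;
static volatile int coroutine_ran = 0;

/* Runs on the private stack set up below. */
static void coroutine_body(void) {
  coroutine_ran = 1;
  /* Returning from here resumes uc_link, i.e. caller_ctx. */
}

/* Returns 1 if coroutine_body ran, 0 if it did not, -1 on error. */
int run_on_private_stack(void) {
  const size_t stack_size = 64 * 1024; /* arbitrary size for this sketch */
  void *stack = malloc(stack_size);
  if (stack == NULL)
    return -1;

  /* getcontext saves registers and the signal mask; it leaves uc_stack alone. */
  if (getcontext(&coroutine_ctx) != 0) {
    free(stack);
    return -1;
  }

  /* The caller supplies the stack and the context to resume afterwards. */
  coroutine_ctx.uc_stack.ss_sp = stack;
  coroutine_ctx.uc_stack.ss_size = stack_size;
  coroutine_ctx.uc_stack.ss_flags = 0;
  coroutine_ctx.uc_link = &caller_ctx;

  makecontext(&coroutine_ctx, coroutine_body, 0);

  coroutine_ran = 0;
  if (swapcontext(&caller_ctx, &coroutine_ctx) != 0) {
    free(stack);
    return -1;
  }

  /* Back on the original stack, so the private one can be released. */
  free(stack);
  return coroutine_ran;
}
```

Calling run_on_private_stack() from main should return 1 once coroutine_body has executed on the malloc'd stack and control has come back through uc_link.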

You can confirm that getcontext leaves uc_stack alone by adding memset(&ctx, 0, sizeof ctx); right before the getcontext call (this needs #include <string.h>) and then observing that the uc_stack fields are all zero afterward.
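That check can be sketched as a small helper. The function name uc_stack_untouched is hypothetical, and the result is only expected to hold for C libraries that behave as described above, such as the glibc/x86-64 setup in the question.

```c
#include <string.h>
#include <ucontext.h>

/* Returns 1 if uc_stack is still all zeros after getcontext,
   0 if getcontext wrote something into it, -1 if getcontext failed. */
int uc_stack_untouched(void) {
  ucontext_t ctx;

  /* Start from a known state so leftover stack garbage cannot mislead us. */
  memset(&ctx, 0, sizeof ctx);
  if (getcontext(&ctx) != 0)
    return -1;

  return ctx.uc_stack.ss_sp == NULL &&
         ctx.uc_stack.ss_size == 0 &&
         ctx.uc_stack.ss_flags == 0;
}
```

If this returns 1, the values printed in the question were simply whatever happened to be in that part of the stack frame before the call.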

  • What do the values in uc_stack correspond to?

They are garbage.

  • How can I get the correct limits (base, size) of the stack region with syscalls, without reading /proc/self/maps?

With actual system calls [1], you can't. There is currently no good way to do this on Linux. See the discussion starting at https://sourceware.org/pipermail/libc-alpha/2022-September/141932.html; glibc has the same problem internally.

With (extended) pthread library routines, you can do this:

#define _GNU_SOURCE   // for pthread_getattr_np
#include <pthread.h>
#include <stdio.h>
#include <string.h>

int get_initial_stack_region(void **p_stackaddr, size_t *p_stacksize)
{
    int perr;
    pthread_attr_t attr;

    if ((perr = pthread_getattr_np(pthread_self(), &attr)) != 0) {
        fprintf(stderr, "pthread_getattr_np: %s\n", strerror(perr));
        return -1;
    }
    perr = pthread_attr_getstack(&attr, p_stackaddr, p_stacksize);
    pthread_attr_destroy(&attr);  // the attr object from pthread_getattr_np must be destroyed
    if (perr != 0) {
        fprintf(stderr, "pthread_attr_getstack: %s\n", strerror(perr));
        return -1;
    }
    return 0;
}

This is officially supported, but right now, under the hood, what it's doing is reading and parsing /proc/self/maps!
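One way to sanity-check the reported region is to test whether the address of a local variable falls inside it. The sketch below does that; address_is_on_my_stack and current_frame_is_on_my_stack are hypothetical names, and on older glibc versions you may need to link with -pthread.

```c
#define _GNU_SOURCE /* for pthread_getattr_np */
#include <pthread.h>
#include <stdint.h>

/* Returns 1 if addr lies inside the calling thread's stack region as
   reported by pthread_getattr_np, 0 if not, -1 on error. */
int address_is_on_my_stack(const void *addr) {
  pthread_attr_t attr;
  void *stackaddr;
  size_t stacksize;

  if (pthread_getattr_np(pthread_self(), &attr) != 0)
    return -1;
  int perr = pthread_attr_getstack(&attr, &stackaddr, &stacksize);
  pthread_attr_destroy(&attr);
  if (perr != 0)
    return -1;

  /* pthread_attr_getstack reports the lowest address of the region. */
  uintptr_t lo = (uintptr_t)stackaddr;
  uintptr_t a = (uintptr_t)addr;
  return a >= lo && a - lo < stacksize;
}

/* Convenience check using a local variable of the current frame. */
int current_frame_is_on_my_stack(void) {
  int local = 0;
  return address_is_on_my_stack(&local);
}
```

For the program in the question, current_frame_is_on_my_stack() would be expected to return 1, whereas passing the garbage uc_stack.ss_sp value to address_is_on_my_stack would be expected to return 0.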


[1] getcontext and its relatives are not system calls either; they are library routines. System calls are narrowly defined: the set of operations that is implemented by trapping into the OS kernel. This command will print an exhaustive list of system calls:

printf '#include <sys/syscall.h>\n' | 
    gcc -E -dM -xc - |
    sed -ne 's/^#define SYS_\([^ ]*\) .*$/\1/p' |
    sort

Not all of these correspond directly to functions in the C library, for various reasons including

  • it's obsolete but preserved for backward compatibility with old binaries
  • the C library wrapper function exists but it has a different name
  • nobody has gotten around to adding the C library wrapper function yet
  • it's not actually possible to use that syscall safely unless you are the C library
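To make the distinction between a system call and its library wrapper concrete, the sketch below invokes the same kernel operation both ways: through the generic syscall() entry point with a SYS_ number, and through the ordinary getpid() wrapper. The function name raw_syscall_matches_wrapper is made up for illustration.

```c
#define _GNU_SOURCE /* for syscall() */
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the raw SYS_getpid trap and the getpid() wrapper agree. */
int raw_syscall_matches_wrapper(void) {
  long raw = syscall(SYS_getpid); /* generic trap into the kernel */
  pid_t wrapped = getpid();       /* C library wrapper for the same call */
  return raw == (long)wrapped;
}
```

There is no SYS_getcontext to pass to syscall() in this way, which is consistent with getcontext being a pure library routine.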