Behaviour of segfault message depending on execution environment

Time:09-04

I am trying to understand why the behaviour of segfault messages can differ depending on execution environment.

I have the following C code that I am using to deliberately trigger a segfault:

#include <stdio.h>

int main() {
    int* p = NULL;
    printf("%d\n", *p);
}

On my local Linux, when I compile this code and execute the resulting binary, I see the segfault message even when I redirect both stdout and stderr to /dev/null:

$ ./segfault > /dev/null 2>&1
Segmentation fault (core dumped)
$

On a Docker container in Jenkins, I compile and run the exact same C code. I run the binary using the sh step like this:

sh './segfault > /dev/null || true'
sh './segfault > /dev/null 2>&1 || true'

Here is the output in Jenkins:

  ./segfault
Segmentation fault (core dumped)
  true
  ./segfault
  true

As you can see, in Jenkins the segfault message goes to stderr: it appears when I don't redirect stderr, and it disappears when I do. On my local Linux, however, the message still appears after redirecting stderr, so it is not being written to the stderr of the command.

I also ran the Docker container interactively on my local Linux system and verified that, there too, the segfault message appears in the container's interactive shell even with stdout and stderr redirected to /dev/null:

$ gcc segfault.c 
$ ./a.out >/dev/null 2>&1
Segmentation fault (core dumped)

I looked at the Java source code for the sh step, but nothing stood out to me as a cause for this different behaviour in Jenkins (but it is very possible I missed something).

My local Linux is Ubuntu 20.04. The Docker image I used on Jenkins is the gcc image. Both use x86_64 architecture.

On my local Linux, here is my kernel release and kernel version:

$ uname -rv
5.13.0-30-generic #33~20.04.1-Ubuntu SMP Mon Feb 7 14:25:10 UTC 2022

This exactly matches the kernel release and kernel version of the gcc image (at the time of writing).

My local Linux has gcc 9.4.0, and (at the time of writing) the gcc image has gcc 12.2.0.

Why is the behaviour different in Jenkins versus locally?

Is the cause one of the following? Or is it something else?

  • Difference between Docker containers and non-Docker Linux
  • Difference in gcc version I compile with
  • Some magic Jenkins thing

CodePudding user response:

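It's the shell from which ./segfault is invoked. The "Segmentation fault" message is not printed by the program itself: the shell prints it when it sees that the child process was killed by SIGSEGV, so whether a redirection affects the message depends on the shell. If you switch to dash on your computer and run the same command, you won't see the Segmentation fault message there either.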

CodePudding user response:

  1. When defining any variable or object, always initialize it.
  2. Before using a pointer, check that it is not NULL.
  3. The following sample program calls the user-defined function "mysig" when it receives the SIGSEGV signal. Dereferencing a NULL pointer makes the program receive SIGSEGV (segmentation fault). The handler "mysig" then restores the default action for SIGSEGV, so the next fault terminates the program and creates a core dump file.
  4. Before executing the program, enable core dumps:
$ ulimit -c unlimited
$ ./a.out

The program below creates a core file when it crashes. Using the core file, you can analyse where the program crashed with gdb.exe, gdb, or dbx, depending on your operating system. A gdb tutorial is available online:

https://www.tutorialspoint.com/gnu_debugger/index.htm

Sample program:

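#include <signal.h>
#include <stdio.h>
void mysig( int sig)
{
        switch( sig )
        {
                case SIGSEGV:
                        printf("Never use NULL pointer\n"); /* printf is not async-signal-safe; acceptable for a demo only */
                        signal( SIGSEGV, SIG_DFL); /* restore the default action: the next fault dumps core */
                        break;
                default:
                        printf( "Other signal: %d\n", sig);
        }
}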
int main()
{
        int* p = NULL;
        signal( SIGSEGV, mysig);
        if ( NULL == p )
        {
                printf("Never use NULL pointer\n");
        }
        else
        {
                printf("%d\n", *p);
        }
        printf( "Make call to mysig by using NULL pointer\n");
        printf("%d\n", *p);
        return 0;
}

Sample output:

$ gcc  -g -Wall 73596388.cpp -o ./a.out
$ ./a.out
Never use NULL pointer
Make call to mysig by using NULL pointer
Never use NULL pointer
Segmentation fault (core dumped)
$ # I compiled using bash.exe/gcc.exe on Windows (Cygwin):
$ gdb a.out
Reading symbols from a.out...
(gdb) break main
Breakpoint 1 at 0x1004010d8: file 73596388.cpp, line 17.
(gdb) run
Starting program: ./a.out
[New Thread 10964.0x219c]
[New Thread 10964.0x2090]
[New Thread 10964.0x1dd4]

Thread 1 "a.out" hit Breakpoint 1, main () at 73596388.cpp:17
17              int* p = NULL;
(gdb) next
18              signal( SIGSEGV, mysig);
(gdb) next
19              if ( NULL == p )
(gdb) next
21                      printf("Never use NULL pointer\n");
(gdb) next
Never use NULL pointer
27              printf( "Make call to mysig by using NULL pointer\n");
(gdb) next
Make call to mysig by using NULL pointer
28              printf("%d\n", *p);
(gdb) step

Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
0x0000000100401136 in main () at 73596388.cpp:28
28              printf("%d\n", *p);
(gdb) step
Never use NULL pointer

Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
0x0000000100401136 in main () at 73596388.cpp:28
28              printf("%d\n", *p);
(gdb) step
      0 [main] a.out 33539 cygwin_exception::open_stackdumpfile: Dumping stack trace to a.out.stackdump
[Thread 10964.0x2adc exited with code 35584]
[Thread 10964.0x2090 exited with code 35584]
[Thread 10964.0x219c exited with code 35584]
[Inferior 1 (process 10964) exited with code 0105400]
(gdb)