I built the program by classic configure, make, make install. Some months after, the program crashed. I still have the build directory where both the source and the non-stripped executable reside. From there, I call gdb like so:
530-north:courier$ gdb -q --core /tmp/core_epoch\=1667475742_pid\=23653_file\=\!usr\!local\!libexec\!courier\!courierd courierd
Reading symbols from courierd...
[New LWP 23653]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/libexec/courier/courierd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000561e841e5afd in msgq::completed(drvinfo&, unsigned long) ()
(gdb) info args
No symbol table info available.
With bt I can see a long sequence of calls between two functions:
#0 0x0000561e841e5afd in msgq::completed(drvinfo&, unsigned long) ()
#1 0x0000561e841e609a in msgq::startdelivery(drvinfo*, delinfo*) ()
#2 0x0000561e841e5bd8 in msgq::completed(drvinfo&, unsigned long) ()
#3 0x0000561e841e609a in msgq::startdelivery(drvinfo*, delinfo*) ()
#4 0x0000561e841e5bd8 in msgq::completed(drvinfo&, unsigned long) ()
...
#204 0x0000561e841e5a17 in msgq::completed(drvinfo&, unsigned long) ()
#205 0x0000561e841e609a in msgq::startdelivery(drvinfo*, delinfo*) ()
#206 0x0000561e841e5a17 in msgq::completed(drvinfo&, unsigned long) ()
#207 0x0000561e841e70fe in courierbmain() ()
#208 0x0000561e841dd030 in main ()
Each pair of calls advances the stack by 0x110 bytes, for a total of about 27 KB, which is much less than the running process's allocated 132 KB of stack, so it is not a stack overflow. The SIGSEGV could come from a null pointer or something similar. Why doesn't gdb point at it? This is GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git, BTW.
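As a sanity check of that estimate, the arithmetic can be sketched as follows. The pair count of 103 is an assumption read off the backtrace (frames #0 through #206 alternate between the two functions); only the 0x110 step comes from the observed frame addresses.

```shell
# Rough stack usage of the completed()/startdelivery() recursion.
pairs=103                          # assumed: frames #0..#206 form about 103 pairs
bytes_per_pair=$((0x110))          # 272 bytes per pair of calls
total=$((pairs * bytes_per_pair))
echo "$total bytes, about $((total / 1024)) KiB"
```

This is far below the 132 KB of stack mentioned above, which supports ruling out a stack overflow.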
If I omit the last argument to gdb, bt doesn't show the function names. Did I screw up compilation? In config.log I see I had 'CFLAGS= -march=nocona -O2 -g' 'LDFLAGS= -march=nocona -O2' 'CXXFLAGS= -march=nocona -O2 -std=c++11'. The source file is C++. Perhaps I missed some -g's? Yet, some symbols are there...
CodePudding user response:
Why doesn't gdb point at it?
Because you haven't compiled your program with appropriate debug info.
You'll have to debug this crash at the assembly level. Start with disassemble $pc and info registers.
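A minimal sketch of such a session in batch mode, assuming the core file path and executable name from the question and that the command is run from the build directory. The x/i $pc command is an addition here; it shows just the faulting instruction before the full disassembly.

```shell
# Paths are taken from the question; adjust them to your system.
core='/tmp/core_epoch=1667475742_pid=23653_file=!usr!local!libexec!courier!courierd'
exe=courierd

if [ -f "$core" ] && [ -f "$exe" ] && command -v gdb >/dev/null 2>&1; then
    # Faulting instruction, register contents, then the surrounding function.
    gdb -q --batch \
        -ex 'x/i $pc' \
        -ex 'info registers' \
        -ex 'disassemble $pc' \
        "$exe" "$core"
else
    echo "core file, executable, or gdb not found; adjust the paths above"
fi
```

Comparing the memory operand of the faulting instruction with the register values usually shows which pointer was invalid.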
The source file is C++. Perhaps I missed some -g's?
Yes: your CXXFLAGS don't have -g.
Yet, some symbols are there...
On UNIX systems (unlike Windows), function names (symbols) are present by default even without -g. There is no contradiction here.
Update:
However, if I don't pass the non-stripped file as argument, the function names are not displayed.
Yes: strip removes the symbols and debug info.
You can observe this by using a trivial test:
// t.cc
#include <cstdlib>
struct S {
  void fn() { abort(); }
};
int main()
{
  S().fn();
}
First let's see how it works when the binary is built correctly for debugging:
g++ -g t.cc -o a.out && strip ./a.out -o a.out.stripped &&
./a.out.stripped; gdb -q --batch -ex where ./a.out core
Aborted (core dumped)
...
warning: core file may not match specified executable file.
[New LWP 476070]
Core was generated by `./a.out.stripped'.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007f12444895df in __pthread_kill_internal (signo=<optimized out>, threadid=<optimized out>) at ./nptl/pthread_kill.c:89
#2 __GI___pthread_kill (threadid=<optimized out>, signo=<optimized out>) at ./nptl/pthread_kill.c:89
#3 0x00007f12445f5e70 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007f1244428469 in __GI_abort () at ./stdlib/abort.c:79
#5 0x000055de28a24165 in S::fn (this=0x7ffcd0d1d80f) at t.cc:4
#6 0x000055de28a2414d in main () at t.cc:9
Note presence of file/line info and function names. If we use the stripped version, neither is present:
Core was generated by `./a.out.stripped'.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007f12444895df in __pthread_kill_internal (signo=<optimized out>, threadid=<optimized out>) at ./nptl/pthread_kill.c:89
#2 __GI___pthread_kill (threadid=<optimized out>, signo=<optimized out>) at ./nptl/pthread_kill.c:89
#3 0x00007f12445f5e70 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007f1244428469 in __GI_abort () at ./stdlib/abort.c:79
#5 0x000055de28a24165 in ?? ()
#6 0x000055de28a2414d in ?? ()
#7 0x00007f124442920a in __libc_start_call_main (main=main@entry=0x55de28a24139, argc=argc@entry=1, argv=argv@entry=0x7ffcd0d1d928) at ../sysdeps/nptl/libc_start_call_main.h:58
#8 0x00007f12444292bc in __libc_start_main_impl (main=0x55de28a24139, argc=1, argv=0x7ffcd0d1d928, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffcd0d1d918) at ../csu/libc-start.c:389
#9 0x000055de28a24071 in ?? ()
Now let's repeat with an incorrectly built binary (which is what you have):
g++ t.cc -o b.out && strip ./b.out -o b.out.stripped &&
./b.out.stripped; gdb -q --batch -ex where ./b.out core
Aborted (core dumped)
...
warning: core file may not match specified executable file.
[New LWP 478614]
Core was generated by `./b.out.stripped'.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007f21a0a895df in __pthread_kill_internal (signo=<optimized out>, threadid=<optimized out>) at ./nptl/pthread_kill.c:89
#2 __GI___pthread_kill (threadid=<optimized out>, signo=<optimized out>) at ./nptl/pthread_kill.c:89
#3 0x00007f21a0bf5e70 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007f21a0a28469 in __GI_abort () at ./stdlib/abort.c:79
#5 0x000056049b052165 in S::fn() ()
#6 0x000056049b05214d in main ()
Notice the presence of function names (S::fn(), main) but the lack of file/line/argument info. This matches your observed result.
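The same distinction can be seen without a core file by looking at the symbol table directly. The following is a sketch that assumes g++ and GNU binutils (nm, strip) are installed; it rebuilds the trivial test without -g and lists the symbols before and after stripping.

```shell
# Recreate the trivial test program.
cat > t.cc <<'EOF'
#include <cstdlib>
struct S {
  void fn() { abort(); }
};
int main()
{
  S().fn();
}
EOF

g++ t.cc -o b.out                  # no -g, like the C++ part of courierd
strip b.out -o b.out.stripped

# The unstripped binary still has a symbol table naming the functions ...
nm -C b.out | grep -E 'S::fn|main$'

# ... while the stripped copy has none, so this counts zero matches.
nm -C b.out.stripped 2>/dev/null | grep -c 'S::fn' || true
```

Function names come from the symbol table, which the linker keeps by default; file, line and argument info comes from the debug sections that only -g produces.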
If you try again with b.out.stripped, you'll get the same result as you had from the previous run with a.out.stripped:
Core was generated by `./b.out.stripped'.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007f21a0a895df in __pthread_kill_internal (signo=<optimized out>, threadid=<optimized out>) at ./nptl/pthread_kill.c:89
#2 __GI___pthread_kill (threadid=<optimized out>, signo=<optimized out>) at ./nptl/pthread_kill.c:89
#3 0x00007f21a0bf5e70 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007f21a0a28469 in __GI_abort () at ./stdlib/abort.c:79
#5 0x000056049b052165 in ?? ()
#6 0x000056049b05214d in ?? ()
#7 0x00007f21a0a2920a in __libc_start_call_main (main=main@entry=0x56049b052139, argc=argc@entry=1, argv=argv@entry=0x7fff3554bc78) at ../sysdeps/nptl/libc_start_call_main.h:58
#8 0x00007f21a0a292bc in __libc_start_main_impl (main=0x56049b052139, argc=1, argv=0x7fff3554bc78, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff3554bc68) at ../csu/libc-start.c:389
#9 0x000056049b052071 in ?? ()
In addition, readelf --debug-dump=info courierd shows lots of Version 4 stuff.
Yes, if you run readelf --debug-dump b.out, you can observe a lot of DWARF4 stuff coming from crt0.o, crtbegin.o, etc. (depending on how your GCC and GLIBC were built).
If you have .c files linked in, these will also have DWARF4 debug info, since your CFLAGS do include -g.
But none of the DWARF4 stuff will be coming from wherever msgq::completed is defined.
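To illustrate, here is a small sketch that mimics the mixed build: a C file compiled with -g and a C++ file compiled without it, linked into one executable. The file names helper.c, main.cc and mixed are hypothetical, and gcc, g++ and readelf are assumed to be installed.

```shell
# A C source compiled with -g (like CFLAGS) ...
cat > helper.c <<'EOF'
int helper(void) { return 42; }
EOF
# ... and a C++ source compiled without -g (like CXXFLAGS).
cat > main.cc <<'EOF'
extern "C" int helper(void);
int main() { return helper() == 42 ? 0 : 1; }
EOF

gcc -g -c helper.c -o helper.o
g++    -c main.cc  -o main.o
g++ helper.o main.o -o mixed

# Only sources compiled with -g are named in the DWARF info:
# helper.c is listed, main.cc is not.
readelf --debug-dump=info mixed | grep 'DW_AT_name' | grep -E 'helper\.c|main\.cc'
```

Running the same readelf and grep on courierd, with the name of the source file that defines msgq::completed, should likewise show no match.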