My task crashed on a production server, and I downloaded the binary and the core dump. I then ran:
gdb task coredump
I can do some basic debugging in gdb, including bt, frame, info locals,
etc. I have identified a variable whose contents look weird to me. Assume I am here:
(gdb) frame 8
....
(gdb) list
...
(gdb) print d_variable
....
I now want to go a few lines up and inspect d_variable: how it got populated, and at what point the value stops being correct. I can add a breakpoint and run my task, but as far as I can tell this doesn't use the existing core dump, and I cannot reproduce the error.
The question is, can I run the same coredump, with breakpoints this time so I can inspect how this "abnormal" value occurred?
I am not very experienced with gdb, so I hope the above makes sense.
CodePudding user response:
can I run the same coredump, with breakpoints this time so I can inspect how this "abnormal" value occurred?
No.
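A core dump is only a snapshot of the process's memory and registers at the moment of the crash. It contains no execution history, so there is nothing to run or step through.
To achieve what you want, you need to record the crash under a reversible debugger, such as rr.
Since you haven't done so, your only option is to guess where the variable became corrupt, add logging, and hope that a future crash will provide more info.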
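As a rough sketch of what such logging could look like in C: the helper below records the file and line whenever the value falls outside an expected range. The name check_d_variable, the CHECK_D_VARIABLE macro, the double type, and the range limits are all assumptions made for illustration; substitute whatever "sane" means for the real d_variable.

```c
#include <stdio.h>

/* Hypothetical valid range for d_variable; the real limits depend on your program. */
#define D_VARIABLE_MIN 0.0
#define D_VARIABLE_MAX 1000.0

/*
 * Hypothetical checkpoint helper (not part of gdb or any library).
 * Returns 1 if the value is inside the expected range, 0 otherwise.
 * On failure it logs where the check ran, so a later crash report
 * narrows down the region in which the value went bad.
 * A NaN fails both comparisons and is therefore reported as well.
 */
static int check_d_variable(const char *file, int line, double value)
{
    int ok = (value >= D_VARIABLE_MIN && value <= D_VARIABLE_MAX);
    if (!ok) {
        fprintf(stderr, "%s:%d: d_variable looks wrong: %g\n", file, line, value);
    }
    return ok;
}

/* Convenience wrapper that fills in the call site automatically. */
#define CHECK_D_VARIABLE(v) check_d_variable(__FILE__, __LINE__, (v))
```

Placing CHECK_D_VARIABLE(d_variable); after each step that touches the variable means the first log line that appears points at the earliest checkpoint where the value was already wrong.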
Before doing that, I suggest:
- instrumenting all your unit tests (you do have unit tests, right?) with -fsanitize=address (assuming your platform is supported) and making sure they are AddressSanitizer-clean.
- instrumenting the actual server with AddressSanitizer and running it through a load test.
There is a high chance that the above steps will reveal one or more bugs.
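To make the unit-test suggestion concrete, here is a minimal sketch of a test that is meant to be built with AddressSanitizer. The function copy_name and its test are hypothetical stand-ins for your own code, and the compile command in the comment assumes GCC or Clang on a supported platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper under test: copies src into dst (capacity dst_size),
 * truncating if needed and always NUL-terminating.
 * Returns the number of characters copied, excluding the terminator.
 */
static size_t copy_name(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return 0;
    }
    size_t n = strlen(src);
    if (n >= dst_size) {
        n = dst_size - 1;   /* truncate rather than write past the buffer */
    }
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/*
 * Unit test. Build the test binary with something like:
 *   cc -g -fsanitize=address test.c -o test
 * As written, the test is AddressSanitizer-clean.
 */
static void test_copy_name(void)
{
    char buf[8];

    /* Short input fits entirely. */
    assert(copy_name(buf, sizeof buf, "abc") == 3);
    assert(strcmp(buf, "abc") == 0);

    /* Long input is truncated to 7 characters plus the terminator. */
    assert(copy_name(buf, sizeof buf, "a much longer name") == 7);
    assert(strcmp(buf, "a much ") == 0);
}
```

If the truncation check were missing, the second call would write past the end of buf. Without instrumentation that kind of write can silently corrupt a neighbouring variable; under AddressSanitizer the test aborts at the faulting memcpy with a stack-buffer-overflow report.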