thread_local shared_ptr object is causing sigsegv when destructing

Time:01-23

I have a program that uses a thread_local std::shared_ptr to manage some objects that are mainly accessed thread-locally. However, when the thread is joined and the thread-local shared_ptr is destroyed, I always get a SIGSEGV under the debugger if the program is compiled with MinGW (Windows 10). Here is a minimal example that reproduces the bug:

// main.cpp
#include <memory>
#include <thread>

void f() {
    thread_local std::shared_ptr<int> ptr = std::make_shared<int>(0);
}

int main() {
    std::thread th(f);
    th.join();
    return 0;
}

How to compile:

g++ main.cpp -o build\main.exe -std=c++17

Compiler version:

>g++ --version
g++ (x86_64-posix-seh-rev2, Built by MinGW-W64 project) 12.2.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Running it under gdb gives a SIGSEGV in the new thread while the main thread is waiting in join(). It works fine when compiled with gcc or clang on Linux, or with MSVC on Windows.

I debugged it and found that a contiguous segment of memory containing the thread-local shared_ptr was overwritten with repeated 0xfeeefeee before destruction, during a call to RtlpWow64SetContextOnAmd64. The frames:

RtlpWow64SetContextOnAmd64 0x00007ffd8f4deb5f
RtlpWow64SetContextOnAmd64 0x00007ffd8f4de978
SbSelectProcedure 0x00007ffd8f4ae2e0
CloseHandle 0x00007ffd8ce3655b
pthread_create_wrapper 0x00007ffd73934bac
_beginthreadex 0x00007ffd8e9baf5a
_endthreadex 0x00007ffd8e9bb02c
BaseThreadInitThunk 0x00007ffd8ec87614
RtlUserThreadStart 0x00007ffd8f4c26a1

The assembly:

...
mov    %rax,(%rdi)
movdqu %xmm0,(%rsi)               ; <------ erased here
call   0x7ffd8f491920             ; <ntdll!RtlReleaseSRWLockShared>
mov    $0x1,%r9d
mov    0x30(%rsp),%rbx
...

Later the shared_ptr is destroyed, and the SIGSEGV occurs when it reads 0xfeeefeee.

I want to know:

  • Why is MinGW (or a Windows library?) erasing the thread-local storage before destruction? In my opinion, the memory should only be erased after destruction. I noticed that if join() is replaced by detach(), the program exits normally. Maybe join() does something that instructs the new thread to erase the storage?
  • Does this behavior violate the standard? I think the standard should forbid erasing the memory before destruction. Please correct me if I'm mistaken.

Answer:

This is a long-standing, known, and still open bug in MinGW. See the corresponding issue on GitHub, which has analyses and links: https://github.com/msys2/MINGW-packages/issues/2519

Yes, this violates the standard: the program shouldn't crash. Basically, the order of destruction is incorrect, as you already suspected. 0xfeeefeee is the magic number that HeapFree() uses to mark freed memory. See e.g. this post.
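
To make the expected ordering concrete, here is a small sketch. The Probe type and the function names are made up for illustration and are not from the question. It only checks that the destructor of a thread_local object has already run by the time join() returns, which is what a conforming implementation should do. It does not reproduce the crash, because this destructor never reads from the object's own storage.

```cpp
#include <atomic>
#include <thread>

// Hypothetical flag set by the probe's destructor.
std::atomic<bool> tls_destructor_ran{false};

// Hypothetical probe type: its destructor only records that it ran.
struct Probe {
    ~Probe() { tls_destructor_ran.store(true); }
};

void worker() {
    // Initialized when this thread first reaches the declaration,
    // destroyed when the thread exits.
    thread_local Probe probe;
    (void)probe;
}

// Returns true if the thread-local destructor had already run
// by the time join() returned.
bool destructor_runs_before_join_returns() {
    tls_destructor_ran.store(false);
    std::thread th(worker);
    th.join();
    return tls_destructor_ran.load();
}
```

The bug described above breaks the other half of this ordering: the storage behind the thread_local object is released before its destructor gets to run.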

To quote lhmouse:

So here comes the rule of thumb: Don't use thread_local on GCC for MinGW targets.
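
One possible way to follow that advice, assuming the per-thread objects can be owned by the thread's entry function and passed to whatever needs them: keep the state on the thread's own stack instead of in thread_local storage. This is only a sketch. WorkerState, do_work, and run_worker are hypothetical names, and whether this fits depends on how the real program reaches its thread-local objects.

```cpp
#include <memory>
#include <thread>

// Hypothetical per-thread state, standing in for the objects from the question.
struct WorkerState {
    std::shared_ptr<int> ptr = std::make_shared<int>(0);
};

// Helpers take the state explicitly instead of reaching for a thread_local.
void do_work(WorkerState& state) {
    ++*state.ptr;
}

// Runs a worker whose state lives on the new thread's stack and
// returns the final counter value.
int run_worker() {
    int result = -1;
    std::thread th([&result] {
        WorkerState state;   // ordinary local: destroyed when the lambda returns
        do_work(state);
        do_work(state);
        result = *state.ptr;
    });
    th.join();               // the write to result happens before join() returns
    return result;
}
```

Because state is an ordinary local variable, its destructor runs as part of normal scope exit inside the thread function, so it does not depend on the runtime's thread-exit cleanup order.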
