Confused about the use of C++ (static) thread_local declared inside a function passed to (j)thread

I have started to see a few C++-related posts on Stack Overflow in which people suggest using thread_local within the function that is passed to (j)thread. For example:

How do I generate thread-safe uniform random numbers?

Say we have something like this:

#include <iostream>
#include <mutex>
#include <random>
#include <thread>

std::mutex cout_mutex; // guards std::cout so output lines do not interleave

void thread_function()
{
    static thread_local std::default_random_engine gen;
    std::uniform_real_distribution<float> dist(0.0f, 1.f);
    unsigned int a{ 1 };

    float b = a * dist(gen);

    {
        std::lock_guard<std::mutex> lock(cout_mutex);
        std::cout << "b: " << b << '\n';
    }
}

int main()
{
    std::jthread A(thread_function);
    std::jthread B(thread_function);
    A.join();
    B.join();

    return 0;
}

Aren't the random engine and the variable a both stored on the thread's stack? My understanding was that thread_local should be used like so:

I took this example from https://en.cppreference.com/w/cpp/language/storage_duration

#include <iostream>
#include <string>
#include <thread>
#include <mutex>

thread_local unsigned int rage = 1; 
std::mutex cout_mutex;

void increase_rage(const std::string& thread_name)
{
    ++rage; // modifying outside a lock is okay; this is a thread-local variable
    std::lock_guard<std::mutex> lock(cout_mutex);
    std::cout << "Rage counter for " << thread_name << ": " << rage << '\n';
}

int main()
{
    std::thread a(increase_rage, "a"), b(increase_rage, "b");
 
    {
        std::lock_guard<std::mutex> lock(cout_mutex);
        std::cout << "Rage counter for main: " << rage << '\n';
    }
 
    a.join();
    b.join();
}

Possible output:

Rage counter for a: 2
Rage counter for main: 1
Rage counter for b: 2

In this particular case, it makes sense to me: the variable rage is declared at global scope, but because it is declared thread_local, each thread owns its own copy of it and can modify that copy independently of the other threads.

But then shouldn't this be equivalent to the following?

#include <iostream>
#include <string>
#include <thread>
#include <mutex>

std::mutex cout_mutex;

void increase_rage(const std::string& thread_name)
{
    unsigned int rage = 1; 
    ++rage; // modifying outside a lock is okay; this is a thread-local variable
    std::lock_guard<std::mutex> lock(cout_mutex);
    std::cout << "Rage counter for " << thread_name << ": " << rage << '\n';
}

int main()
{
    std::thread a(increase_rage, "a"), b(increase_rage, "b");
 
    {
        std::lock_guard<std::mutex> lock(cout_mutex);
        //std::cout << "Rage counter for main: " << rage << '\n';
    }
 
    a.join();
    b.join();
}

Of course in this example rage isn't available in the main function any longer, yet this raises 2 questions:

1. Does it make sense to declare a variable thread_local inside the thread function? Or is it intended to be used only as in the cppreference example - ... (in which case the first example would be legal but useless)?
2. If it does make sense to use it inside the thread function (say in thread_function), what is the difference between the variable that is declared thread_local and the variable a that is not? To me both are stored on the thread's stack (and are "local to the thread").

Many thanks for your kind explanation.

CodePudding user response：

void thread_function()
{
    static thread_local std::default_random_engine gen;
    unsigned int a{ 1 };
}
Aren't the random engine and the variable a both stored on the thread's stack?

The gen variable is not stored on any stack, because it is declared static. A static local variable can only be named from within the block where it is declared, but other than that, it behaves exactly like a global variable. A plain static local is initialized once, the first time control passes through its declaration, and after that it continues to exist for the lifetime of the program. Upon coming back into the block for the Nth time, it has whatever value it had when some thread left the block for the (N-1)th time.

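The gen variable is also thread_local, which means that a separate instance of it exists for each thread that enters the block. Each instance is initialized the first time its own thread reaches the declaration, keeps its state between calls made on that thread, and is destroyed when that thread exits.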
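One way to check the "one instance per thread" claim is to look at the address of gen from different threads. The following is only a minimal sketch: engine_address(), Observation and observe_in_new_thread() are hypothetical names introduced for illustration and do not appear in the question.

```cpp
#include <random>
#include <thread>

// Returns the address of the calling thread's own engine instance.
const std::default_random_engine* engine_address()
{
    static thread_local std::default_random_engine gen; // one instance per thread
    return &gen;
}

// Hypothetical helper type: what a single worker thread observed.
struct Observation
{
    const std::default_random_engine* first_call;
    const std::default_random_engine* second_call;
};

// Hypothetical helper: calls engine_address() twice on a fresh thread.
Observation observe_in_new_thread()
{
    Observation seen{nullptr, nullptr};
    std::thread worker([&seen] {
        seen.first_call = engine_address();  // constructs this thread's gen
        seen.second_call = engine_address(); // reuses the same gen
    });
    worker.join(); // join() makes the worker's writes visible to the caller
    return seen;
}
```

If main() calls engine_address() itself and then observe_in_new_thread(), the two addresses seen by the worker should be equal to each other (the object persists between calls on one thread) and different from the address seen in main() (each live thread has its own object).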

Beyond that, the snippet as quoted above is a bit hard to explain. Each thread constructs a default_random_engine on its first call to thread_function(), but that engine object is never used. Also, the second line, unsigned int a{ 1 };, can simply be discarded by the compiler: the a variable appears to serve no purpose at all.

CodePudding user response：

This is a question of storage duration.

1. automatic storage duration:

void increase_rage(const std::string& thread_name)
{
    unsigned int rage = 1;

Here rage is created and destroyed within the scope of each function invocation. Each time the function is invoked, a new instance is created and initialized to 1. So it won't work for the purpose of counting invocations.

2. static storage duration:

void increase_rage(const std::string& thread_name)
{
    static unsigned int rage = 1;  // or at global scope

Here the variable is allocated once, in the program's data segment (not on the stack), and its value persists between invocations. There is only one instance, shared by all threads, so accessing it from multiple threads requires synchronization (for example with a mutex or std::atomic<unsigned int>).

3. thread storage duration:

void increase_rage(const std::string& thread_name)
{
    static thread_local unsigned int rage = 1;

Here the variable is allocated once for each thread (in thread-local storage, TLS) and deallocated when that thread ends. Its value persists between invocations on the same thread, so it can count function invocations per thread, and it does not require synchronization.

Note that static is implied when thread_local is used at block scope, so we can omit it here.
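To see the three cases side by side, here is a rough sketch that puts all of them in one function. It is not taken from cppreference: the Counts struct, the variable names and run_in_new_thread() are made up for illustration, and the shared counter uses std::atomic as one possible way of synchronizing it.

```cpp
#include <atomic>
#include <thread>

// Hypothetical result type: the three counters as seen by one call.
struct Counts
{
    unsigned int automatic;
    unsigned int per_thread;
    unsigned int shared;
};

Counts increase_rage()
{
    unsigned int automatic_rage = 1;                   // 1. automatic: re-created on every call
    static std::atomic<unsigned int> shared_rage{1};   // 2. static: one instance for all threads
    thread_local unsigned int thread_rage = 1;         // 3. thread: `static` is implied here

    ++automatic_rage;                                  // always goes from 1 to 2
    const unsigned int shared_now = ++shared_rage;     // atomic increment, no mutex needed
    ++thread_rage;                                     // private to the calling thread

    return {automatic_rage, thread_rage, shared_now};
}

// Hypothetical helper: calls increase_rage() `calls` times on a fresh thread
// and returns what the last call reported.
Counts run_in_new_thread(int calls)
{
    Counts last{0, 0, 0};
    std::thread worker([&last, calls] {
        for (int i = 0; i < calls; ++i)
            last = increase_rage();
    });
    worker.join(); // join() makes the worker's writes visible to the caller
    return last;
}
```

Calling run_in_new_thread(3) twice in a row, one thread after the other, should report automatic == 2 both times and per_thread == 4 both times (each new thread starts again from 1), while shared goes to 4 after the first thread and 7 after the second, because the static counter is the only one that survives across threads.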