I'm trying to write a small testcase to exercise std::shared_timed_mutex::try_lock_until
. Documentation on cppreference.
Here is my code
#include <thread>
#include <iostream>
#include <chrono>
#include <shared_mutex>
#include <cassert>
std::shared_timed_mutex test_mutex;
int global;
void f()
{
auto now=std::chrono::steady_clock::now();
test_mutex.try_lock_until(now std::chrono::seconds(100));
//test_mutex.lock();
--global;
std::cout << "In lock, global=" << global << '\n';
test_mutex.unlock();
}
void g()
{
auto now=std::chrono::steady_clock::now();
test_mutex.try_lock_shared_until(now std::chrono::seconds(10));
//test_mutex.lock_shared();
std::cout << "In shared lock, global=" << global << '\n';
test_mutex.unlock_shared();
}
int main()
{
global = 1;
test_mutex.lock_shared();
std::thread t1(f);
std::thread t2(g);
test_mutex.unlock_shared();
t1.join();
t2.join();
assert(global == 0);
}
What I'm expecting is
- main gets a read lock then starts f and g
- f tries to get an exclusive lock and blocks
- g gets a read lock, reads
global
then unlocks the read lock - main unlocks the read lock
- f unblocks, write to
global
, unlockes and finishes - f and g join 7 the assert is true and main ends
(2 and 3 could be in any order).
This seems to work OK on its own. Under gdb, if I put breakpoints on the read of global
in g
and the write in f
, then run, it stops on the read, as I would expect.
Howver, if I compile with -fsanitize=tthread
then I get a hazard
WARNING: ThreadSanitizer: data race (pid=6780)
Read of size 4 at 0x000000407298 by thread T2:
#0 g() /home/paulf/scratch/valgrind/drd/tests/try_lock_shared_until14.cpp:25 (try_lock_shared_until14 0x402484)
[trimmed]
#6 execute_native_thread_routine ../../../../../libstdc -v3/src/c 11/thread.cc:82 (libstdc .so.6 0xd9c83)
Previous write of size 4 at 0x000000407298 by thread T1:
#0 f() /home/paulf/scratch/valgrind/drd/tests/try_lock_shared_until14.cpp:15 [triimed]
#6 execute_native_thread_routine ../../../../../libstdc -v3/src/c 11/thread.cc:82 (libstdc .so.6 0xd9c83)
Location is global 'global' of size 4 at 0x000000407298 (try_lock_shared_until14 0x000000407298)
Under gdb the tsan version does not block on the exclusive lock and reaches the write first.
I realize that my example is not good and I should check return values and not rely on timeouts.
Can anyone explain what tsan is changing? If I use the plain lock/lock_shared/unlock/unlock_shared functions then tsan no longer complains.
(Note that I can't use DRD or Helgrind for this - I'm writing the testcase for them and I know that they do not support this at the moment, at least not on the platform that I'm using, Fedora 34 / GCC 11.2.1 amd64).
Edit:
Here is version 3, which works now. main waits on a cv got g()
to finish, then releases the shared lock, then f()
can get the exclusive lock.
#include <thread>
#include <iostream>
#include <chrono>
#include <shared_mutex>
#include <mutex>
#include <cassert>
#include <condition_variable>
std::shared_timed_mutex test_mutex;
std::mutex cv_mutex;
std::condition_variable cv;
int global;
bool reads_done = false;
void f()
{
auto now=std::chrono::steady_clock::now();
std::cout << "In lock, trying to get mutex\n";
if (test_mutex.try_lock_until(now std::chrono::seconds(3)))
{
--global;
std::cout << "In lock, global=" << global << '\n';
test_mutex.unlock();
}
else
{
std::cerr << "Lock failed\n";
}
}
void g()
{
auto now=std::chrono::steady_clock::now();
std::cout << "In shared lock, trying to get mutex\n";
if (test_mutex.try_lock_shared_until(now std::chrono::seconds(2)))
{
std::cout << "In shared lock, global=" << global << '\n';
test_mutex.unlock_shared();
}
else
{
std::cerr << "Lock shared failed\n";
}
std::unique_lock<std::mutex> lock(cv_mutex);
reads_done = true;
cv.notify_all();
}
int main()
{
global = 1;
test_mutex.lock_shared();
std::thread t1(f);
std::thread t2(g);
{
std::unique_lock<std::mutex> lock(cv_mutex);
while (!reads_done)
{
cv.wait(lock);
}
}
std::cout << "Main, reader thread done\n";
test_mutex.unlock_shared();
std::cout << "Main, no more shared locks\n";
t1.join();
t2.join();
assert(global == 0);
}
CodePudding user response:
This is also a valid scheduling scenario:
- main gets a read lock then starts f and g
- main releases read lock then joins
- f starts execution and locks for over 10 ms, e.g. due to preemption
- g starts execution and blocks for 10 ms
- g unblocks and reads the shared variable
In 5., a data race happens, and therefore ThreadSanitizer is correct to point it out. The correction of this error would require to check for the return values of try_lock_shared_until
and so on.