The problem:
I'm fairly new to C++ and was trying to implement a simple comparison using some atomicity concepts.
The problem is that I'm not getting the desired result: even after the compare_exchange_strong function appears to update the value of the atomic variable (std::atomic), it returns false.
Below is the program code:
CPP:
Action::Action(Type type, Transfer *transfer)
    : transfer(transfer),
      type(type) {
    Internal = 0;
    InternalHigh = -1;
    Offset = OffsetHigh = 0;
    hEvent = NULL;
    status = Action::Status::PENDING;
}

BOOL CancelTimeout(OnTimeoutCallback* rt)
{
    auto expected = App::Action::Status::PENDING;
    if (rt->action->status.compare_exchange_strong(expected, App::Action::Status::CANCEL))
    {
        CancelWaitableTimer(rt->hTimer);
        return true;
    }
    return false;
}
HEADER:
struct Action : OVERLAPPED {
    enum class Type : long {
        SEND,
        RECEIVE
    };
    enum class Status : long {
        PENDING,
        CANCEL,
        TIMEOUT
    };
    atomic<Status> status;
    Transfer *transfer = NULL;
    Type type;
    WSABUF *data = NULL;
    OnTimeoutCallback *timeoutCallback;
    Action(Type type, Transfer *transfer);
    ~Action();
};
On inspection, the value of the variable rt->action->status has been updated to the enum value Action::Status::CANCEL, but compare_exchange_strong returned false.
See the problem in debug:
The desired result is that the first breakpoint, on return true, is hit instead of the one on return false, given that the value of the variable was changed.
UPDATE: In the screenshot I removed the first breakpoint by accident, but I think the situation is still understandable.
Attempts already made
- Modify the structure to: enum class Status : long
- Modify the structure to: enum class Status : size_t
- Modify the positions of all structure items
Similar topics already searched [but without success]
| Link | Search term |
|---|---|
| Why does compare_exchange_strong fail with std::atomic<double>, std::atomic<float> in C++? | compare_exchange_strong fail |
| c++ compare_exchange_strong fails spuriously? | compare exchange fails |
| Don't really get the logic of std::atomic::compare_exchange_weak and compare_exchange_strong | std::atomic::compare_exchange_weak and compare_exchange_strong |
| Does C++14 define the behavior of bitwise operators on the padding bits of unsigned int? | Padding problem compare exchange |
Among several other topics with different search words
Important Notes
- The code is multi-threaded
- Nowhere else in the code is the atomic variable updated to the enum value Action::Status::CANCEL
- I suspect it has something to do with padding (based on some Google searches), but as I'm new to C++, I don't know how to modify my structure to solve the problem
- A new instance of the Action structure is created for each request, and I also made sure that there is no concurrency on the same pointer (Action*), because each change creates a new instance of the Action structure
WAIT!
It is worth mentioning that I am using Google Translate to post this question, in case something is not right, if my question is incomplete, or is formatted in an inappropriate way, please comment so I can adjust it, thank you in advance,
Lucas P.
Updates:
I was not able to replicate the problem using a minified version of the code, that being said, I have to post the entire solution (which in turn is already quite small, as it is a project for studies):
https://drive.google.com/file/d/13fP7OUCC6GeMgUtrPHSOnSGUEBwDGqBC/view?usp=sharing
Answer:
TL;DR: race condition between another thread modifying it vs. the debugger getting control and reading memory of the process being debugged.
Or the value had already been Action::Status::CANCEL for a long time, not the App::Action::Status::PENDING you passed as expected, in which case even a single thread running alone could show this behaviour. I assume your program expects this CAS to fail only when two threads try to do this at around the same time, for example by only calling this function in the first place if something was pending.
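To make the single-threaded case concrete, here is a minimal sketch. The helper try_cancel and the CasResult struct are hypothetical names invented for illustration; only the Status enum mirrors the question. It shows that when the atomic already holds CANCEL, compare_exchange_strong returns false, leaves the atomic alone, and copies the value it found into expected.

```cpp
#include <atomic>
#include <cassert>

enum class Status : long { PENDING, CANCEL, TIMEOUT };

// Hypothetical result type, for illustration only.
struct CasResult {
    bool swapped;     // return value of compare_exchange_strong
    Status observed;  // contents of `expected` after the call
    Status stored;    // value held by the atomic after the call
};

// Hypothetical helper: attempt PENDING -> CANCEL on an atomic
// that starts out holding `initial`.
CasResult try_cancel(Status initial) {
    std::atomic<Status> status{initial};
    Status expected = Status::PENDING;
    // On failure, compare_exchange_strong writes the value it actually
    // found into `expected` and does not modify the atomic.
    bool swapped = status.compare_exchange_strong(expected, Status::CANCEL);
    return {swapped, expected, status.load()};
}
```

Inspecting expected right after a failed CAS is a cheap way to learn what the variable really held at that moment, without relying on what the debugger reads later.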
I assume there's another thread that could call CancelTimeout at the same time, otherwise you wouldn't need an atomic RMW. (If this were the only thread that modified it, you'd just check the value, and do a pure store of the new value after a manual compare, like .store(CANCEL), perhaps with std::memory_order_release or relaxed.)
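As a rough sketch of that single-writer alternative (the function name cancel_single_writer is made up for this example, and it assumes no other thread ever writes the variable), a manual compare followed by a plain store could look like this:

```cpp
#include <atomic>
#include <cassert>

enum class Status : long { PENDING, CANCEL, TIMEOUT };

// Assumption: this is the ONLY thread that ever writes `status`.
// With two writers, the load and the store could interleave and both
// callers would believe they performed the cancellation.
bool cancel_single_writer(std::atomic<Status>& status) {
    if (status.load(std::memory_order_relaxed) != Status::PENDING)
        return false;  // nothing pending, nothing to cancel
    status.store(Status::CANCEL, std::memory_order_release);
    // Holds only under the single-writer assumption stated above.
    assert(status.load(std::memory_order_relaxed) == Status::CANCEL);
    return true;
}
```

Once a second thread can also write the status, this pattern is no longer safe and the compare_exchange_strong version from the question is the right tool.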
This would explain your observations:
- Another thread won the race to modify rt->action->status, so its CAS returned true.
- CAS_strong in this thread didn't modify the variable, and returned false.
- The if body in this thread didn't run, so this thread hit your breakpoint.
- After the debugger eventually got control and all threads of the process were paused, the debugger asked the kernel to read memory of the process being debugged. Since our CAS failed, the other thread's update of rt->action->status must have already happened, so the debugger will see it.
(Especially after all the time it takes for the debugger to get control, the dust will have time to settle. But assuming you're using an x86 or ARMv8, stores in one thread being visible to any other thread mean they're globally visible, to all threads; those ISAs are multi-copy atomic, no IRIW reordering.)
So CAS failed precisely because some other thread already changed the value. It wasn't changed by the thread where CAS failed. Your breakpoint will trigger whenever CAS fails, regardless of the value before or after the CAS.
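The race described above can be reproduced in isolation. The following is only a sketch under assumed names (race_cancel and RaceResult are hypothetical, not part of the question's project): several threads attempt the same PENDING to CANCEL transition, exactly one CAS succeeds, and every losing thread finds CANCEL in its own expected afterwards.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

enum class Status : long { PENDING, CANCEL, TIMEOUT };

// Hypothetical result type, for illustration only.
struct RaceResult {
    int winners;            // threads whose CAS returned true
    int losers_saw_cancel;  // losing threads whose `expected` became CANCEL
    Status final_status;    // value of the atomic once all threads finished
};

// Hypothetical helper: let `thread_count` threads race on one atomic.
RaceResult race_cancel(int thread_count) {
    std::atomic<Status> status{Status::PENDING};
    std::atomic<int> winners{0};
    std::atomic<int> losers_saw_cancel{0};

    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&] {
            Status expected = Status::PENDING;
            if (status.compare_exchange_strong(expected, Status::CANCEL)) {
                ++winners;  // this thread performed the update
            } else if (expected == Status::CANCEL) {
                ++losers_saw_cancel;  // another thread had already updated it
            }
        });
    }
    for (auto& t : threads) t.join();
    return {winners.load(), losers_saw_cancel.load(), status.load()};
}
```

A breakpoint on the failure path of any losing thread would show the variable as CANCEL, which matches what the question reports, even though that thread never wrote it.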
For CAS_strong to actually return false and update the value, your compiler or CPU would have to be buggy. Those are possible (especially a compiler bug), but are extraordinary claims that require very carefully ruling out software causes of the same observations. That should never be your first guess when you haven't yet sorted out all the details and aren't sure you understand everything that's going on.
If you think a primitive operation didn't do what the docs said it does, it's almost always actually a bug somewhere else, or missing some possible explanation for what you're seeing that doesn't require a compiler bug to explain.
It's fine to ask a Stack Overflow question about what's going on, but keep in mind when writing your title that it's extremely unlikely that your C++ compiler is actually broken.