Why in my code cpp compare_exchange_strong updates and return false

Time:10-04

The problem:


So I'm pretty new to C++, and I was trying to implement a simple comparison using some atomicity concepts.

The problem is that I'm not getting the desired result: even after compare_exchange_strong updates the value of the atomic variable (std::atomic), it returns false.

Below is the program code:

CPP:
Action::Action(Type type, Transfer *transfer)
: transfer(transfer),
  type(type) {
    Internal = 0;
    InternalHigh = -1;
    Offset = OffsetHigh = 0;
    hEvent = NULL;
    status = Action::Status::PENDING;
}

BOOL CancelTimeout(OnTimeoutCallback* rt)
{
    auto expected = App::Action::Status::PENDING;

    if (rt->action->status.compare_exchange_strong(expected, App::Action::Status::CANCEL)) 
    {
        CancelWaitableTimer(rt->hTimer);

        return true;
    }

    return false;
}

HEADER:
struct Action : OVERLAPPED {
    enum class Type : long {
        SEND,
        RECEIVE
    };

    enum class Status : long {
        PENDING,
        CANCEL,
        TIMEOUT
    };
    
    atomic<Status> status;
    Transfer *transfer = NULL;
    Type type;
    WSABUF *data = NULL;
    OnTimeoutCallback *timeoutCallback;

    Action(Type type, Transfer *transfer);

    ~Action();
};

On inspection, the value of rt->action->status has been updated to Action::Status::CANCEL, but compare_exchange_strong returned false.

See the problem in the debugger:

[Screenshot: the problem as it appears in the Visual Studio Community debugger]

The desired result is that the first breakpoint, on return true, is hit instead of the one on return false, given that the value of the variable was changed.

UPDATE: In the screenshot I removed the first breakpoint by accident, but I think it is still understandable.

Attempts already made


  1. Modify the structure to: enum class Status : long
  2. Modify the structure to: enum class Status : size_t
  3. Modify the positions of all structure items

Similar topics already searched

[but without success]

Link | Search term
Why does compare_exchange_strong fail with std::atomic<double>, std::atomic<float> in C++? | compare_exchange_strong fail
cpp compare_exchange_strong fails spuriously? | compare exchange fails
Don't really get the logic of std::atomic::compare_exchange_weak and compare_exchange_strong | std::atomic::compare_exchange_weak and compare_exchange_strong
Does C++14 define the behavior of bitwise operators on the padding bits of unsigned int? | Padding problem compare exchange

Among several other topics with different search words

Important Notes


  1. The code is multi-threaded
  2. There is nowhere else in the code where the value of the atomic variable is being updated to the enum Action::Status::CANCEL
  3. I suspect it's something to do with padding (due to some Google searches), but as I'm new to CPP, I don't know how to modify my framework to solve the problem
  4. A new instance of the Action structure is generated at each request, and I also made sure that there is no concurrency occurring on the same pointer (Action*), because with each change, a new instance of the Action structure is generated

WAIT!


It is worth mentioning that I am using Google Translate to post this question, in case something is not right, if my question is incomplete, or is formatted in an inappropriate way, please comment so I can adjust it, thank you in advance,

Lucas P.

Updates:


I was not able to replicate the problem using a minified version of the code, that being said, I have to post the entire solution (which in turn is already quite small, as it is a project for studies):

https://drive.google.com/file/d/13fP7OUCC6GeMgUtrPHSOnSGUEBwDGqBC/view?usp=sharing

Answer:

TL;DR: there is a race between another thread modifying the variable and the debugger getting control and reading the memory of the process being debugged.

Or the value had already been Action::Status::CANCEL for a long time, rather than the expected App::Action::Status::PENDING, in which case even a single thread running alone would show this behaviour. I assume your program expects this CAS to fail only when two threads try to do this at around the same time, for example by only calling this function when something is pending.
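The single-threaded case can be checked in isolation. The following is a minimal sketch, not the code from the question: Status mirrors the enum in the question, while CasResult, try_cancel and cancel_when_already_cancelled are hypothetical names introduced here. It relies on the documented behaviour that a failed compare_exchange_strong writes the value it read into expected and leaves the atomic untouched.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for App::Action::Status from the question.
enum class Status : long { PENDING, CANCEL, TIMEOUT };

// Hypothetical result type: whether the CAS swapped, and which value it saw.
struct CasResult {
    bool swapped;
    Status observed;
};

// Hypothetical helper: attempt PENDING -> CANCEL and report what was observed.
CasResult try_cancel(std::atomic<Status>& status) {
    Status expected = Status::PENDING;
    bool swapped = status.compare_exchange_strong(expected, Status::CANCEL);
    // On failure, `expected` now holds the value that was actually read.
    // On success, `expected` is still PENDING.
    return {swapped, expected};
}

// Single-threaded scenario: the status is already CANCEL before the call.
CasResult cancel_when_already_cancelled() {
    std::atomic<Status> status{Status::CANCEL};
    CasResult r = try_cancel(status);
    assert(status.load() == Status::CANCEL);  // a failed CAS does not modify it
    return r;
}
```

If observed comes back as CANCEL, the variable was already CANCEL before this call, which matches what a debugger would show afterwards. Logging that value on the failure path is usually more reliable than inspecting memory once the debugger has stopped the process.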


I assume there's another thread that could call CancelTimeout at the same time, otherwise you wouldn't need an atomic RMW. (If this was the only thread that modified it, you'd just check the value, and do a pure store of the new value after a manual compare, like .store(CANCEL), perhaps with std::memory_order_release or relaxed.)
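As a rough sketch of that single-writer alternative (assuming, loudly, that exactly one thread ever writes the status and all other threads only read it), a plain load followed by a plain store is enough. The function name cancel_single_writer is hypothetical, and Status mirrors the enum from the question.

```cpp
#include <atomic>

// Stand-in for App::Action::Status from the question.
enum class Status : long { PENDING, CANCEL, TIMEOUT };

// ASSUMPTION: only one thread ever calls this or otherwise writes `status`.
// With several writers the load and the store could interleave with another
// thread's write, and compare_exchange_strong would be required instead.
bool cancel_single_writer(std::atomic<Status>& status) {
    // Manual compare: nothing else can change the value between this load
    // and the store below, because this thread is the only writer.
    if (status.load(std::memory_order_relaxed) != Status::PENDING) {
        return false;
    }
    // Pure store; release ordering publishes earlier writes to readers
    // that load the status with acquire ordering.
    status.store(Status::CANCEL, std::memory_order_release);
    return true;
}
```

The first call on a PENDING status returns true and leaves it at CANCEL; any later call returns false without touching it.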

This would explain your observations:

  • Another thread won the race to modify rt->action->status, so its CAS returned true.

  • CAS_strong in this thread didn't modify the variable, and returned false.

  • The if body in this thread didn't run, so this thread hit your breakpoint.

  • After the debugger eventually got control and all threads of the process were paused, the debugger asked the kernel to read memory of the process being debugged. Since our CAS failed, the other thread's update of rt->action->status must have already happened, so the debugger will see it.

    (Especially after all the time it takes for the debugger to get control, the dust will have time to settle. But assuming you're using an x86 or ARMv8, stores in one thread being visible to any other thread mean they're globally visible, to all threads; those ISAs are multi-copy atomic, no IRIW reordering.)

So CAS failed precisely because some other thread already changed the value. It wasn't changed by the thread where CAS failed. Your breakpoint will trigger whenever CAS fails, regardless of the value before or after the CAS.
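The race described above can be reproduced with a small sketch. This is an illustration under assumptions, not the asker's program: two threads both attempt PENDING -> CANCEL on the same atomic, and RaceOutcome and race_to_cancel are hypothetical names. Exactly one CAS can succeed, and the losing thread sees CANCEL in its expected variable even though it never wrote anything.

```cpp
#include <atomic>
#include <thread>

// Stand-in for App::Action::Status from the question.
enum class Status : long { PENDING, CANCEL, TIMEOUT };

// Hypothetical summary of one race between two cancelling threads.
struct RaceOutcome {
    int winners;           // CAS calls that returned true
    int losers;            // CAS calls that returned false
    Status seen_by_loser;  // value the failed CAS wrote into `expected`
    Status final_value;    // value of the atomic after both threads joined
};

RaceOutcome race_to_cancel() {
    std::atomic<Status> status{Status::PENDING};
    std::atomic<int> winners{0};
    std::atomic<int> losers{0};
    std::atomic<Status> seen_by_loser{Status::PENDING};

    auto attempt = [&]() {
        Status expected = Status::PENDING;
        if (status.compare_exchange_strong(expected, Status::CANCEL)) {
            winners.fetch_add(1);
        } else {
            // This thread did not modify `status`; it only observed the
            // value that the winning thread had already installed.
            seen_by_loser.store(expected);
            losers.fetch_add(1);
        }
    };

    std::thread first(attempt);
    std::thread second(attempt);
    first.join();
    second.join();

    return {winners.load(), losers.load(), seen_by_loser.load(), status.load()};
}
```

Whichever thread runs its CAS second takes the failure branch, which is the branch the breakpoint on return false corresponds to, and the value it finds is already CANCEL.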


For CAS_strong to actually return false and update the value, your compiler or CPU would have to be buggy. Those are possible (especially a compiler bug), but are extraordinary claims that require very carefully ruling out software causes of the same observations. That should never be your first guess when you haven't yet sorted out all the details and aren't sure you understand everything that's going on.

If you think a primitive operation didn't do what the docs say it does, it's almost always actually a bug somewhere else, or you're missing some possible explanation for what you're seeing that doesn't require a compiler bug.

It's fine to ask a Stack Overflow question about what's going on, but keep in mind when writing your title that it's extremely unlikely that your C++ compiler is actually broken.
