Passing std::memory_order as constexpr differs from when it is not constexpr

Time:04-16

This is a question regarding std::memory_order.

I have created the following two functions which are to be run on separate threads.

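#include <atomic>
#include <cassert>

// Assumed declarations: the question does not show how x and y are defined.
std::atomic<bool> x{false};
std::atomic<bool> y{false};

constexpr std::memory_order order = std::memory_order_relaxed;

void write()  // writer thread
{
    x.store(true, order);
    y.store(true, order);
}

void read()   // reader thread
{
    while (!y.load(order)) {}  // spin until the store to y becomes visible
    assert(x.load(order));     // relaxed ordering does not guarantee x is visible yet
}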

On an Arm host, the assert can fire, because the memory order I pass to the store and load methods is std::memory_order_relaxed. However, if I remove the "constexpr", the assert never fires.

It seems to me that the std::memory_order passed to the store and load methods must be determined at compile time, and that if it is deferred to run time, the behaviour is undefined.

Am I correct? If so, is there any C++ documentation emphasizing this limitation?

CodePudding user response:

constexpr doesn't matter. The assertion can (but doesn't need to) fail in either case. You are just getting lucky.

There is no undefined behavior if the order is not a compile-time constant.

One possibility (although it is impossible to tell without seeing the generated assembly) is that, without the constexpr, the compiler was not able to determine that the value of order is a compile-time constant, and so did not apply some optimization/transformation based on that.

CodePudding user response:

As user17732522 says, it is perfectly legal for the memory_order to not be a constant, and the behavior is well defined. There is nowhere in the standard that says it must be a constant.

I am guessing you are using gcc as your compiler. It appears that with gcc specifically, when the memory_order argument is not a compile-time constant, it simply ignores the argument and emits an unconditional seq_cst operation. This is certainly legal; it is always safe to provide stronger ordering than you requested. They probably think that the performance difference between seq_cst and a potentially weaker order is small, and not worth the cost of a sequence of tests and branches to select different instructions at runtime.

(Hopefully it goes without saying that you should not rely on this behavior! If you only request relaxed ordering, your code had better be correct in the event you actually get relaxed ordering.)
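For comparison, here is a minimal sketch of a version in which the assert cannot fire, however the compiler lowers the order argument. It swaps the relaxed operations on y for a release store and an acquire load. The names write_release, read_acquire and run_once are made up for this illustration, and x and y are assumed to be std::atomic<bool> objects that start out false.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Assumed declarations, mirroring the question.
std::atomic<bool> x{false};
std::atomic<bool> y{false};

void write_release()  // writer thread
{
    x.store(true, std::memory_order_relaxed);
    // Release: the store to x above cannot be reordered after this store.
    y.store(true, std::memory_order_release);
}

void read_acquire()   // reader thread
{
    // Acquire: once this load observes true, the writer's earlier
    // store to x is visible to this thread.
    while (!y.load(std::memory_order_acquire)) {}
    assert(x.load(std::memory_order_relaxed));  // cannot fire
}

// Hypothetical driver: runs one writer/reader pair to completion.
void run_once()
{
    x.store(false);
    y.store(false);
    std::thread reader(read_acquire);
    std::thread writer(write_release);
    writer.join();
    reader.join();
}
```

Calling run_once() from main exercises the pair once. The accesses to x can stay relaxed because the release/acquire pair on y already establishes the required happens-before relationship.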

Try it on godbolt. With gcc, on all the architectures I tested, the function unknown() compiles to the same assembly as cst(), which differs from rel() (relaxed store).
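The code behind the godbolt link is not reproduced here, so the following is only a guess at the shape of the three functions. The names rel(), cst() and unknown() come from the text above; the variable flag and the function bodies are assumptions.

```cpp
#include <atomic>

// Hypothetical shared variable for the experiment.
std::atomic<bool> flag{false};

// Order is a compile-time constant: relaxed store.
void rel() { flag.store(true, std::memory_order_relaxed); }

// Order is a compile-time constant: sequentially consistent store.
void cst() { flag.store(true, std::memory_order_seq_cst); }

// Order is only known at run time, because it arrives as a parameter.
void unknown(std::memory_order order) { flag.store(true, order); }
```

Compiling this with optimization enabled and comparing the assembly of unknown() against rel() and cst() should show which strategy a given compiler uses. Only relaxed, release and seq_cst are valid orders for a store, so callers of unknown() should pass one of those.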

Testing a couple other compilers:

  • icc: emits seq_cst unconditionally

  • icx: test and branch

  • clang: test and branch
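To make "test and branch" concrete, here is a hand-written sketch of what selecting the store at run time amounts to. It is an illustration only, not the code any of these compilers actually generates; store_dispatch and flag are hypothetical names.

```cpp
#include <atomic>

// Hypothetical shared variable.
std::atomic<bool> flag{false};

// Tests the requested order at run time and branches to a store
// whose order is a compile-time constant.
void store_dispatch(bool value, std::memory_order order)
{
    switch (order) {
    case std::memory_order_relaxed:
        flag.store(value, std::memory_order_relaxed);
        break;
    case std::memory_order_release:
        flag.store(value, std::memory_order_release);
        break;
    default:
        // Anything else gets the strongest order, which is always safe.
        flag.store(value, std::memory_order_seq_cst);
        break;
    }
}
```

The alternative strategy, emitting seq_cst unconditionally, corresponds to keeping only the default branch: one instruction sequence and no comparison, at the price of a possibly stronger barrier than the caller asked for.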
