Why does the g++ -O2 option make unsigned wrap-around stop working?

Time:12-02

I was trying to write a queue in C++, and I learned from Intel DPDK's libring that I can do so with code like the following, which relies on the unsigned wrap-around property:

#include <cstdio>
#include <cassert>
#include <atomic>
#include <thread>

size_t global_r = 0, global_w = 0, mask_ = 3;

void emplace() {
  unsigned long local_w, local_r, free_entries = 0;
  local_w = global_w;
  while (free_entries == 0) {
    local_r = global_r;
    free_entries = (mask_ + local_r - local_w);
  }
  fprintf(stderr, "%lu\n", free_entries);
  auto w_next = local_w + 1;
  std::atomic_thread_fence(std::memory_order_release);
  global_w = w_next;
}

void pop() {
  unsigned long local_r = global_r;
  unsigned long r_next = local_r + 1;
  // make sure nobody can write to it before destruction
  std::atomic_thread_fence(std::memory_order_release);
  global_r = r_next;
}

int main() {
  std::jthread([]() -> void {
    int i = 10;
    while (i-- >= 0) emplace();
  });
  std::jthread([]() -> void {
    int i = 10;
    while (i-- >= 0) pop();
  });
  return 0;
}

When I compile it with g++ -O0 and with -O2, it produces different results. With -O2:

3
2
1
0
18446744073709551615
18446744073709551614
18446744073709551613
18446744073709551612
18446744073709551611
18446744073709551610
18446744073709551609

Without -O2:

3
2
1
..... (hangs for a long time)

Is there anything wrong with my understanding of unsigned wrap-around? (I learned from several Stack Overflow posts and other references that unsigned wrap-around is defined behavior.)

CodePudding user response:

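Are you aware that once global_w has been incremented to 3, the while loop in emplace() becomes an infinite loop? With mask_ = 3 and global_r still 0, mask_ + local_r - local_w evaluates to 0 on every iteration. AFAIK, an infinite loop that performs no I/O, volatile access, atomic operation, or synchronization is undefined behavior in C++, so the optimizer is allowed to assume that the loop terminates.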
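
The arithmetic behind that observation can be checked in isolation. This is only a sketch: free_entries() is a hypothetical helper, not part of the question's code, that repeats the expression from emplace() on fixed-width 64-bit values. It suggests that the wrap-around itself is well defined and that the huge numbers in the -O2 output are just 0 - 1, 0 - 2, and so on, modulo 2^64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: free slots in a ring whose usable capacity is `mask`,
// computed exactly like the expression in emplace().
constexpr std::uint64_t free_entries(std::uint64_t mask, std::uint64_t r,
                                     std::uint64_t w) {
  return mask + r - w;  // unsigned arithmetic is performed modulo 2^64
}

// Reader at 0, writer advancing: 3, 2, 1, then 0 (the queue is full).
static_assert(free_entries(3, 0, 0) == 3, "empty queue");
static_assert(free_entries(3, 0, 3) == 0, "full queue");
// If the writer advances anyway, the result wraps around to 2^64 - 1.
static_assert(free_entries(3, 0, 4) == UINT64_MAX, "wrapped result");
// The indices themselves may wrap: r = 2^64 - 1 and w = 1 (two entries in use).
static_assert(free_entries(3, UINT64_MAX, 1) == 1, "indices wrapped");
```

The third case matches the 18446744073709551615 printed by the -O2 build: the loop that should have waited at 0 was skipped, and the writer kept going.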

I believe your problem comes from the fact that you create the std::jthread objects as temporaries. A temporary is destroyed at the end of the full expression in which it is created, and the std::jthread destructor joins the thread. Consequently, the first thread must finish before the second one is even started, so the two threads never run in parallel.
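
A minimal sketch of that effect follows. JoiningThread is a hypothetical stand-in that mimics only the join-on-destruction behavior of std::jthread (so the sketch also builds before C++20), and run_as_temporaries() is an illustrative name as well.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for std::jthread: joins the thread in its destructor.
struct JoiningThread {
  std::thread t;
  template <class F>
  explicit JoiningThread(F&& f) : t(std::forward<F>(f)) {}
  ~JoiningThread() {
    if (t.joinable()) t.join();
  }
};

// Records the order in which the two thread bodies run.
std::string run_as_temporaries() {
  std::string order;  // no lock needed: the two threads never overlap
  JoiningThread([&order] { order += 'A'; });  // temporary: joined at this ';'
  JoiningThread([&order] { order += 'B'; });  // starts only after A is done
  return order;
}
```

Under these assumptions run_as_temporaries() always returns "AB", because each temporary is joined before the next statement begins. In the question's program this means pop() can never free a slot while emplace() is waiting for one.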

You can easily change that by creating thread variables, which will be destructed at the end of main():

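int main()
{
  // std::jthread joins in its destructor; a plain std::thread that is still
  // joinable when destroyed would call std::terminate().
  std::jthread t1([]() -> void {  // note the "t1" variable name
    int i = 10;
    while (i-- >= 0) emplace();
  });

  std::jthread t2([]() -> void {  // note the "t2" variable name
    int i = 10;
    while (i-- >= 0) pop();
  });
}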

However, even then, I think you have a data race on global_r, which is undefined behavior as well. Because global_r is a plain (non-atomic) variable that another thread writes without synchronization, the compiler may assume that emplace() has exclusive access to it and hoist the read local_r = global_r; out of the loop, effectively removing it.
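
One possible way to remove the data race is to make both indices std::atomic. The following is a sketch under stated assumptions, not a drop-in fix: it assumes exactly one producer and one consumer, it adds an "empty" wait to pop() that the original code does not have, and run_demo() is a hypothetical driver used only to exercise it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Assumption: exactly one producer thread and one consumer thread.
std::atomic<std::size_t> global_r{0}, global_w{0};
constexpr std::size_t mask_ = 3;

// Claims one slot; spins until the consumer has freed one.
std::size_t emplace() {
  const std::size_t local_w = global_w.load(std::memory_order_relaxed);
  std::size_t free_entries = 0;
  while (free_entries == 0) {
    // An atomic load cannot be hoisted out of the loop by the compiler.
    free_entries = mask_ + global_r.load(std::memory_order_acquire) - local_w;
  }
  global_w.store(local_w + 1, std::memory_order_release);
  return free_entries;
}

// Releases one slot; waits while the queue is empty (added for this sketch).
void pop() {
  const std::size_t local_r = global_r.load(std::memory_order_relaxed);
  while (global_w.load(std::memory_order_acquire) == local_r) {
  }
  global_r.store(local_r + 1, std::memory_order_release);
}

// Hypothetical driver: n emplace() calls racing against n pop() calls.
bool run_demo(int n) {
  bool in_range = true;  // written only by the producer, read after join()
  std::thread producer([&] {
    for (int i = 0; i < n; ++i) {
      const std::size_t f = emplace();
      if (f < 1 || f > mask_) in_range = false;
    }
  });
  std::thread consumer([n] {
    for (int i = 0; i < n; ++i) pop();
  });
  producer.join();
  consumer.join();
  return in_range && global_w.load() == global_r.load();
}
```

Because both waiting loops now contain atomic operations, they are no longer loops that the optimizer may assume to terminate, and free_entries should stay within 1..mask_ for every successful emplace().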

Live demo of this type of problem: https://godbolt.org/z/566WP9n36.
