Is Thread.yield guaranteed to flush reads/writes to memory?

Time:09-17

Let's say I have this code, which exhibits stale cache reads by a thread that prevent it from exiting its while loop.

class MyRunnable implements Runnable {
    boolean keepGoing = true; // volatile fixes visibility
    @Override public void run() {
        while ( keepGoing ) {
            // synchronized (this) { } // fixes visibility
            // Thread.yield(); // fixes visibility
            System.out.println(); // fixes visibility 
        }
    }
}
class Example {
    public static void main(String[] args) throws InterruptedException{
        MyRunnable myRunnable = new MyRunnable();
        new Thread(myRunnable).start();
        Thread.sleep(100);  
        myRunnable.keepGoing = false;
    }
}

I believe the Java Memory Model guarantees that all writes to a volatile variable synchronize with all subsequent reads from any thread, so that solves the problem.

If my understanding is correct the code generated by a synchronized block also flushes out all pending reads and writes, which serves as a kind of "memory barrier" and fixes the issue.

From practice I've seen that inserting yield and println also makes the variable change visible to the thread and it exits correctly. My question is:

Is yield/println/io serving as a memory barrier guaranteed in some way by the JMM or is it a lucky side effect that cannot be guaranteed to work?

CodePudding user response:

No. It is not guaranteed, by either the JLS or the javadocs for the classes or methods you are using there.

In current implementations, there are in practice memory barriers in yield() and println. (If you were to dig deeply into the implementation code, you should be able to figure out how they come about and what purpose they serve.)

However, there is no guarantee that these memory barriers will exist for all implementations of Java [1] on all platforms. The specs do not specify that the happens-before relations exist [2], and therefore they do not require [3] memory barriers to be inserted.

Hypothetically:

  • Suppose that Thread.yield() was implemented as a no-op. (In the same way that System.gc() can be a no-op.)

  • Suppose that the output stream stack was optimized in such a way that synchronization was no longer needed under the hood. For example, suppose that the JVM could deduce that a particular output stream was thread-confined, and there was no need for a memory barrier when writing to its buffer.

Now I don't personally think that those changes are likely to happen. (And they may not even be feasible.) But if they did happen, quite a few "broken" applications that currently depend on those serendipitous memory barriers would most likely stop working.

The point is: if you want guarantees, rely on what the specs say. The specs are the only real guarantee ... if your code needs to be portable.


1 - In particular, future ones.
2 - Indeed, as Holger's answer explains, the JLS passage quoted there clearly states that you cannot assume or rely on any synchronizing behavior happening for a yield(). That clearly means that there is no happens-before between the yield() and any action on any other thread.
3 - The memory barriers are in fact an implementation detail. They are used by a typical compiler to implement the JMM's visibility guarantees. It is the guarantees that are the key, not the strategy used to implement them. Thus, any discussion of memory barriers, caches, registers, and so on is beside the point when you are trying to work out if multi-threaded code is correct.

CodePudding user response:

Let's say I have this code, which exhibits stale cache reads by a thread that prevent it from exiting its while loop.

If you are referring to CPU caches, then this is a bad mental model (and, in any case, not a suitable model for reasoning about the JMM). Caches on modern CPUs are always coherent.

I believe the Java Memory Model guarantees that all writes to a volatile variable synchronize with all subsequent reads from any thread, so that solves the problem.

That is correct. There is a happens-before edge between a write of a volatile variable and all subsequent reads of the same volatile variable.
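As a minimal sketch of the volatile fix, the question's runnable could be rewritten as below. The class name VolatileStopFlag, the stop() method, and the runAndStop() helper are hypothetical names introduced only for this illustration.

```java
class VolatileStopFlag implements Runnable {
    // volatile: the write in stop() happens-before every subsequent read in run()
    private volatile boolean keepGoing = true;

    @Override
    public void run() {
        while (keepGoing) {
            // busy work; the loop re-reads the volatile field on every iteration
        }
    }

    void stop() {
        keepGoing = false;
    }

    // Hypothetical helper: starts the worker, stops it, and reports whether it exited.
    static boolean runAndStop() throws InterruptedException {
        VolatileStopFlag task = new VolatileStopFlag();
        Thread worker = new Thread(task);
        worker.setDaemon(true); // do not keep the JVM alive if something goes wrong
        worker.start();
        Thread.sleep(100);
        task.stop();
        worker.join(5_000);     // wait up to five seconds for the loop to finish
        return !worker.isAlive();
    }
}
```

Because the flag is volatile, the JIT is not allowed to hoist the read out of the loop, so the worker is expected to terminate shortly after stop() is called.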

If my understanding is correct the code generated by a synchronized block also flushes out all pending reads and writes, which serves as a kind of "memory barrier" and fixes the issue.

It is dangerous to reason in terms of memory barriers in combination with the JMM.

https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/#myth-barriers-are-sane

There is a happens-before edge between the release of a monitor and any subsequent acquire of that same monitor. So if every access to the keepGoing variable is protected by the same lock, there is no data race.
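A sketch of what "protected by a lock" could look like for the question's code, assuming both the reader and the writer go through synchronized methods on the same object. LockedStopFlag, shouldKeepGoing(), stop(), and runAndStop() are hypothetical names used only for illustration.

```java
class LockedStopFlag implements Runnable {
    private boolean keepGoing = true; // guarded by the monitor of "this"

    // Reader side: acquires the monitor before reading the flag.
    private synchronized boolean shouldKeepGoing() {
        return keepGoing;
    }

    // Writer side: releases the same monitor after writing the flag.
    synchronized void stop() {
        keepGoing = false;
    }

    @Override
    public void run() {
        while (shouldKeepGoing()) {
            // busy work
        }
    }

    // Hypothetical helper: starts the worker, stops it, and reports whether it exited.
    static boolean runAndStop() throws InterruptedException {
        LockedStopFlag task = new LockedStopFlag();
        Thread worker = new Thread(task);
        worker.setDaemon(true);
        worker.start();
        Thread.sleep(100);
        task.stop();
        worker.join(5_000);
        return !worker.isAlive();
    }
}
```

The important detail is that both threads synchronize on the same monitor. An empty synchronized block in the reader alone, as in the question, has no matching release in the writer and therefore creates no happens-before edge.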

Is yield/println/io serving as a memory barrier guaranteed in some way by the JMM or is it a lucky side effect that cannot be guaranteed to work?

Check the JLS and you will see that there is no happens-before edge between two yields. Perhaps there is a CPU memory barrier involved, but the problem could arise before the code even reaches the CPU. E.g. the JIT might optimize the code to:

// keepGoing is read once, before the loop, and never again
if (!keepGoing) {
    return;
}

while (true) {
    Thread.yield();
    System.out.println();
}

So in this case the code is already 'broken' before it is executed on the CPU, since it will never see the updated value of the 'keepGoing' variable.

I'm not sure whether Thread.yield() acts as a compiler barrier. If it does, then the JIT can't optimize out the load or store. But none of this is part of the specification.

CodePudding user response:

Nothing in the specification guarantees flushing of any kind. This simply is the wrong mental model, assuming that there has to be something like a main memory that maintains a global state. But an execution environment could have local memory at each CPU without a main memory at all. So CPU 1 sending updated data to CPU 2 would not imply that CPU 3 knows about it.

In practice, systems have a main memory, but caches may get synchronized without the need to transfer the data to the main memory.

Further, discussing memory transfers leads to tunnel vision. Java’s memory model also dictates which optimizations a JVM may perform and which it may not. E.g.

nonVolatileVar = null;
Thread.sleep(100_000);
if(nonVolatileVar == null) {
  // do something
}

Here, the compiler is entitled to remove the condition and execute the block unconditionally, as the preceding statement (ignoring the sleep) has written null and other threads’ activities are irrelevant for non-volatile variables, regardless of how much time has elapsed.

So when this optimization has been performed, it doesn’t matter how many threads write a new value to this variable and “flush to memory”. This code won’t notice.

So let’s consult the specification (JLS §17.3):

It is important to note that neither Thread.sleep nor Thread.yield have any synchronization semantics. In particular, the compiler does not have to flush writes cached in registers out to shared memory before a call to Thread.sleep or Thread.yield, nor does the compiler have to reload values cached in registers after a call to Thread.sleep or Thread.yield.

I think the answer to your question couldn’t be more explicit.

For completeness

I believe the Java Memory Model guarantees that all writes to a volatile variable synchronize with all subsequent reads from any thread, so that solves the problem.

All writes made prior to writing to a volatile variable will become visible to threads subsequently reading the same variable. So in your case, declaring keepGoing as volatile will fix the issue, as both threads consistently use it.
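The "all writes made prior" part can be sketched as follows: a plain field is written before a volatile flag, and a reader that sees the flag is then guaranteed to see the plain field as well. VolatilePublication and its members are hypothetical names chosen for this illustration.

```java
class VolatilePublication {
    private int payload;            // plain, non-volatile field
    private volatile boolean ready; // volatile flag that publishes the payload

    void publish(int value) {
        payload = value; // ordinary write, ordered before the volatile write below
        ready = true;    // volatile write
    }

    int awaitPayload() {
        while (!ready) {
            // spin until the volatile read observes true
        }
        return payload;  // guaranteed to see the value written before ready = true
    }

    // Hypothetical helper: publishes 42 from the calling thread and returns what the reader saw.
    static int demo() throws InterruptedException {
        VolatilePublication box = new VolatilePublication();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = box.awaitPayload());
        reader.start();
        box.publish(42);
        reader.join();   // join() makes the reader's write to seen[0] visible here
        return seen[0];
    }
}
```

The guarantee only holds because the reader actually reads the volatile flag. Reading payload without first observing ready == true would be a data race.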

If my understanding is correct the code generated by a synchronized block also flushes out all pending reads and writes, which serves as a kind of "memory barrier" and fixes the issue.

A thread leaving a synchronized block establishes a happens-before relationship to a thread subsequently entering a synchronized block using the same object. If using a synchronized block in one thread appears to solve the issue even though you’re not using a synchronized block in the other, you’re relying on side effects of a particular implementation, which are not guaranteed to continue to work.
