Using Java volatile keyword in non-thread scenario


I understand that the Java keyword volatile is used in multi-threading contexts; the main purpose is to read from memory rather than from the cache, or, if the value is read from the cache, to ensure it is updated first.

In the example below, there is no multi-threading involved. I want to understand whether the variable i would be cached as part of code optimization and hence read from the CPU cache rather than from memory. If so, and the variable is declared volatile, will it certainly be read from memory?

I have run the program multiple times, both with and without the volatile keyword; but since the for loop does not take a constant amount of time, I was unable to conclude whether more time is consumed when the variable is declared volatile.

All I want to see is that a read from the CPU cache actually takes less time than a read of the variable when it is declared volatile.

Is my understanding even right? If so, how can I see the concept in action, with a good record of the times for both CPU-cache reads and memory reads?

import java.time.Duration;
import java.time.Instant;

public class Test {
    
    volatile static int i=0;
//  static int i=0;

    public static void main(String[] args) {
        Instant start = Instant.now();

        for (i = 0; i < 8_388_608; i++) { // 2 power 23; ~ 1 MB CPU cache
            System.out.println("i: " + i);
        }
        Instant end = Instant.now();

        long timeElapsed = Duration.between(start, end).getSeconds();
        System.out.println("timeElapsed: " + timeElapsed + " seconds.");

    }
}

CodePudding user response:

I think that the answer is "probably yes" ... for current Java implementations.

There are two reasons that we can't be sure.

  1. The Java Language Specification doesn't actually say anything about registers, CPU caches or anything like that. What it actually says is that there is a happens-before relationship between one thread writing the volatile variable and another thread (subsequently) reading it.

  2. While it is reasonable to assume that volatile will affect caching when there are multiple threads, if the JIT compiler were able to deduce that the volatile variable is thread-confined for a given execution of your application, it could reason that it is safe to cache the variable.


That is the theory.
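To make the happens-before guarantee from point 1 concrete, here is a minimal sketch of my own (not from the question): the classic stop-flag pattern, where a volatile write in one thread is guaranteed to become visible to another thread's read. The class and field names are mine.

```java
// Sketch: volatile as a stop flag shared between two threads.
public class StopFlag {
    static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            long iterations = 0;
            while (running) {   // volatile read: guaranteed to observe the writer's update
                iterations++;
            }
            System.out.println("stopped after " + iterations + " iterations");
        });
        worker.start();
        Thread.sleep(100);      // let the worker spin briefly
        running = false;        // volatile write: happens-before the worker's subsequent read
        worker.join();
        System.out.println("done");
    }
}
```

Without volatile on running, the JIT would be free to hoist the read out of the worker's loop, and the program might never terminate.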

If there was a measurable performance difference, you would be able to measure it in a properly written benchmark. (Though you may get different results depending on the Java version and your hardware.)

However, the current version of your benchmark has a number of flaws that would make any results it gives doubtful. If you want to get meaningful results, I strongly recommend that you read the following Q&A.

(Unfortunately some of the links in some of the answers seem to be broken ...)
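As a rough, self-contained illustration (my own sketch, not a proper benchmark), you can at least keep the I/O out of the timed loops and compare a plain field against a volatile one. The usual caveats still apply: the JIT may simplify or eliminate the plain loop entirely, which is exactly the kind of pitfall a harness like JMH exists to avoid, so treat the numbers as indicative only. The class and method names are hypothetical.

```java
// Rough timing comparison: many increments of a plain field vs. a volatile field,
// with no I/O inside the timed loops. Results vary by JVM version and hardware.
public class VolatileTiming {
    static volatile int volatileCounter;
    static int plainCounter;

    static long timePlain(int n) {
        long t0 = System.nanoTime();
        for (int j = 0; j < n; j++) plainCounter++;   // may be optimized heavily
        return System.nanoTime() - t0;
    }

    static long timeVolatile(int n) {
        long t0 = System.nanoTime();
        for (int j = 0; j < n; j++) volatileCounter++; // loads/stores cannot be optimized out
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        int n = 8_388_608; // 2^23, the same bound as the original loop
        // Warm-up pass so the JIT compiles both loops before measuring.
        timePlain(n);
        timeVolatile(n);
        System.out.println("plain:    " + timePlain(n) + " ns");
        System.out.println("volatile: " + timeVolatile(n) + " ns");
    }
}
```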

CodePudding user response:

The premise of your benchmark is flawed; the cache is the source of truth. The cache coherence protocol makes sure that the CPU caches stay coherent; main memory is just a spill bucket for whatever doesn't fit into the cache, since most caches are write-behind (not write-through). In other words, a volatile write doesn't need to be written to main memory; it is sufficient to write it to the cache.

A few examples where writing to the cache isn't desired:

  • I/O with DMA: you want to prevent writing to the cache because otherwise main memory and the CPU cache could become incoherent.
  • Non-temporal data: e.g. you are processing a huge data set and only access it once, so there is no point in caching it.

But this is outside of the reach of a regular Java volatile.

There is a price to pay using volatile:

  1. Atomicity guarantees.
  2. Loads and stores can't be optimized out. This prevents many compiler optimizations.
  3. Ordering guarantees in the form of fences. On x86, in the case above, the price is paid at the volatile store. It depends on how this code is compiled: the increment becomes either a CAS or a separate read and write. But in both cases the volatile store needs to wait for the entry in the store buffer to be committed to the cache.

Especially the last two will impact the performance of volatile.
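On the atomicity point: a plain i++ on a volatile field is still two separate accesses (a load followed by a store), so concurrent increments can lose updates; that is why the CAS form mentioned above is what you need for a correct concurrent counter. A small sketch of my own (class and field names are hypothetical):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Demonstrates that volatile gives visibility, not atomicity: two threads
// incrementing a volatile int can lose updates, while AtomicInteger's CAS loop
// always arrives at the exact total.
public class LostUpdates {
    static volatile int volatileCounter = 0;
    static final AtomicInteger atomicCounter = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int j = 0; j < 1_000_000; j++) {
                volatileCounter++;               // not atomic: volatile load, add, volatile store
                atomicCounter.incrementAndGet(); // atomic compare-and-swap loop
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println("volatile counter: " + volatileCounter);    // often less than 2000000
        System.out.println("atomic counter:   " + atomicCounter.get()); // always 2000000
    }
}
```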

Apart from that, your benchmark is flawed. First of all, I would switch to JMH, as others have already pointed out; it takes care of quite a few typical benchmarking errors, such as missing warmup and dead-code elimination. Also, you should not use System.out inside the benchmark, since it will completely dominate the performance.
