I'm trying to figure out why a benchmark of Java's BigInteger multiplication is consistently 3x faster than the same benchmark run against a copy of the BigInteger.java source code placed in my project. I'm using jmh to run the benchmark. Here is an example output; note that addition runs at about the same speed in both cases.
Benchmark Mode Cnt Score Error Units
BenchmarkTest.javaBigInteger_add thrpt 5 856062.338 ± 34040.923 ops/s
BenchmarkTest.sourceBigInteger_add thrpt 5 842421.746 ± 39630.112 ops/s
BenchmarkTest.javaBigInteger_multiply thrpt 5 525649.635 ± 15271.083 ops/s
BenchmarkTest.sourceBigInteger_multiply thrpt 5 133944.766 ± 1832.857 ops/s
The reason I am doing this is that, while attempting to port part of this to Kotlin, I noticed the benchmarks were a bit slower. To see whether that was related to Kotlin, I took Kotlin out of the picture and did everything in pure Java, with exactly the same results. Why would this benchmark vary so much if the source code and algorithms are exactly the same?
Project with this code: https://github.com/asheragy/BigInteger-Benchmark
CodePudding user response:
Some, maybe even most, JVMs have a number of intrinsic functions that are used instead of the Java code for various computationally intensive operations. More details can be found in the answers to this question. One of these intrinsics, multiplyToLen, deals specifically with BigInteger multiplication.
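To make it concrete what kind of Java loop such an intrinsic stands in for, here is a minimal sketch of schoolbook multiplication on big-endian int magnitudes. It is written in the spirit of BigInteger's multiplication routine but is not the JDK source; the class name SchoolbookMultiply and the helper toBigInteger are made up for this illustration.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;

// Hypothetical illustration class, not part of the JDK or the linked project.
class SchoolbookMultiply {
    private static final long LONG_MASK = 0xffffffffL;

    // Multiplies two unsigned magnitudes stored most-significant word first.
    // The result always has x.length + y.length words (possibly with leading zeros).
    static int[] multiply(int[] x, int[] y) {
        int[] z = new int[x.length + y.length];
        for (int i = x.length - 1; i >= 0; i--) {
            long carry = 0;
            long xi = x[i] & LONG_MASK;
            // Multiply one word of x by every word of y and accumulate into z.
            for (int j = y.length - 1, k = i + j + 1; j >= 0; j--, k--) {
                long product = (y[j] & LONG_MASK) * xi + (z[k] & LONG_MASK) + carry;
                z[k] = (int) product;    // low 32 bits stay in place
                carry = product >>> 32;  // high 32 bits carry into the next word
            }
            z[i] = (int) carry;
        }
        return z;
    }

    // Converts a big-endian int magnitude to a non-negative BigInteger for checking.
    static BigInteger toBigInteger(int[] mag) {
        ByteBuffer buf = ByteBuffer.allocate(mag.length * 4); // big-endian by default
        for (int word : mag) {
            buf.putInt(word);
        }
        return new BigInteger(1, buf.array());
    }

    public static void main(String[] args) {
        int[] a = {0xffffffff, 0xffffffff};
        int[] b = {0x12345678, 0x9abcdef0};
        BigInteger expected = toBigInteger(a).multiply(toBigInteger(b));
        System.out.println("matches BigInteger: "
                + toBigInteger(multiply(a, b)).equals(expected));
    }
}
```

When a loop like this lives in java.math.BigInteger, HotSpot can replace it with hand-tuned machine code. The same loop in a copied class under another package only gets ordinary JIT compilation.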
After disabling the usage of this intrinsic function by changing the jmh configuration in the build.gradle file as follows:
jmh {
warmupIterations = 2
iterations = 5
fork = 1
jvmArgsPrepend = ['-XX:+UnlockDiagnosticVMOptions', '-XX:-UseMultiplyToLenIntrinsic']
}
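When changing JVM flags like this, it can be worth confirming that the forked benchmark JVM actually received them. The following is a minimal sketch using the standard RuntimeMXBean. The class name JvmFlagCheck and the helper hasFlag are invented for this example, and calling it from the benchmark's setup code is an assumption about how you would wire it in. It only reports what was passed on the command line, not whether the JIT really skipped the intrinsic.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class, not part of jmh or the linked project.
class JvmFlagCheck {
    // Returns true if the exact flag string appears among the given JVM arguments.
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to the running JVM, excluding the arguments to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("multiplyToLen intrinsic disabled by flag: "
                + hasFlag(jvmArgs, "-XX:-UseMultiplyToLenIntrinsic"));
    }
}
```

Running it with java -XX:+UnlockDiagnosticVMOptions -XX:-UseMultiplyToLenIntrinsic should list both options in the printed arguments.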
I'm getting the following multiplication benchmark results on OpenJDK 11 (JDK 11.0.11, OpenJDK 64-Bit Server VM, 11.0.11+8-jvmci-21.1-b05) on an x86-64 architecture:
Benchmark Mode Cnt Score Error Units
BenchmarkTest.javaBigInteger thrpt 5 107476.070 ± 8059.020 ops/s
BenchmarkTest.sourceBigInteger thrpt 5 108011.737 ± 7105.221 ops/s
This almost completely closes the gap between the two implementations.
There might be other configuration options that play a smaller role, but I think in general the answer will be that there are various compiler/runtime optimizations in place for standard JDK classes that are not applied to custom implementations.