Setup
Using only standard C (= no platform-specific code), I wrote a program that does the following:
1. Get the starting clock()
2. Open a file
3. Write a ~250MB long string to it using one of the modes listed below
4. Close the file
5. Repeat 2...4 10000 times as fast as possible, rip storage unit
6. Get the ending clock()
7. Do some time calculations and output
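The measurement loop described above could be sketched roughly as follows. This is a minimal sketch, not the code from the repository: the function name time_writes, its parameters, and the error handling are assumptions made for illustration.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper (not from the original benchmark): writes `len` bytes
 * of `data` to `path` `reps` times and returns the average clock() time per
 * file in seconds, or -1.0 on error. */
double time_writes(const char *path, const char *data, size_t len, int reps)
{
    if (reps <= 0) return -1.0;

    clock_t start = clock();                      /* starting clock() */
    for (int i = 0; i < reps; i++) {
        FILE *f = fopen(path, "wb");              /* open the file */
        if (!f) return -1.0;
        size_t written = fwrite(data, 1, len, f); /* write the string */
        int close_failed = (fclose(f) != 0);      /* close the file */
        if (written != len || close_failed) return -1.0;
    }
    clock_t end = clock();                        /* ending clock() */

    /* Convert clock ticks to seconds, then average over all files. */
    return (double)(end - start) / CLOCKS_PER_SEC / reps;
}
```

Calling something like time_writes("out.tmp", buffer, buffer_len, 10000) would then give a seconds-per-file figure, with the file name and buffer being placeholders.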
A) bulk mode: Write everything at once (= one call to fwrite)
B) chunk mode: Write the string in chunks. One chunk is slightly more than 1MB (= multiple calls to fwrite, about 250).
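For illustration, the two modes might look like this. It is a sketch under assumptions: write_bulk and write_chunked are hypothetical names, and the chunk size is passed as a parameter rather than fixed at slightly more than 1MB.

```c
#include <stdio.h>

/* A) bulk mode: a single fwrite call for the whole buffer.
 * Returns 0 on success, -1 on failure. */
int write_bulk(const char *path, const char *data, size_t len)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(data, 1, len, f);
    int close_failed = (fclose(f) != 0);
    return (written == len && !close_failed) ? 0 : -1;
}

/* B) chunk mode: repeated fwrite calls of at most `chunk` bytes each.
 * Returns 0 on success, -1 on failure. */
int write_chunked(const char *path, const char *data, size_t len, size_t chunk)
{
    if (chunk == 0) return -1;
    FILE *f = fopen(path, "wb");
    if (!f) return -1;

    int ok = 1;
    size_t off = 0;
    while (off < len && ok) {
        /* The last chunk may be shorter than `chunk`. */
        size_t n = (len - off < chunk) ? len - off : chunk;
        ok = (fwrite(data + off, 1, n, f) == n);
        off += n;
    }
    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}
```

Both functions produce the same file contents; only the number of fwrite calls differs.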
Then, I let the program run on two different computers.
Expectation
I expect A) to be faster than B).
Results
Below was on my beefy PC with a Samsung 970 EVO M.2 SSD (CPU = AMD Ryzen 2700x: 8 cores / 16 threads). (The output on this one is slightly wrong: it should have been Ns/file, not Ns/write.)
Below was on my laptop. I don't really know what type of SSD is installed (and I can't be bothered to check). If it matters, or if anyone wants to look it up, the laptop is a Surface Book 3.
Conclusion
- Beefy PC: B) is faster than A), against expectations.
- Laptop: A) is faster than B), within expectations.
My best guess is that some sort of hidden parallelization is at work. Either the CPU does smart things, the SSD does very smart things, or they work together to do incredibly smart things. But anything more specific would be speculation too absurd to write down here.
What explains the difference between my expectation and the results?
The benchmark
Check out https://github.com/rphii/Rlib, under examples/writecomp.c
More Text
I noticed this effect while working on my beefy PC with a string of length ~25MB. Since B) was a marginal, but consistent, ~4ms faster than A), I increased the string length and did a more thorough test.
CodePudding user response:
Since no one's gonna do it, I'll answer my question based on the comment I got.
- clock() does not measure wall-clock time but CPU time. Please read this post.
- Reads/writes are generally buffered.
- Operating systems generally use an in-memory cache (especially for HDDs).
- SSD reads can be faster in parallel (and often are for recent ones), while HDDs are almost never faster in parallel. (This quite recent post provides some information about caching and buffering.)
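To see the first point in practice, CPU time and wall-clock time can be measured side by side. The following is a minimal sketch that stays within standard C by using C11's timespec_get. The names elapsed_times, measure, and write_small_file are hypothetical, and the workload is only a stand-in for the real benchmark. Note that what clock() reports can differ between implementations (Microsoft's C runtime, for example, documents it as elapsed wall-clock time), so the gap between the two numbers may depend on the platform.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical result type holding both measurements in seconds. */
typedef struct {
    double cpu_seconds;   /* from clock() */
    double wall_seconds;  /* from timespec_get(), a calendar clock */
} elapsed_times;

/* Hypothetical helper: runs work(arg) once and measures it both ways. */
elapsed_times measure(void (*work)(void *), void *arg)
{
    struct timespec w0, w1;
    timespec_get(&w0, TIME_UTC);
    clock_t c0 = clock();

    work(arg);

    clock_t c1 = clock();
    timespec_get(&w1, TIME_UTC);

    elapsed_times t;
    t.cpu_seconds = (double)(c1 - c0) / CLOCKS_PER_SEC;
    t.wall_seconds = (double)(w1.tv_sec - w0.tv_sec)
                   + (double)(w1.tv_nsec - w0.tv_nsec) / 1e9;
    return t;
}

/* Stand-in workload: writes a small file; `arg` is the file path. */
void write_small_file(void *arg)
{
    const char *path = arg;
    FILE *f = fopen(path, "wb");
    if (!f) return;
    fputs("some data\n", f);
    fclose(f);
}
```

If the process spends most of its time waiting on the storage device, wall_seconds would be expected to exceed cpu_seconds on implementations where clock() counts only processor time. TIME_UTC is not a monotonic clock, so a system time adjustment during the run can distort the wall-clock figure.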