I was looking into an issue I had with Mender, where the installation progress (which is about copying a file on a block device) is not reported correctly.
My feeling is that it's about the kernel page cache: the progress bar says "100%" when the code has read the whole image, but that does not mean that the kernel is done writing it.
More specifically, Mender calls n, err := io.Copy(dev, image), which returns after the kernel is done writing. But the progress bar is linked to the "image" Reader, which is fully read tens of seconds before io.Copy returns.
Because the file is opened with flags here, I naively thought that I just had to set flag |= O_SYNC, so that io.Copy(dev, image) would not read image faster than it writes to dev.
But setting O_SYNC does not make a difference.
It is not clear to me if O_SYNC is merely a hint (so I cannot count on it), if it could be that I am missing something on my device (say, I forgot a kernel option on my Raspberry Pi and therefore O_SYNC is useless), or if I just misunderstood what O_SYNC does?
EDIT: I also tried to set O_SYNC | O_DIRECT (though O_DIRECT is apparently not exposed in Go and so I did O_SYNC | 0o40000), but I got the following error when opening the block device:
Opening device: /dev/mmcblk0p2 for writing with flag: 1069058
Failed to open the device: "/dev/mmcblk0p2": open /dev/mmcblk0p2: not a directory
Answer:
Summarizing the comments:
The main issue is that the progress bar is decorating the reader (as Yotam Salmon noted), not the writer; the delay is on the side of the writer.
On most Linux systems, O_DIRECT is indeed 0o40000, but on ARM (including Raspberry Pi) it is 0o200000, with 0o40000 being O_DIRECTORY. This explains the "not a directory" error.
O_SYNC is in fact the bit you want, or you can simply issue an fsync system call (use Flush if appropriate, and then Sync, as noted in When to flush a file in Go?). The O_SYNC bit implies an fsync system call as part of each write system call.
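The flag value in the error log can be checked by hand. The sketch below decodes it using the Linux ARM values quoted above, plus O_RDWR = 0o2 and O_SYNC = 0o4010000, which are the usual Linux values and are written out here as assumptions rather than taken from the system headers. The constant names and describeFlags are hypothetical helpers for illustration only.

```go
package main

import "fmt"

// Linux open(2) flag values as seen on 32-bit ARM, written out by hand
// (assumed values; check your platform's headers before relying on them).
const (
	oAccMode   = 0o3       // mask for the access mode bits
	oWRONLY    = 0o1       // O_WRONLY
	oRDWR      = 0o2       // O_RDWR
	oDirectory = 0o40000   // O_DIRECTORY on ARM (this value is O_DIRECT on x86)
	oDirectARM = 0o200000  // O_DIRECT on ARM
	oSync      = 0o4010000 // O_SYNC (includes the O_DSYNC bit)
)

// describeFlags lists the flag names that are set in an open(2) flag word,
// interpreting the bits with the ARM values above.
func describeFlags(flag int) []string {
	var names []string
	switch flag & oAccMode {
	case oWRONLY:
		names = append(names, "O_WRONLY")
	case oRDWR:
		names = append(names, "O_RDWR")
	default:
		names = append(names, "O_RDONLY")
	}
	if flag&oDirectory != 0 {
		names = append(names, "O_DIRECTORY")
	}
	if flag&oDirectARM != 0 {
		names = append(names, "O_DIRECT")
	}
	if flag&oSync == oSync {
		names = append(names, "O_SYNC")
	}
	return names
}

func main() {
	// 1069058 is the flag value printed in the log from the question.
	fmt.Printf("%#o -> %v\n", 1069058, describeFlags(1069058))
}
```

On ARM the 0o40000 bit decodes as O_DIRECTORY rather than O_DIRECT, which matches the "not a directory" failure when opening a block device.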
Fully synchronous I/O is a bit of a minefield: some devices lie about whether they've written data to nonvolatile storage. However, O_SYNC or fsync is the strongest guarantee you'll get here. O_DIRECT is likely irrelevant since you're going directly to a device partition /dev file. O_SYNC or fsync may be passed through to the device driver, which may do something with it, which may get the device to write to nonvolatile storage. There's more about this in What does O_DIRECT really mean?