Does an asynchronous operation reliably report success?

Time:11-10

If I submit an asynchronous I/O operation, with overlapped I/O on Windows or with SIGIO handling under Linux, has the data actually been physically written to disk when the API reports success? Or is it possible that the API only reports that the I/O request was successfully submitted to the disk?
It would be nice to have something like lazy flushing to disk, i.e. you wouldn't have to call fsync(), since you only need a response for part of the I/Os you've submitted.

CodePudding user response:

As the Microsoft documentation pages "Synchronous and Asynchronous I/O" and "File Caching" say:

This implies that read operations read file data from an area in system memory known as the system file cache, rather than from the physical disk. Correspondingly, write operations write file data to the system file cache rather than to the disk, and this type of cache is referred to as a write-back cache. Caching is managed per file object.

A thread performing asynchronous file I/O sends an I/O request to the kernel by calling an appropriate function.

Do not deallocate or modify the OVERLAPPED structure or the data buffer until all asynchronous I/O operations to the file object have been completed.

By default, Windows caches file data that is read from disks and written to disks, with these exceptions:

  1. When large blocks of file data are read and written, it is more likely that disk reads and writes will be necessary to finish the I/O operation. In these situations, caching can be turned off. This is done at the time the file is opened by passing FILE_FLAG_NO_BUFFERING as a value for the dwFlagsAndAttributes parameter of CreateFile. When caching is disabled, all read and write operations directly access the physical disk. However, the file metadata may still be cached.
  2. Some applications, such as virus-checking software, require that their write operations be flushed to disk immediately; Windows provides this ability through write-through caching. A process enables write-through caching for a specific I/O operation by passing the FILE_FLAG_WRITE_THROUGH flag into its call to CreateFile, as @RbMm said. With write-through caching enabled, data is still written into the cache, but the cache manager writes the data immediately to disk rather than incurring a delay by using the lazy writer.

In any case, a process can also force a flush of a file it has opened by calling the FlushFileBuffers function.
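
Since the question also mentions Linux and fsync(), here is a minimal POSIX sketch of the same "write, then force a flush" idea. The helper name writeAndSync is hypothetical and not from any library. It assumes a POSIX system, where a successful write() typically only means the data reached the kernel's page cache, and fsync() is what asks the kernel to push it to the device.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (illustration only): writes `size` bytes to `path`
// and returns true only if every write() and the final fsync() succeeded.
bool writeAndSync( char const *path, void const *data, std::size_t size )
{
    int fd = open( path, O_WRONLY | O_CREAT | O_TRUNC, 0644 );
    if( fd < 0 )
        return false;
    char const *p = static_cast<char const *>( data );
    std::size_t remaining = size;
    bool ok = true;
    while( ok && remaining > 0 )
    {
        // write() may transfer fewer bytes than requested, so loop
        ssize_t n = write( fd, p, remaining );
        if( n <= 0 )
            ok = false; // treat any error (including EINTR) as failure in this sketch
        else
        {
            p += n;
            remaining -= static_cast<std::size_t>( n );
        }
    }
    // up to here the data may still live only in the page cache;
    // fsync() requests that it is flushed to the storage device
    if( ok && fsync( fd ) != 0 )
        ok = false;
    if( close( fd ) != 0 )
        ok = false;
    return ok;
}
```

A call like writeAndSync( "hello.bin", buf, len ) then plays roughly the role of WriteFile followed by FlushFileBuffers above. Whether the drive's own write cache is flushed as well still depends on the file system and the device.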

CodePudding user response:

I made a little C++20 experiment with different combinations of FILE_FLAG_OVERLAPPED and FILE_FLAG_WRITE_THROUGH on Windows:

#include <Windows.h>
#include <iostream>
#include <system_error>
#include <chrono>
#include <memory>
#include <vector>
#include <cmath>

using namespace std;
using namespace chrono;

using XHANDLE = unique_ptr<void, decltype([]( void *h ) { h && h != INVALID_HANDLE_VALUE && CloseHandle( h ); })>;

int main( int argc, char **argv )
{
    char const *fileName = argc < 2 ? "hello.bin" : argv[1];
    try
    {
        for( int overlapped = 0; overlapped <= 1; ++overlapped )
            for( int writeThrough = 0; writeThrough <= 1; ++writeThrough )
            {
                auto throwSysErr = []( char const *what ) { throw system_error( (int)GetLastError(), system_category(), what ); };
                OVERLAPPED ol = {}; // zero-initialize: unused members must be zero before the structure is used
                XHANDLE xhOlEvt;
                if( overlapped )
                {
                    ol.Offset = 0;
                    ol.OffsetHigh = 0;
                    if( !(xhOlEvt = XHANDLE( CreateEvent( nullptr, FALSE, FALSE, nullptr ) )) )
                        throwSysErr( "event creation failed" );
                    ol.hEvent = xhOlEvt.get();
                }
                DWORD dwAttrs = (overlapped ? FILE_FLAG_OVERLAPPED : 0) | (writeThrough ? FILE_FLAG_WRITE_THROUGH : 0);
                XHANDLE xhFile( CreateFileA( fileName, GENERIC_WRITE, 0, nullptr, CREATE_ALWAYS, dwAttrs, NULL));
                if( xhFile.get() == INVALID_HANDLE_VALUE)
                    throwSysErr( "file creation failed" );
                auto ms = []<typename Fn>( Fn fn ) -> double
                {
                    auto start = high_resolution_clock::now();
                    fn();
                    // convert to milliseconds independently of the clock's tick period
                    return duration<double, milli>( high_resolution_clock::now() - start ).count();
                };
                vector<char> data( (size_t)1 << 24, '\0' );
                double
                    msSubmit = ms( 
                        [&]()
                        {
                            DWORD dwWritten;
                            if( !WriteFile( xhFile.get(), data.data(), (DWORD)data.size(), &dwWritten, overlapped ? &ol : nullptr )
                                && (!overlapped || GetLastError() != ERROR_IO_PENDING)
                                || !overlapped && dwWritten != data.size() )
                                throwSysErr( "write failed" );
                        } ),
                    msWait = overlapped ? ms( [&]() { while( WaitForSingleObject( xhOlEvt.get(), INFINITE) != WAIT_OBJECT_0 ); } ) : 0.0,
                    msFlush = ms(
                        [&]()
                        {
                            if( !FlushFileBuffers( xhFile.get() ) )
                                throwSysErr("flushing failed");
                        } ),
                    msSum = msSubmit + msWait + msFlush;
                xhFile.reset();
                cout << (!overlapped ? "synchronous, " : "overlapped, ") << (!writeThrough ? "write-cached:" : "write-through:") << endl;
                auto print = [&]( char const *header, double ms )
                {
                    cout << header << ms << "ms";
                    if( ms != msSum )
                        cout << " (" << trunc( 10'000.0 * ms / msSum + 0.5 ) / 100.0 << "%)";
                    cout << endl;
                };
                print( "\tsumbit: ", msSubmit );
                if( msWait )
                    print( "\twait: ", msWait );
                print( "\tflush: ", msFlush );
                DeleteFileA( fileName );
            };
    }
    catch( system_error const &se )
    {
        cout << se.what() << endl;
    }
}

For my PCIe 3.0 SSD this prints:

synchronous, write-cached:
        sumbit: 7.2328ms (46.46%)
        flush: 8.3358ms (53.54%)
synchronous, write-through:
        sumbit: 14.0534ms (97.67%)
        flush: 0.3347ms (2.33%)
overlapped, write-cached:
        sumbit: 7.4661ms (48.6%)
        wait: 0.002ms (0.01%)
        flush: 7.8941ms (51.39%)
overlapped, write-through:
        sumbit: 0.4835ms (3.28%)
        wait: 13.9273ms (94.46%)
        flush: 0.3341ms (2.27%)
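
The percentages in these listings are each phase's share of the total time, rounded to two decimals the same way the program does it. As a small sketch (the helper name percentOf is made up for illustration):

```cpp
#include <cmath>

// Hypothetical helper mirroring the rounding used in the program above:
// share of `part` in `total` as a percentage, rounded to two decimals.
double percentOf( double part, double total )
{
    return std::trunc( 10000.0 * part / total + 0.5 ) / 100.0;
}
```

For the first SSD block, percentOf( 7.2328, 7.2328 + 8.3358 ) gives 46.46 and percentOf( 8.3358, 7.2328 + 8.3358 ) gives 53.54, matching the printed values.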

For my SATA-HDD this prints:

synchronous, write-cached:
        sumbit: 7.1667ms (4.64%)
        flush: 147.19ms (95.36%)
synchronous, write-through:
        sumbit: 76.1362ms (52.68%)
        flush: 68.378ms (47.32%)
overlapped, write-cached:
        sumbit: 9.7964ms (5.13%)
        wait: 0.0029ms (0%)
        flush: 181.184ms (94.87%)
overlapped, write-through:
        sumbit: 0.6646ms (0.46%)
        wait: 73.5184ms (51.26%)
        flush: 69.2524ms (48.28%)

What puzzles me is that with write-through on my HDD, the flush operation still takes a long time. I'd have expected either the write operation itself (with synchronous I/O) or the wait operation (with overlapped I/O) to fully complete the operation when write-through is enabled. It looks to me as if write-through in that sense only means that the operation is submitted to the drive, and that success isn't reported at the point I expected.

And maybe this is interesting for some people here: this is what a 64 GB USB stick reports:

synchronous, write-cached:
        sumbit: 655.56ms (98.05%)
        flush: 13.0484ms (1.95%)
synchronous, write-through:
        sumbit: 479.531ms (97.97%)
        flush: 9.9214ms (2.03%)
overlapped, write-cached:
        sumbit: 505.589ms (97.93%)
        wait: 0.0029ms (0%)
        flush: 10.7087ms (2.07%)
overlapped, write-through:
        sumbit: 483.08ms (98.23%)
        wait: 0.0014ms (0%)
        flush: 8.6801ms (1.77%)

I think this is Windows behaviour: writes to USB sticks via WriteFile() are not delayed.

This is what a USB HDD reports:

synchronous, write-cached:
        sumbit: 114.604ms (56.23%)
        flush: 89.2258ms (43.77%)
synchronous, write-through:
        sumbit: 126.459ms (60.5%)
        flush: 82.5516ms (39.5%)
overlapped, write-cached:
        sumbit: 119.667ms (57.51%)
        wait: 0.0021ms (0%)
        flush: 88.4202ms (42.49%)
overlapped, write-through:
        sumbit: 0.7545ms (0.34%)
        wait: 123.911ms (56.43%)
        flush: 94.9324ms (43.23%)