There's a producer which will download new files to a local file system (it will never modify existing files), and a consumer which periodically checks if there are new files and reads the new files.
I'm a bit confused about why the producer needs to lock the directory before creating a new file, and why the consumer needs to lock the directory while it reads a new file. Since existing files are never modified, there should be no risk of a writer editing a file while the reader is reading it.
Thanks
CodePudding user response:
Producing a file is not atomic. First, an empty file is created. Then its content is written to it, possibly over multiple writes. Each write modifies the file, so your claim that files are never modified is false: a new file keeps changing until the producer has finished writing it.
A lock is being used to ensure that the consumer only picks up complete files.[1]
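As a rough illustration of the locking approach, here is a minimal sketch assuming a POSIX system and advisory locks taken with Python's fcntl.flock on a lock file inside the directory. The lock file name (.lock) and the function names are hypothetical, and giving the consumer a shared lock is a choice made for this sketch; the actual setup in the question may differ.

```python
import fcntl
import os
from contextlib import contextmanager

LOCK_NAME = ".lock"  # hypothetical lock file name, not from the question


@contextmanager
def directory_lock(directory, exclusive):
    """Hold an advisory flock on a lock file inside `directory`."""
    fd = os.open(os.path.join(directory, LOCK_NAME), os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # Blocks until the lock can be granted.
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)


def produce(directory, name, chunks):
    """Write a new file, possibly over several writes, while holding the lock."""
    with directory_lock(directory, exclusive=True):
        with open(os.path.join(directory, name), "wb") as f:
            for chunk in chunks:
                f.write(chunk)


def consume(directory, seen):
    """Read files not seen before; the lock keeps out half-written files."""
    new_files = {}
    with directory_lock(directory, exclusive=False):
        for name in sorted(os.listdir(directory)):
            if name == LOCK_NAME or name in seen:
                continue
            with open(os.path.join(directory, name), "rb") as f:
                new_files[name] = f.read()
            seen.add(name)
    return new_files
```

Because flock is advisory, this only works if both the producer and the consumer take the lock. The kernel also drops the lock if the producer process dies, so a crash in the middle of produce can leave a partial file that the consumer then reads.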
There are alternatives to locking that could be used. For example, the producer could create files suffixed with .tmp, then rename (not copy) each file to its correct name once it is complete. If the consumer ignores files ending in .tmp, then it will only pick up complete files.[2]
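The rename approach could look roughly like this in Python. This is a sketch under a few assumptions: the temporary file lives in the same directory (and therefore the same file system) as the final name, so that os.replace is an atomic rename on POSIX systems. The function names are made up for illustration.

```python
import os

TMP_SUFFIX = ".tmp"


def produce(directory, name, chunks):
    """Write to NAME.tmp first, then rename to NAME once everything is written."""
    final_path = os.path.join(directory, name)
    tmp_path = final_path + TMP_SUFFIX
    with open(tmp_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
        # Push the data to disk before the file becomes visible under its final name.
        f.flush()
        os.fsync(f.fileno())
    # Rename, not copy: the final name appears only after the content is complete.
    os.replace(tmp_path, final_path)
    return final_path


def list_complete(directory):
    """Return the names the consumer may read, skipping in-progress .tmp files."""
    return sorted(n for n in os.listdir(directory) if not n.endswith(TMP_SUFFIX))
```

If the producer crashes midway, it leaves behind a .tmp file that the consumer never picks up, rather than a partial file under the final name.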
[1] Complete in the sense that the producer will never add to it again. But that doesn't mean the file contains all that it should. If the producer crashes, and if that automatically releases the lock, the consumer may receive only a part of the intended contents.
[2] Truly complete. Not only will the producer never add to it again, but everything the producer intended to write to the file was written to it. (Short of a power failure at exactly the wrong moment or a similarly drastic event.)