I was wondering if the Path.write_text(data) function from pathlib was atomic or not.
If not, are there scenarios where we could end up with a file created in the filesystem but not containing the intended content?
To be more specific, as the comment from @ShadowRanger suggested, what I care about is whether the file contains either the original data or the new data, but never something in between. That is actually less than full atomicity.
Answer:
On the specific case of the file containing the original data or the new data, and nothing in between:
No, it does not do any tricks with opening a temp file in the same directory, populating it, and finishing with an atomic rename to replace the original file. The current implementation is guaranteed to be at least two separate operations:
1. Opening the file in write mode (which implicitly truncates it), and
2. Writing out the provided data (which may take multiple system calls depending on the size of the data, OS API limitations, and interference by signals that might interrupt the write part-way and require the remainder to be written in a separate system call).
If nothing else, your code could die after step 1 and before step 2 (a badly timed Ctrl-C or power loss), and the original data would be gone, and no new data would be written.
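If you do need the "old data or new data, nothing in between" guarantee, the temp-file-plus-rename trick mentioned above can be sketched roughly as follows. This is a minimal sketch, not something pathlib provides: atomic_write_text is a hypothetical helper name, and it assumes the temp file and the target live on the same filesystem and that os.replace is atomic on your platform (it is on POSIX).

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path, data, encoding="utf-8"):
    """Hypothetical helper: replace path's content with data in one rename."""
    path = Path(path)
    # Create the temp file next to the target so the final rename
    # never has to cross a filesystem boundary.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # Swap the fully written temp file into place, overwriting the target.
        os.replace(tmp_name, path)
    except BaseException:
        # Something failed: drop the temp file; the original is untouched.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

If the process dies before os.replace runs, the original file still holds the old data and at worst a stray .tmp file is left behind. Note that mkstemp creates the file with restrictive permissions, so the replaced file may not keep the original's mode or ownership unless you copy them over yourself.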
Old answer in terms of general atomicity:
The question is kinda nonsensical on its face. It doesn't really matter if it's atomic; even if it was atomic, a nanosecond after the write occurs, some other process could open the file, truncate it, rewrite it, move it, etc. Heck, in between write_text opening the file and when it writes the data, some other process could swoop in and move/rename the newly opened file or delete it; the open handle write_text holds would still work when it writes a nanosecond later, but the data would never be seen in a file at the provided path (and might disappear the instant write_text closes it, if some other process swooped in and deleted it).
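To make the rename scenario concrete, here is a small single-process sketch that plays both roles. It assumes POSIX semantics (on Windows, renaming a file that is still open normally fails instead), and write_after_rename, target.txt, and moved.txt are made-up names for illustration.

```python
import os
from pathlib import Path


def write_after_rename(directory):
    """Show that an open handle follows the file, not the path (POSIX)."""
    target = Path(directory) / "target.txt"
    moved = Path(directory) / "moved.txt"
    with open(target, "w") as handle:  # roughly the first thing write_text does
        os.rename(target, moved)       # stand-in for the "other process"
        handle.write("hello")          # still succeeds: the handle is valid
    # The data was written, just not to a file at the original path.
    return target.exists(), moved.read_text()
```

Called on an empty scratch directory on Linux, this returns (False, "hello"): the write went through, but nothing exists at the path you originally asked for.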
Beyond that, it can't be atomic even while writing, in any portable sense. Two processes could have the same file open at once, and their writes can interleave (there are locks around the standard handles within a process to prevent this, but no such locks exist to coordinate with an arbitrary other process). Concurrent file I/O is hard; avoid it if at all possible.