Some Background
Originally, Rust switched from readdir(3) to readdir_r(3) for thread safety. But readdir_r(3) has problems of its own, so the standard library later switched back:
- Linux and Android: fs: Use readdir() instead of readdir_r() on Linux and Android
- Fuchsia: Switch Fuchsia to readdir (instead of readdir_r)
- ...
So, in the current implementation, they use readdir(3) on most POSIX platforms:
    #[cfg(any(
        target_os = "android",
        target_os = "linux",
        target_os = "solaris",
        target_os = "fuchsia",
        target_os = "redox",
        target_os = "illumos"
    ))]
    fn next(&mut self) -> Option<io::Result<DirEntry>> {
        unsafe {
            loop {
                // As of POSIX.1-2017, readdir() is not required to be thread safe; only
                // readdir_r() is. However, readdir_r() cannot correctly handle platforms
                // with unlimited or variable NAME_MAX. Many modern platforms guarantee
                // thread safety for readdir() as long an individual DIR* is not accessed
                // concurrently, which is sufficient for Rust.
                super::os::set_errno(0);
                let entry_ptr = readdir64(self.inner.dirp.0);
                // ... (rest of the function omitted)
Thread-safety issue of readdir(3)
The problem with readdir(3) is that its return value (struct dirent *) points into the internal buffer of the directory stream (DIR), so it can be overwritten by a subsequent readdir(3) call on the same stream. If a DIR stream is shared among multiple threads that all call readdir(3) on it, a race condition may occur.
If we want to handle this safely, external synchronization is needed.
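For comparison, here is a rough sketch of what such external synchronization looks like when one directory iterator is deliberately shared among threads in Rust. The Mutex plays the role of the lock a C program would need around readdir(3). The helper name count_entries_shared is made up for illustration and is not part of the standard library.

```rust
use std::fs::{self, ReadDir};
use std::io;
use std::path::Path;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: `workers` threads pull entries from ONE shared
/// directory iterator. Returns the total number of entries they saw.
fn count_entries_shared(dir: &Path, workers: usize) -> io::Result<usize> {
    // The Mutex is the "external synchronization": only the thread holding
    // the lock can obtain the `&mut ReadDir` that `next()` requires.
    let shared: Arc<Mutex<ReadDir>> = Arc::new(Mutex::new(fs::read_dir(dir)?));

    let mut handles = Vec::new();
    for _ in 0..workers {
        let shared = Arc::clone(&shared);
        handles.push(thread::spawn(move || {
            let mut seen = 0usize;
            loop {
                // The guard is a temporary, so the lock is released at the
                // end of this statement, after exactly one `next()` call.
                let next = shared.lock().unwrap().next();
                match next {
                    Some(Ok(_entry)) => seen += 1,
                    // Stop this worker on end of stream or on an I/O error.
                    _ => break,
                }
            }
            seen
        }));
    }

    let total: usize = handles.into_iter().map(|h| h.join().unwrap()).sum();
    Ok(total)
}
```

Calling count_entries_shared(Path::new("."), 4) would spread the entries of the current directory over four threads. Each entry is handed to exactly one of them, because every next() call happens under the lock.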
My question
I was curious what Rust does to avoid this issue. It seems that it just calls readdir(3), memcpys the returned entry into a caller-allocated buffer, and then returns. But this function is not marked as unsafe, which confuses me.
So my question is: why is it safe to call fs::read_dir() in multi-threaded programs?
There is a comment stating that it is safe to use in Rust without extra external synchronization, but I don't follow the reasoning:
"It requires external synchronization if a particular directory stream may be shared among threads, but I believe we avoid that naturally from the lack of &mut aliasing. Dir is Sync, but only ReadDir accesses it, and only from its mutable Iterator implementation."
Answer:
readdir is not safe when called from multiple threads with the same DIR* dirp parameter (i.e. with the same self.inner.dirp.0 in the Rust case), but it may be called safely with different dirps. Since calling ReadDir::next requires a &mut self, it is guaranteed that nobody else can call it from another thread at the same time on the same ReadDir instance, and so it is safe.
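A minimal sketch of the "different dirp" case, using only the public std::fs API: each thread opens its own ReadDir, so no two threads ever call next() on the same underlying stream. The helper name count_entries_per_thread is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;

/// Hypothetical helper: every worker opens its OWN directory iterator and
/// counts the entries it yields. No lock is needed because nothing is shared.
fn count_entries_per_thread(dir: &Path, workers: usize) -> io::Result<Vec<usize>> {
    thread::scope(|scope| {
        let handles: Vec<_> = (0..workers)
            .map(|_| {
                scope.spawn(move || -> io::Result<usize> {
                    // A fresh ReadDir per thread, i.e. a separate stream.
                    let mut iter = fs::read_dir(dir)?;
                    let mut seen = 0;
                    // `next()` takes `&mut self`. This thread owns `iter`,
                    // so no other thread can be inside `next()` on it.
                    while let Some(entry) = iter.next() {
                        entry?; // propagate I/O errors
                        seen += 1;
                    }
                    Ok(seen)
                })
            })
            .collect();
        // Join all workers; the first I/O error, if any, is returned.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Trying instead to hand the same `&mut iter` to two spawned threads is rejected by the borrow checker, since two live mutable borrows of one value are not allowed. That compile-time rule is what stands in for the lock here.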