Why is fs::read_dir() thread safe on POSIX platforms

Time:10-22

Some Background

Originally, Rust switched from readdir(3) to readdir_r(3) for thread safety. However, readdir_r(3) has problems of its own (it cannot correctly handle platforms with unlimited or variable NAME_MAX), so they switched back.

So, in the current implementation, readdir(3) is used on most POSIX platforms:

#[cfg(any(
    target_os = "android",
    target_os = "linux",
    target_os = "solaris",
    target_os = "fuchsia",
    target_os = "redox",
    target_os = "illumos"
))]
fn next(&mut self) -> Option<io::Result<DirEntry>> {
    unsafe {
        loop {
            // As of POSIX.1-2017, readdir() is not required to be thread safe; only
            // readdir_r() is. However, readdir_r() cannot correctly handle platforms
            // with unlimited or variable NAME_MAX.  Many modern platforms guarantee
            // thread safety for readdir() as long as an individual DIR* is not accessed
            // concurrently, which is sufficient for Rust.
            super::os::set_errno(0);
            let entry_ptr = readdir64(self.inner.dirp.0);
            // ... (excerpt ends here)

Thread-safety issue with readdir(3)

The problem with readdir(3) is that its return value (struct dirent *) points into the internal buffer of the directory stream (DIR), so it can be overwritten by subsequent readdir(3) calls on the same stream. If a DIR stream is shared among multiple threads that all call readdir(3) on it, a race condition can occur.

To handle this safely, external synchronization is needed.
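To make the hazard concrete, here is a toy model in safe Rust. It is only a sketch: MockDir and next_entry are hypothetical names invented for illustration, not libc or std APIs. The point is that each call hands back a view into a buffer owned by the stream, and the next call overwrites that buffer, so a caller has to copy the name out first.

```rust
// Hypothetical stand-in for a directory stream. `MockDir` and `next_entry`
// are illustrative names only; they are not part of libc or std.
struct MockDir {
    names: Vec<&'static str>,
    pos: usize,
    buf: String, // plays the role of the buffer inside DIR
}

impl MockDir {
    fn new(names: Vec<&'static str>) -> Self {
        MockDir { names, pos: 0, buf: String::new() }
    }

    // Like readdir(3): the returned reference points into `self.buf`,
    // and the next call overwrites that buffer.
    fn next_entry(&mut self) -> Option<&str> {
        let name: &'static str = *self.names.get(self.pos)?;
        self.pos += 1;
        self.buf.clear();
        self.buf.push_str(name);
        Some(self.buf.as_str())
    }
}

// Copy every entry out of the internal buffer before asking for the next one.
fn collect_names(dir: &mut MockDir) -> Vec<String> {
    let mut out = Vec::new();
    while let Some(name) = dir.next_entry() {
        out.push(name.to_owned());
    }
    out
}

fn main() {
    let mut dir = MockDir::new(vec!["a.txt", "b.txt"]);
    let names = collect_names(&mut dir);
    println!("{:?}", names);
}
```

In C nothing stops two threads from calling readdir(3) on the same DIR* while one of them is still reading the previous result. In this model, the &mut self receiver means the compiler rejects holding an old entry across the next call, let alone calling it from two threads at once.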

My question

So I was curious what Rust does to avoid this issue. It seems that it just calls readdir(3), memcpys the result into a caller-allocated buffer, and returns. But this function is not marked unsafe, which confuses me.

So my question is: why is it safe to call fs::read_dir() in multi-threaded programs?

There is a comment in the source stating that it is safe to use in Rust without extra external synchronization, but I don't understand it:

It requires external synchronization if a particular directory stream may be shared among threads, but I believe we avoid that naturally from the lack of &mut aliasing. Dir is Sync, but only ReadDir accesses it, and only from its mutable Iterator implementation.

Answer:

readdir is not safe when called from multiple threads with the same DIR* dirp argument (in Rust's case, the same self.inner.dirp.0), but it may be called safely with different dirps. Since ReadDir::next takes &mut self, the borrow checker guarantees that no other thread can call it on the same ReadDir instance at the same time, so it is safe.
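As a rough illustration of what this means in practice, the sketch below shows the two patterns safe Rust allows. Either every thread opens its own ReadDir, or a single ReadDir is shared behind a Mutex so that only one thread at a time can get the &mut needed to call next. The helper names (make_sample_dir, count_per_thread, count_shared) and the temporary directory layout are made up for this example.

```rust
use std::fs::{self, File};
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: creates a scratch directory containing `files` empty files.
fn make_sample_dir(tag: &str, files: usize) -> io::Result<PathBuf> {
    let dir = std::env::temp_dir()
        .join(format!("read_dir_demo_{}_{}", tag, std::process::id()));
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir)?;
    for i in 0..files {
        File::create(dir.join(format!("file{}.txt", i)))?;
    }
    Ok(dir)
}

// Pattern 1: each thread calls fs::read_dir itself, so no directory
// stream is ever shared between threads.
fn count_per_thread(dir: &Path, threads: usize) -> io::Result<Vec<usize>> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let dir = dir.to_path_buf();
            thread::spawn(move || -> io::Result<usize> {
                let mut n = 0;
                for entry in fs::read_dir(&dir)? {
                    entry?;
                    n += 1;
                }
                Ok(n)
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}

// Pattern 2: one ReadDir shared by all threads. Because `next` needs
// `&mut self`, sharing only compiles with a lock around the iterator,
// which is exactly the external synchronization readdir(3) asks for.
fn count_shared(dir: &Path, threads: usize) -> io::Result<usize> {
    let shared = Arc::new(Mutex::new(fs::read_dir(dir)?));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || -> io::Result<usize> {
                let mut n = 0;
                loop {
                    // The lock is held only for this single `next` call.
                    let item = shared.lock().unwrap().next();
                    match item {
                        Some(entry) => {
                            entry?;
                            n += 1;
                        }
                        None => return Ok(n),
                    }
                }
            })
        })
        .collect();
    let mut total = 0;
    for h in handles {
        total += h.join().expect("worker panicked")?;
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    let dir = make_sample_dir("main", 5)?;
    println!("per-thread counts: {:?}", count_per_thread(&dir, 3)?);
    println!("shared total: {}", count_shared(&dir, 3)?);
    fs::remove_dir_all(&dir)
}
```

In the first pattern every thread sees all five entries. In the second, the threads split the entries between them and the counts add up to five. Removing the Mutex and trying to call next through a plain shared reference is rejected at compile time, which is why fs::read_dir() does not need to be marked unsafe.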
