Should I use os.path.exists before os.makedirs with exist_ok=True or not?
It doesn't matter if I don't have permission to that folder.
From my testing it seems roughly 3 times as fast to check the file path first and it makes my code better readable. Or am I missing something?
import timeit
import os
def mkdirs(file_path):
if not os.path.exists(file_path):
os.makedirs(file_path, exist_ok = True)
file_path = "/Users/wannes/Downloads/untitled folder"
print("#0", timeit.timeit('''mkdirs(file_path)''', globals=globals(), number=100000))
print("#1", timeit.timeit('''os.makedirs(file_path, exist_ok = True)''', globals=globals(), number=100000))
#0 0.28453361399806454
#1 0.8769928680012526
This is used to extract & convert files from a zip which means I need to do this check for every file as I can't get a list of folders. Of course then this makes a difference.
CodePudding user response:
According to your microbenchmark, the per-call additional overhead (when the directory already exists) is six microseconds. If you did this 1000 times, it would still be faster than a human can perceive. If the hot path in your code is creating directories that might already exist, something is very wrong; if it's not the hot path, the overhead is irrelevant. In your case, the work spent extracting/decompressing files from the zip file in the first place should outweigh any overhead of the simplest code (plain
os.makedirs(file_path, exist_ok=True)
) by multiple orders of magnitude. It's just not worth worrying about.This is no more readable because it replaces a documented, fairly well-known API call with something that's clearly in-house and may (or may not) do what I expect. If I see
os.makedirs(file_path, exist_ok=True)
I know exactly what it does, because I've used that API before, in multiple projects. If I seemkdirs(file_path)
I:a. Can guess at the meaning (and I'd be right), but
b. Can't trust that it's correct without looking up the definition of
mkdirs
(for one thing, I won't know if it's supposed to handle/ignore the case of the directory already existing), andc. I'll almost certainly go check, because reimplementing existing functionality like this is a form of code smell, and I'd want to know why someone didn't just do the straightforward thing with
os.makedirs
.
TL;DR: You're engaging in premature optimization. Writing a new wrapper for every call to save trivial amounts of processor time is not worth the trouble, and makes for less maintainable code. Just use the API directly until profiling proves microoptimizations matter in the real code.