I have a file path that I need to return the list of file names in it the last modified date each file in Python. I am able to do this except the problem is that I only want to do this for PDFs and there are multiples file types within the folder I'm working with. I have gotten to the point where I can get the file name and the modified date but run into an error when I try to put in the PDF only stipulation.
The below is what I have so far:
path = <insert path>
def ts_to_dt(ts):
return datetime.datetime.fromtimestamp(ts)
for file in os.scandir(path):
#if file.endswith(".pdf"):
print(file.name, ts_to_dt(file.stat().st_atime))
When I try to execute with the line that is commented out (if file.endswith(".pdf")), I get this error:
if file.endswith(".pdf"):
^^^^^^^^^^^^^
AttributeError: 'nt.DirEntry' object has no attribute 'endswith'
I'm new to Python so any help would be appreciated!
CodePudding user response:
os.scandir
returns an iterator of os.DirEntry
objects as you can see from the error message.
os.DirEntry
have different properties that you can get, including a name
property which is a String.
So you could do:
if file.name.endswith(".pdf"):
...
CodePudding user response:
I would recommend using pathlib for this instead.
You can do something like this:
from pathlib import Path
def print_access_times(path):
path = Path(path) # If path was either a str or Path object, this will work
for file in path.iterdir():
if file.suffix == '.pdf':
print(file.name, ts_to_dt(file.stat().st_atime))
The problem with your code was that os.scandir
yields DirEntry objects (which don't have a endswith
method), not strings. The Path
objects from pathlib
provide a more consistent interface for working with paths in Python.
You could have used file.name.endswith('.pdf')
with the os.scandir
solution instead.
To convert a Path object back into a string, you would use file.as_posix()
or str(file)
.
If you're not sure ahead of time whether the given path is a directory or not, then you can use the following to gracefully handle the error that would be raised:
def print_access_times(path):
path = Path(path) # If path was either a str or Path object, this will work
try:
for file in path.iterdir():
if file.suffix == '.pdf':
print(file.name, ts_to_dt(file.stat().st_atime))
except NotADirectoryError:
pass