I have a dataframe called temp
temp
Partner Zip Phone
VIP 002 267...
I have a script that goes through a directory on my OS and adds data to these columns. I want a new column called FileMonth: if, say, a file was dropped in the directory today (7/9/2022), FileMonth should be the file's DateModified minus one month, in this case June, formatted as MM-YYYY.
Partner Zip Phone FileMonth
VIP 002 267.. 06-2022
I'm currently doing:
temp['File Month'] = (dt.replace(day=1)-pd.DateOffset(days=1)).strftime("%m-%Y")
But since this is not based on the modified date in the directory, I'm getting the most recent month for all files. That shouldn't be the case, since some files were dropped in the directory as far back as April and May. How do I get temp['File Month'] to be each file's modified date minus one month?
CodePudding user response:
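Given a list of file paths (filePathList below), you can call os.path.getmtime() on each path
to obtain the timestamp (in seconds since the epoch) at which the file was last modified.
To step back one month, move to the first day of the modified month and subtract one day. This lands in the previous month and, unlike .replace(month = ts_mod.month - 1), does not raise a ValueError for files modified in January or on a day the previous month lacks (for example 31 March).
The code is given by
import os
from datetime import datetime, timedelta
dateModList = []
for filePath in filePathList:
    # last-modified time of the file as a datetime
    ts_mod = datetime.fromtimestamp(os.path.getmtime(filePath))
    # first day of that month minus one day = last day of the previous month
    ts_prev = ts_mod.replace(day=1) - timedelta(days=1)
    # format as MM-YYYY
    dateModList.append(ts_prev.strftime('%m-%Y'))
# assign column (filePathList must be in the same order as the rows of temp)
temp['File Month'] = dateModList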
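If it helps, the month arithmetic can be pulled into a small helper and checked on its own. This is only a sketch: the names previous_month_label and file_month are made up for illustration, and the temporary file stands in for a real file in your directory. It steps back via the first day of the month, because a plain month - 1 fails for January and for dates such as 31 March.

```python
import os
import tempfile
from datetime import datetime, timedelta


def previous_month_label(moment: datetime) -> str:
    """Return the month before `moment`, formatted as MM-YYYY."""
    # First day of the month minus one day is the last day of the previous month.
    last_day_of_prev = moment.replace(day=1) - timedelta(days=1)
    return last_day_of_prev.strftime("%m-%Y")


def file_month(path: str) -> str:
    """Previous-month label based on the file's last-modified time."""
    modified = datetime.fromtimestamp(os.path.getmtime(path))
    return previous_month_label(modified)


# Edge cases: year rollover and a day the previous month does not have.
january = previous_month_label(datetime(2022, 1, 15))
march_31 = previous_month_label(datetime(2022, 3, 31))
july_9 = previous_month_label(datetime(2022, 7, 9))

# Demo on a temporary file whose modified time is set to 9 July 2022 (local time).
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name
stamp = datetime(2022, 7, 9, 12, 0).timestamp()
os.utime(demo_path, (stamp, stamp))
demo_label = file_month(demo_path)
os.remove(demo_path)

print(january, march_31, july_9, demo_label)
```

With a helper like this, the column could be filled with a list comprehension over your paths, e.g. temp['File Month'] = [file_month(p) for p in filePathList], assuming the paths are in the same order as the rows.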