How do I get datetime based on files in my directory that starts with a certain name?-CodePudding

My code -

for file in os.listdir("C:/Users/hhh/Desktop/autotranscribe/python/Matching"):
    if file.startswith("Master"):
        dfs = pd.read_excel(file, sheet_name=None)
        output = dict()
        for ws, df in dfs.items():
            if ws in ["Added"]:
               continue
            if ws in ["All Members", "Terms"]:
               temp = df

>temp
Company State Address Zip    City
ABC      CA   1 st    86284  LA

I want another column that denotes the month the file was created/dropped in the directory minus 1(File Date minus 1 month). So basically, if the file was created on 5/5/22. I want the below

Company State Address Zip    City  File Month
ABC      CA   1 st    86284  LA    04-2022

My attempt-

import datetime
path = "C:/Users/hhh/Desktop/autotranscribe/python/Matching"
# file creation timestamp in float
c_time = os.path.getctime(path)
# convert creation timestamp into DateTime object
dt_c = datetime.datetime.fromtimestamp(c_time)
temp['File Month'] = (dt_c.replace(day=1) - datetime.timedelta(days=1)).strftime("%m-%Y")

But this only works if i set a path and not in my case because my path as you can see above is based on filename that startwith 'Master'. I only want it to be based on the filename and not on os path because os path has multiple files created in it on a daily basis. How do I do this?

CodePudding user response：

Move your code for changing the time into your for loop. Also, you don't need datetime. pandas has all the operations you need.

path = "C:/Users/hhh/Desktop/autotranscribe/python/Matching"
for file in os.listdir(path):
    if file.startswith("Master"):
        dfs = pd.read_excel(file, sheet_name=None)
        output = dict()
        for ws, df in dfs.items():
            if ws in ["Added"]:
               continue
            if ws in ["All Members", "Terms"]:
               temp = df
               dt = pd.to_datetime(os.path.getctime(os.path.join(path,file)),unit="s").replace(nanosecond=0)
               temp['File Month'] = (dt.replace(day=1)-pd.DateOffset(days=1)).strftime("%m-%Y")