Home > Blockchain >  How to store results from a for loop in a data frame and export to Excel?
How to store results from a for loop in a data frame and export to Excel?

Time:02-03

I'm working on writing a Python program that retrieves all file names and their modified date in a specific folder and then exporting that list to Excel. I have been able to do the first part (retrieve file names and modified dates from a folder). I am having a lot of trouble storing the results (all the file names dates) in a data frame. I'm trying to get the results in a data frame because it's my understanding that you need to have it in one before exporting to Excel.

The below program almost gives me what I need but the data frame is only being created from the last record of the loop.

path = 'Insert Path'

def ts_to_dt(ts):
    return datetime.datetime.fromtimestamp(ts)

for file in os.scandir(path):
    if (file.name.endswith(".pdf") or file.name.endswith(".PDF")):
        results = {'FileName':file.name, 'DateModified':ts_to_dt(file.stat().st_atime)}

df=pd.DataFrame(data=results,index=[0])
print(df)

I'm sure there is something obvious that I'm missing, but I am new to Python so any help is appreciated.

CodePudding user response:

your results variable is getting reinitialized in every iteration of the loop, rather than you adding items to it. That is why you are getting only the last record. You can declare results outside of the loop and then append to those. Then, exporting to excel is easy!

path = 'Insert Path'

def ts_to_dt(ts):
    return datetime.datetime.fromtimestamp(ts)

results = {'FileName': [], 'DateModified': []}
for file in os.scandir(path):
    if (file.name.endswith(".pdf") or file.name.endswith(".PDF")):
        results['FileName'].append(file.name)
        results['DateModified'].append(ts_to_dt(file.stat().st_atime)
        

df=pd.DataFrame(data=results,index=[0])

# export it to excel:
# first argument is the output excel name (can include a path)
# second argument is what you want the sheet to be named.
df.to_excel('output.xlsx', 'sheetname')
  • Related