How to stack data from excel files using python-CodePudding

I have a folder with an increasing number of excel files containing daily data. I want to create a dataframe that expands as the number of files increase. The new data must stack under the existing data, to extend the time-series and then save the dataframe to a new excel output file. I have this code so far. Some parts work and I do get an output file, however it is stacked in random order and not in the order from the folder/directory.

# loop over excel files in cwd                                              
 all_data =  pd.DataFrame() # create empty list                             
 list = []                                                                  
 for f in glob.glob("/Users/xxxx/xxxx/xxxx/Data/*.xlsx"):    
df = pd.read_excel(f, sheet_name= 'data')                              
    list.append(df)                                                                                                          
                                                                           
 all_data = pd.concat(list)
                                                
# save the data                                                            
 writer = pd.ExcelWriter('output.xlsx')                                     
 all_data.to_excel(writer, 'sheet1')                                       
 writer.save()

Thank you

CodePudding user response：

all_data =  pd.DataFrame() # create empty list
list = []
for f in glob.glob("/Users/xxxx/xxxx/xxxx/Data/*.xlsx"):
    df = pd.read_excel(f, sheet_name= 'data') 
    list.append(df)

all_data = pd.concat(list)
all_data.to_excel("/Users/xxxx/xxxx/xxxx/Data/all_data.xlsx")

CodePudding user response：

Assuming that you have Excel 365 and that all your files in the folder have the same structure, you can avoid vba altogether an instead do a Power Query specifying the folder as the datasource. The result will be a table that you can refresh every time you add a file to the folder.