Home > Enterprise >  Python append multiple Excel files
Python append multiple Excel files

Time:03-22

I'm trying to append multiple Excel files with same columns into one. If I use this code x.append(y, ignore_index = True) it doesn't work. At the end of the for loop it returns only table of x. However if I try to run x.append(y, ignore_index = True) in a single code block it works fine to append y which is still in the memory after for loop. I am using Juptyer Notebook.

# import required module
import os as os
import pandas as pd
# assign directory
# directory = 'C:\\Users\\Tomas\\Documents\\Python Scripts\\csv\\TimeLogs'
directory = 'C:\\Users\\Tomas\\Documents\\Python Scripts\\csv\\tmp'
 
# iterate over files in
# that directory
for idx,filename in enumerate(os.listdir(directory)):
    f = os.path.join(directory, filename)
    # checking if it is a file
    if os.path.isfile(f):
        print(f)
        print(idx)
        if idx == 0:
            x = pd.read_excel(f,engine="openpyxl")
        else:
            y = pd.read_excel(f,engine="openpyxl")
            x.append(y, ignore_index = True)

CodePudding user response:

You can create a list of dataframes and then concatenate them using pandas' concat method.

# import required module
import os as os
import pandas as pd
# assign directory
# directory = 'C:\\Users\\Tomas\\Documents\\Python Scripts\\csv\\TimeLogs'
directory = 'C:\\Users\\Tomas\\Documents\\Python Scripts\\csv\\tmp'
 
# iterate over files in
# that directory
list_of_dataframes=[]
for idx,filename in enumerate(os.listdir(directory)):
    f = os.path.join(directory, filename)
    # checking if it is a file
    if os.path.isfile(f):
        print(f)
        list_of_dataframes.append(pd.read_excel(f,engine="openpyxl"))

merged_df=pd.concat(list_of_dataframes)

This way, you do not have to check whether the index idx is equal to 0.

  • Related