Read and concat multiple excel files based on a specific column

Time:11-11

path = '/Desktop/somefolder'
for filename in os.listdir(path):
    with open(os.path.join(path, filename)) as f:
         - read the 3-4 excel files and attach the path
         - be able to concat them based on a specific column

filename gives me the name of each file in the directory. My idea was to join the filename with the path so that I can read the files and concat them.

I am not sure how to use the filename to load each file as a DataFrame and then concat the results.

CodePudding user response:

Try:

import pandas as pd
import os

path = '/Desktop/somefolder/'

dfs = []
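for filename in os.listdir(path):
    # Build the full path; os.path.join works with or without a trailing slash
    full_path = os.path.join(path, filename)
    dfs.append(pd.read_excel(full_path, engine='openpyxl'))

# Stack the frames row-wise and keep the result
df = pd.concat(dfs, axis=0, ignore_index=True)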
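
If "based on a specific column" means lining the files up on a shared key rather than stacking rows, a minimal sketch could look like this. The helper names list_excel_files and merge_on_column and the column name "id" are hypothetical and not taken from the question. The outer join is an assumption too, and an inner join may fit better if only matching rows are wanted.

```python
import os
from functools import reduce

import pandas as pd


def list_excel_files(folder):
    """Return sorted full paths of the .xlsx files in folder (hypothetical helper)."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".xlsx")
    )


def merge_on_column(frames, key):
    """Outer-join a list of DataFrames on one shared column (hypothetical helper)."""
    return reduce(
        lambda left, right: pd.merge(left, right, on=key, how="outer"),
        frames,
    )


# Small in-memory frames stand in for what pd.read_excel would return.
a = pd.DataFrame({"id": [1, 2], "x": [10, 20]})
b = pd.DataFrame({"id": [2, 3], "y": [200, 300]})
merged = merge_on_column([a, b], key="id")
```

With real files, the list of frames would come from something like [pd.read_excel(p, engine='openpyxl') for p in list_excel_files(path)]. Filtering on the .xlsx extension also skips any other files that happen to be in the folder.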

CodePudding user response:

Read the data into data frames

import pandas as pd

df1 = pd.read_excel('file1.xlsx')
df2 = pd.read_excel('file2.xlsx')

Create filtered data frames

# isin is assumed here; YourColumnValues is a placeholder for a list of values to keep
df1Filtered = df1[df1["YourColumnName"].isin(YourColumnValues)]
df2Filtered = df2[df2["YourColumnName"].isin(YourColumnValues)]

Concat the filtered data frames

NewDF = pd.concat([df1Filtered, df2Filtered])

Write the result to a new Excel file

NewDF.to_excel('NewFile.xlsx')
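
The filter-then-concat steps above can be sketched as one helper that works for any number of frames. This assumes the filter is a membership test with Series.isin, which the answer does not confirm. The function name filter_and_concat and the sample column names are hypothetical.

```python
import pandas as pd


def filter_and_concat(frames, column, values):
    """Keep rows whose `column` value is in `values`, then stack the frames."""
    filtered = [df[df[column].isin(values)] for df in frames]
    # ignore_index gives the combined frame a fresh 0..n-1 index
    return pd.concat(filtered, ignore_index=True)


# In-memory frames stand in for the results of pd.read_excel.
df1 = pd.DataFrame({"region": ["north", "south"], "sales": [1, 2]})
df2 = pd.DataFrame({"region": ["south", "east"], "sales": [3, 4]})
combined = filter_and_concat([df1, df2], column="region", values=["south"])
```

The combined frame can then be written out with to_excel as in the last step.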