Home > Net >  python grouping filename and read csv by group and export csv by key
python grouping filename and read csv by group and export csv by key

Time:07-28

My request is to group filename with similar character, and read and merge all csv files by group, then export csv by key. Here's my code:

import glob
import pandas as pd

path = r'C:\Users\*.csv'

filelist = glob.glob(path)

dictionary = {}

for x in filelist:
    group = dictionary.get(x.split('_')[3],[])
    group.append(x)
    dictionary[x.split('_')[3]] = group

all_df = []

for value in dictionary.items():
    df = pd.read_csv(value)
    all_df.append(df)
    concat_df = pd.concat(all_df, axis=0)
    concat_df.to_csv('value', index = False)

If I print(value), I get a dictionary below

('AAA', ['C:\Users\2022_0727_0911_AAA_abc.csv', 'C:\Users\2022_0724_2111_AAA_def.csv', 'C:\Users\2022_0725_0357_AAA_ghi.csv'])
('BBB', ['C:\Users\2022_0727_0911_BBB_abc.csv', 'C:\Users\2022_0724_2111_BBB_def.csv'])
('CCC', ['C:\Users\2022_0727_0911_CCC_abc.csv', 'C:\Users\2022_0724_2111_CCC_def.csv', 'C:\Users\2022_0725_0357_CCC_ghi.csv'])

I want to merge all csv files by same group, for example, concatenate 3 files for AAA group, 2 files for BBB group, etc. Then export a csv file naming by key, such as AAA.csv and BBB.csv. Below code is incorrect, but basically it's the concept I want to achieve. Can someone help with the codes below:

for value in dictionary.items():
    df = pd.read_csv(value)
    all_df.append(df)
    concat_df = pd.concat(all_df, axis=0)
    concat_df.to_csv('value', index = False)

Added example csv as below, all csv files have same columns.

Dataframe

Date Time Name Result
3/28/2022 7:00:00 abc 0.23
3/28/2022 7:00:00 abc 0.19
3/28/2022 7:05:00 def 0.36
3/28/2022 7:10:00 ghi 0.29

CodePudding user response:

I think that you need to create an empty dataframe for all the data in the same category (AAA, BBB, CCC). You can do something like that :

for key, list_value in dictionary.items():
    all_df = pd.DataFrame()
    for file in list_value:
        df = pd.read_csv(file)
        all_df = pd.concat([all_df, df] , axis=0)
    all_df.to_csv(f"{key}.csv", index = False)
  • Related