Combining CSV files in each folder and saving the combined file in that same folder

Time:12-02

I want to loop through a number of folders, concatenate all the .csv files in each one, and write a combined[folder name].csv for EACH folder in a single batch run. For example, the concatenated output for Folder1 would be combinedFolder1.csv, saved in that folder. How would I go about doing that? I can already do it for an individual folder, but I would like to process all the folders at once. Thanks. Here is the code I use for an individual folder:

import glob, os
import pandas as pd

joined_files = os.path.join("C:/Users/user/Desktop/Main_folder/folder1/", "*.csv")
joined_list = glob.glob(joined_files)
df = pd.concat(map(pd.read_csv, joined_list), ignore_index=True)
df.to_csv("folder1.csv", index=False)


CodePudding user response:

You can use something like this:

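import pandas as pd
import pathlib

main_folder = pathlib.Path('data')
data_folders = [d for d in main_folder.iterdir() if d.is_dir()]

for data_folder in data_folders:
    out_file = data_folder / f"{data_folder.name}.csv"
    # Skip an earlier combined file so re-running does not duplicate rows
    csv_files = [f for f in data_folder.glob('*.csv') if f != out_file]
    if csv_files:  # pd.concat raises ValueError on an empty list
        data = [pd.read_csv(csvfile) for csvfile in csv_files]
        # index=False keeps the row index out of the output file
        pd.concat(data, ignore_index=True).to_csv(out_file, index=False)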

Folder structure:

data
├── Folder1
│   ├── file1.csv
│   ├── file2.csv
│   └── Folder1.csv
└── Folder2
    ├── file1.csv
    ├── file2.csv
    └── Folder2.csv

2 directories, 6 files
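
To get the combined[folder name].csv naming asked for in the question, the same loop can be wrapped in a small function. This is only a sketch: the function name `combine_csvs_per_folder` and its `prefix` parameter are made up for illustration, and it assumes every CSV in a folder can be read with the default `pd.read_csv` settings.

```python
from pathlib import Path

import pandas as pd


def combine_csvs_per_folder(main_folder, prefix="combined"):
    """Write <prefix><folder name>.csv inside each subfolder of main_folder.

    The function name and the `prefix` parameter are illustrative only.
    Returns the list of files that were written.
    """
    written = []
    for folder in sorted(Path(main_folder).iterdir()):
        if not folder.is_dir():
            continue
        out_file = folder / f"{prefix}{folder.name}.csv"
        # Leave out a previous result so a second run does not duplicate rows.
        sources = [f for f in sorted(folder.glob("*.csv")) if f != out_file]
        if not sources:
            continue  # pd.concat raises ValueError when given no frames
        frames = [pd.read_csv(f) for f in sources]
        pd.concat(frames, ignore_index=True).to_csv(out_file, index=False)
        written.append(out_file)
    return written
```

With the layout from the question this could be called as `combine_csvs_per_folder("C:/Users/user/Desktop/Main_folder")`. Folders without any CSV files are skipped rather than raising an error.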

CodePudding user response:

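You can use this; my_list is assumed to hold the names of the subfolders you want to process:

import glob, os
import pandas as pd

my_list = ["folder1", "folder2"]  # placeholder names, replace with your own subfolders
for i in my_list:
    directory_name = "C:/Users/user/Desktop/Main_folder/{0}".format(i)
    os.chdir(directory_name)
    extension = 'csv'
    output_name = "{0}.csv".format(i)
    # Exclude an earlier output so re-running does not duplicate rows
    all_csvs = [f for f in glob.glob('*.{}'.format(extension)) if f != output_name]
    combined_csv = pd.concat([pd.read_csv(f) for f in all_csvs])
    # Written to the current directory, i.e. inside the folder itself
    combined_csv.to_csv(output_name, index=False, encoding='utf-8-sig')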