Home > Enterprise >  Create new CSV files in python from the CSV files having same values in the column
Create new CSV files in python from the CSV files having same values in the column

Time:03-13

I am writing a python script to read multiple csv files from a folder. I need to merge the csv files according to the values present in one column (3rd column) and create new csv files from them.

For eg: I have 3 csv files like below:

 csv 1:          csv 2:        csv 3:

2  4  1994    4   4   1995    2   4   1997
2  4  1994    0   4   1995    2   4   1997
9  0  1994    9   6   1995    9   0   1998
6  7  1994    6   9   1996    6   7   1998
6  4  1995    4   2   1996    6   4   1998
6  3  1995    4   1   1997    6   3   1998
6  5  1995    0   5   1997    6   5   1999
8  7  1995    7   8   1997    8   7   1999

The output should look like:

 O/P 1:          O/P 2:        O/P 5:

2   4   1994      6   4   1995    9   0   1998
2   4   1994      6   3   1995    6   7   1998
9   0   1994      6   5   1995    6   4   1998
6   7   1994      8   7   1995    6   3   1998
           4   4   1995
           0   4   1995
           9   6   1995

 O/P 3:          O/P 4:        O/P 6:

6   9   1996      4   1   1997      6   5   1999
4   2   1996      0   5   1997      8   7   1999
           7   8   1997

I would really appreciate if someone me helps with this! Any suggestions will be really helpful.

Thanks.

CodePudding user response:

I used glob to automate this search for the .csv files, but if they were already initialized in some variable you can merge them and use only the part of for year in all_csv['YEAR'].unique()...

import pandas as pd
import glob

path = "/home/gustavo/" # path your csv files are (in my example is store in /home/gustavo)

all_csv = pd.DataFrame()
for file in glob.glob(path   "*.csv") : # *.csv search for all csv files in path
    all_csv = all_csv.append(pd.read_csv(file, header = None), ignore_index = True)

all_csv = all_csv.rename(columns = {0 : 'A', 1 : 'B', 2 : 'YEAR'})

# Filter for each year value and transform to csv 
for year in all_csv['YEAR'].unique() :
    all_csv[all_csv['YEAR'] == year].to_csv(f'{year}.csv', index=False)

Output :
enter image description here

  • Related