How to read and modify csv files in a loop inside a function and save them as separate DataFrames in Python Pandas

Time:11-21

I am trying to write a function in Python Pandas that:

  1. reads 5 csv files
  2. performs some aggregations on each csv that was read (to keep it simple, we can just delete one column)
  3. saves each modified csv as a separate DataFrame

Currently I have something like the code below, but it returns only one DataFrame as output, not 5. How can I change the code below?

def xx():
    #1. read 5 csv
    for el in [col for col in os.listdir("mypath") if col.endswith(".csv")]:
        df = pd.read_csv(f"path/{el}")

    #2. making aggregations
    df = df.drop("COL1", axis=1)

    #3. saving each modified csv to separated DataFrames
    ?????

Finally, I need to have 5 separate DataFrames after the modifications. How can I modify my function to achieve that in Python Pandas?

CodePudding user response:

You can create an empty dictionary and fill it gradually with the five processed DataFrames.

Try this:

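import os
import pandas as pd

def xx():
    dico_dfs = {}

    for el in [file for file in os.listdir("mypath") if file.endswith(".csv")]:
        # 1. read each csv from the same folder that was listed
        df = pd.read_csv(os.path.join("mypath", el))

        # 2. making aggregations
        df = df.drop("COL1", axis=1)

        # 3. saving each modified csv as a separate DataFrame
        dico_dfs[el] = df

    # return the dictionary so the caller gets every DataFrame, not just the last one
    return dico_dfs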

You can then access each DataFrame by using its file name as the key, e.g. dico_dfs["file1.csv"].
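
To make the dictionary approach easier to try out, here is a minimal self-contained sketch. The function name load_csvs, the drop_col parameter, the temporary folder and the demo file names are all made up for illustration; only os.listdir, pd.read_csv and DataFrame.drop come from the answer above.

```python
import os
import tempfile

import pandas as pd


def load_csvs(folder, drop_col="COL1"):
    """Read every .csv in `folder`, drop `drop_col`, return {file name: DataFrame}."""
    frames = {}
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".csv"):
            continue
        df = pd.read_csv(os.path.join(folder, name))
        # errors="ignore" skips files that lack the column instead of raising
        frames[name] = df.drop(columns=drop_col, errors="ignore")
    return frames


# Throwaway demo data (hypothetical file names) so the sketch runs anywhere
demo_dir = tempfile.mkdtemp()
for i in range(1, 4):
    pd.DataFrame({"COL1": [i, i], "COL2": [10 * i, 20 * i]}).to_csv(
        os.path.join(demo_dir, f"file{i}.csv"), index=False
    )

dfs = load_csvs(demo_dir)
print(list(dfs))                           # the keys are the file names
print(dfs["file1.csv"].columns.tolist())   # COL1 has been dropped
```

Returning the dictionary is what lets the caller keep all the DataFrames; a variable that is reassigned inside the loop only ever holds the last file.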

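If needed, you can combine them into a single DataFrame with pandas.concat: pd.concat(dico_dfs). When pd.concat is given a dictionary, the keys (here the file names) become the outer level of the resulting index.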
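
As a rough illustration of what concatenating a dictionary produces, the sketch below builds two small DataFrames by hand. The data and the level names "source_file" and "row" are assumptions chosen for the example.

```python
import pandas as pd

# Stand-in for the dictionary built in the loop (made-up data)
dico_dfs = {
    "file1.csv": pd.DataFrame({"COL2": [10, 20]}),
    "file2.csv": pd.DataFrame({"COL2": [30, 40]}),
}

# The dictionary keys become the outer level of a MultiIndex
combined = pd.concat(dico_dfs, names=["source_file", "row"])

# Rows that came from one particular file
part = combined.loc["file2.csv"]

# Turn the file name into an ordinary column instead of an index level
flat = combined.reset_index(level="source_file")
print(flat)
```

Keeping the file name in the index (or as a column) means the rows can still be traced back to their source file after combining.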
