I am trying to write a function in Python pandas that:
- reads 5 CSV files
- performs some aggregations on each CSV (to keep it simple, we can just drop one column)
- saves each modified CSV as a separate DataFrame
Currently I have something like the code below, but it returns only one DataFrame instead of 5. How can I change it?
    def xx():
        # 1. read 5 csv
        for el in [col for col in os.listdir("mypath") if col.endswith(".csv")]:
            df = pd.read_csv(f"path/{el}")
            # 2. making aggregations
            df = df.drop("COL1", axis=1)
            # 3. saving each modified csv to separate DataFrames
            # ?????
In the end I need 5 separate DataFrames after the modifications. How can I modify my function to achieve that in Python pandas?
CodePudding user response:
You can create an empty dictionary and fill it step by step with the five processed DataFrames.
Try this:
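    import os
    import pandas as pd

    def xx():
        dico_dfs = {}
        for el in [file for file in os.listdir("mypath") if file.endswith(".csv")]:
            # 1. read each csv from the same folder that was listed
            df = pd.read_csv(os.path.join("mypath", el))
            # 2. making aggregations
            df = df.drop("COL1", axis=1)
            # 3. saving each modified csv to a separate DataFrame
            dico_dfs[el] = df
        # return the dictionary so the DataFrames are usable outside the function
        return dico_dfs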
You can access each DataFrame by using its filename as the key, e.g. dico_dfs["file1.csv"].
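To make the key lookup concrete, here is a minimal, self-contained sketch. It assumes the function returns the dictionary and takes the folder as an argument. The names load_csvs, folder and drop_col, as well as the sample files, are made up for illustration, and a temporary directory stands in for the real folder.

```python
import os
import tempfile

import pandas as pd


def load_csvs(folder, drop_col="COL1"):
    """Read every CSV in `folder`, drop `drop_col`, return {filename: DataFrame}."""
    dfs = {}
    for name in sorted(os.listdir(folder)):
        if name.endswith(".csv"):
            df = pd.read_csv(os.path.join(folder, name))
            dfs[name] = df.drop(columns=drop_col)
    return dfs


# Hypothetical sample data written to a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    for i in range(1, 4):
        sample = pd.DataFrame({"COL1": [i, i], "COL2": [10 * i, 20 * i]})
        sample.to_csv(os.path.join(tmp, f"file{i}.csv"), index=False)
    dfs = load_csvs(tmp)

print(list(dfs))         # the file names are the dictionary keys
print(dfs["file1.csv"])  # one DataFrame per file, without COL1
```

If you would rather have plain variables, you can bind them afterwards, e.g. df1 = dfs["file1.csv"].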
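If needed, you can combine them into a single DataFrame with pandas.concat: pd.concat(dico_dfs). When given a dictionary, concat uses the keys (here the filenames) as the outer level of the resulting index.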
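As a rough illustration of what concatenating a dictionary produces, the sketch below uses two small hand-made DataFrames as stand-ins for the processed files. The data and the level names "source" and "row" are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical stand-ins for the processed DataFrames
dico_dfs = {
    "file1.csv": pd.DataFrame({"COL2": [1, 2]}),
    "file2.csv": pd.DataFrame({"COL2": [3, 4]}),
}

# The dictionary keys become the outer level of the row index
combined = pd.concat(dico_dfs)

# Rows that came from one file can be selected through that outer level
first = combined.loc["file1.csv"]

# Alternative: name the index levels and move the file name into a column
flat = pd.concat(dico_dfs, names=["source", "row"]).reset_index(level="source")

print(combined)
print(flat)
```

Keeping the filename in the index or in a column means you can still tell which file each row came from after combining.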