I have a dataframe like the one below and need to (1) create a new dataframe for each unique date and (2) create a new global variable whose value is the date of that dataframe. This needs to happen in a loop.
Using the dataframe below, I need to iterate through 3 new dataframes, one for each date value (202107, 202108, and 202109). The loop runs inside an existing function, which then uses each iteration's new dataframe and its global variable in further calculations. For example, the first iteration would yield a new dataframe consisting of the first two rows of the dataframe below, and the new global variable would have the value "202107". What is the most straightforward way of doing this?
Date | Col1 | Col2 |
---|---|---|
202107 | 1.23 | 6.72 |
202107 | 1.56 | 2.54 |
202108 | 1.78 | 7.54 |
202108 | 1.53 | 7.43 |
202108 | 1.58 | 2.54 |
202109 | 1.09 | 2.43 |
202109 | 1.07 | 5.32 |
CodePudding user response:
Loop over the results of .groupby:
for _, new_df in df.groupby("Date"):
    print(new_df)
    print("-" * 80)
Prints:
     Date  Col1  Col2
0  202107  1.23  6.72
1  202107  1.56  2.54
--------------------------------------------------------------------------------
     Date  Col1  Col2
2  202108  1.78  7.54
3  202108  1.53  7.43
4  202108  1.58  2.54
--------------------------------------------------------------------------------
     Date  Col1  Col2
5  202109  1.09  2.43
6  202109  1.07  5.32
--------------------------------------------------------------------------------
Then you can store new_df in a list or a dictionary and use it afterwards.
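As a rough sketch of the dictionary option: the loop above throws away the group key with _, but that key is the date value the question asks for, so it can be kept and used as the dictionary key. The name frames_by_date is just an illustrative choice, and the frame is rebuilt from the question's table so the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Date": [202107, 202107, 202108, 202108, 202108, 202109, 202109],
    "Col1": [1.23, 1.56, 1.78, 1.53, 1.58, 1.09, 1.07],
    "Col2": [6.72, 2.54, 7.54, 7.43, 2.54, 2.43, 5.32],
})

# groupby yields (key, sub-frame) pairs; keep the key instead of discarding it.
# `frames_by_date` is a hypothetical name, not something from the question.
frames_by_date = {str(date): new_df for date, new_df in df.groupby("Date")}

for date, new_df in frames_by_date.items():
    # `date` is the per-iteration value the question wants ("202107", ...).
    print(date, len(new_df))
```

Passing date and new_df along as ordinary variables like this may make a separate global unnecessary, depending on how the surrounding function is written.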
CodePudding user response:
You can extract the unique date values with the .unique() method, and then store your new dataframes and dates in a dict so you can access them easily, like this:
unique_dates = init_df.Date.unique()
df_by_date = {
    str(date): init_df[init_df['Date'] == date] for date in unique_dates
}
You can then use the dict like this:
for date in unique_dates:
    print(date, ': \n', df_by_date[str(date)])
output:
202107 :
      Date  Col1  Col2
0  202107  1.23  6.72
1  202107  1.56  2.54
202108 :
      Date  Col1  Col2
2  202108  1.78  7.54
3  202108  1.53  7.43
4  202108  1.58  2.54
202109 :
      Date  Col1  Col2
5  202109  1.09  2.43
6  202109  1.07  5.32
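If the surrounding function really does need a global that changes on every iteration, as the question describes, one possible shape is sketched below. The names current_date, summarize, and run are hypothetical stand-ins; the question does not show the actual function or its calculations.

```python
import pandas as pd

# Hypothetical module-level variable, overwritten on each iteration.
current_date = None


def summarize(frame):
    # Stand-in for the "further calculations"; it reads the global.
    return current_date, len(frame)


def run(init_df):
    global current_date  # needed because the function assigns to the global
    results = []
    for date in init_df["Date"].unique():
        current_date = str(date)
        new_df = init_df[init_df["Date"] == date]
        results.append(summarize(new_df))
    return results


# Sample data copied from the table in the question.
init_df = pd.DataFrame({
    "Date": [202107, 202107, 202108, 202108, 202108, 202109, 202109],
    "Col1": [1.23, 1.56, 1.78, 1.53, 1.58, 1.09, 1.07],
    "Col2": [6.72, 2.54, 7.54, 7.43, 2.54, 2.43, 5.32],
})
results = run(init_df)
print(results)
```

Passing the date as an argument to the downstream function is usually easier to follow than a global, but the version above matches what the question asks for.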