I have a dataframe that looks like this:
import pandas as pd

df = pd.DataFrame(
    [[201801, 0.5, 273.4, 'Fleet'],
     [201801, 0.34, 277.4, 'Drake'],
     [201801, 0.75, 255, 'Bay'],
     [201802, 0.97, 244.4, 'Fleet'],
     [201802, 0.54, 267.4, 'Drake'],
     [201802, 0.89, 235, 'Bay']],
    columns=['time', 'windspeed', 'winddir', 'site_name'])
df
I want to create a dictionary of dataframes where each key is a value from the site_name column and each value is the rest of the dataframe for that site (i.e. the other 3 columns).
How can I do this please?
I built this dataframe earlier by combining 4 arrays, so if it is easier, could I create the dictionary of dataframes from the arrays instead?
CodePudding user response:
A GroupBy object, when iterated over, yields each group key together with that group's sub-dataframe, so a dictionary comprehension works. The grouping column is still included in each sub-dataframe, so we drop it:
grouper = "site_name"
d = {name: sub_df.drop(columns=grouper) for name, sub_df in df.groupby(grouper)}
to get
>>> d
{"Bay":      time  windspeed  winddir
2  201801       0.75    255.0
5  201802       0.89    235.0,
"Drake":      time  windspeed  winddir
1  201801       0.34    277.4
4  201802       0.54    267.4,
"Fleet":      time  windspeed  winddir
0  201801       0.50    273.4
3  201802       0.97    244.4}
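On the follow-up about building the dictionary straight from the arrays: that should be possible too, although the groupby route is probably simpler once the dataframe already exists. Here is a minimal sketch, assuming four equal-length NumPy arrays named time, windspeed, winddir and site_name (the names and values are assumptions that mirror the example dataframe, since the original arrays are not shown):

```python
import numpy as np
import pandas as pd

# Hypothetical input arrays; names and values mirror the example df above.
time = np.array([201801, 201801, 201801, 201802, 201802, 201802])
windspeed = np.array([0.5, 0.34, 0.75, 0.97, 0.54, 0.89])
winddir = np.array([273.4, 277.4, 255, 244.4, 267.4, 235])
site_name = np.array(['Fleet', 'Drake', 'Bay', 'Fleet', 'Drake', 'Bay'])

d_from_arrays = {}
for name in np.unique(site_name):  # sorted unique site names
    mask = site_name == name       # boolean mask selecting this site's rows
    # Build a small dataframe per site from the masked arrays.
    d_from_arrays[str(name)] = pd.DataFrame({
        'time': time[mask],
        'windspeed': windspeed[mask],
        'winddir': winddir[mask],
    })
```

Unlike the groupby version, each sub-dataframe here gets a fresh 0-based index instead of keeping the original row labels. If you want the same from the groupby version, chain .reset_index(drop=True) after the .drop(...) call.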