Say I have the following dataframe df
gender | country | age
------- --------- -----
M DK 20
M ES 19
M ES 21
M DK 30
F DK 20
F ES 19
F ES 21
F DK 12
Is there a way to get the Grouped data as a dataframe e.g
import pandas as pd
df_grp = df.group_by(["gender","country"])["age"] #Group it
df_grouped_df = pd.DataFrame(df_grp) #Convert to ordinary dataframe
print(df_grouped_df)
age
gender country
------ ----------
20
M DK 30
-------------
ES 19
21
---------------------
DK 20
F 30
------------
ES 19
21
EDIT:
Using set_index
does not seem to work, since it does not group the second columns specified (e.g country
);
df = pd.DataFrame({"gender":["M"]*4 ["F"]*4,"country":["DK","ES","ES","DK"]*2,"age":[1,2,3,4,5,6,7,8]})
gender country age
M DK 1
M ES 2
M ES 3
M DK 4
F DK 5
F ES 6
F ES 7
F DK 8
df.set_index(["gender","country"])
age
gender country
M DK 1
ES 2
ES 3
DK 4
F DK 5
ES 6
ES 7
DK 8
As noted it is not grouped for each country
as required
CodePudding user response:
Probably not clean but has the visual aspect:
out = df.sort_values(['gender', 'country'], ascending=[False, True]).reset_index(drop=True)
out.loc[out.duplicated(['gender', 'country']), 'country'] = ''
out.loc[out.duplicated('gender'), 'gender'] = ''
print(out.set_index(['gender', 'country'])
# Output:
age
gender country
M DK 20
30
ES 19
21
F DK 20
12
ES 19
21
CodePudding user response:
First groupby
in pandas is like iterator, so is necessary specify aggregate function(s) after it.
df1 = df.groupby(["gender","country"])["age"].sum().to_frame()
If need MultiIndex
:
df1 = df.set_index(["gender","country"]).sort_index()
EDIT: You can check docs:
We’ve “sparsified” the higher levels of the indexes to make the console output a bit easier on the eyes. Note that how the index is displayed can be controlled using the multi_sparse option in pandas.set_options():
df1 = df.set_index(["gender","country"]).sort_index()
print (df1)
age
gender country
F DK 5
DK 8
ES 6
ES 7
M DK 1
DK 4
ES 2
ES 3
with pd.option_context("display.multi_sparse", False):
print (df1)
age
gender country
F DK 5
F DK 8
F ES 6
F ES 7
M DK 1
M DK 4
M ES 2
M ES 3