I have a dataframe df1
user_id
0 101
1 102
2 103
3 101
and another df2
user_id type
0 101 cash
1 101 card
2 102 card
3 102 card
4 103 cash
5 103 cash
6 103 card
There are more than 2 types.
I want to add a columns to df1
that is the ratio of each type for every user_id
. So I want the df1
to look like:
df1
user_id cash_pct card_pct
0 101 0.5 0.5
1 102 0 1.0
2 103 0.67 0.33
3 101 0.5 0.5
I think I should use groupby, but beyond that I'm not sure how I should do about this
CodePudding user response:
Use crosstab
with normalize
parameter, rename columns names by DataFrame.add_suffix
and add DataFrame.round
, add to df1
by DataFrame.join
:
df = (df1.join(pd.crosstab(df2['user_id'], df2['type'], normalize='index')
.add_suffix('_pct').round(2), on='user_id'))
print (df)
user_id card_pct cash_pct
0 101 0.50 0.50
1 102 1.00 0.00
2 103 0.33 0.67
3 101 0.50 0.50