Home > OS >  quantile groupby pandas dataframe
quantile groupby pandas dataframe

Time:10-19

I have below pandas dataframe. I want to create a new column that would give me 75% quantile rate groped by State and County

below code gives me only 75% quantile rate as output, i want to create a new column with 75% quantile rate in the existing df

df = df.groupby('State')['rate'].quantile(0.75)

State county rate
AK a 27.5
AK a 30.5
GA a 35.5
GA b 50.0
GA b 75.5
GA c 90.5
AR a 16.5
AR a 10.5

CodePudding user response:

Use:

df['q'] = df.groupby('State')['rate'].transform(lambda x: x.quantile(0.75))

If need also county:

df['q'] = df.groupby(['State', 'county'])['rate'].transform(lambda x: x.quantile(0.75))
print (df)
  State county  rate       q
0    AK      a  27.5  29.750
1    AK      a  30.5  29.750
2    GA      a  35.5  35.500
3    GA      b  50.0  69.125
4    GA      b  75.5  69.125
5    GA      c  90.5  90.500
6    AR      a  16.5  15.000
7    AR      a  10.5  15.000

EDIT: If possible some missing values:

df['q'] = (df.fillna({'State':'missing', 'county':'missing'})
            .groupby(['State', 'county'])['rate'].transform(lambda x: x.quantile(0.75)))

CodePudding user response:

Or with pd.MultiIndex.from_frame and pd.MultiIndex.map:

df['new'] = pd.MultiIndex.from_frame(df[['State', 'county']]).map(df.groupby(['State', 'county'])['rate'].quantile(0.75))

And now:

>>> df
  State county  rate     new
0    AK      a  27.5  29.750
1    AK      a  30.5  29.750
2    GA      a  35.5  35.500
3    GA      b  50.0  69.125
4    GA      b  75.5  69.125
5    GA      c  90.5  90.500
6    AR      a  16.5  15.000
7    AR      a  10.5  15.000
>>> 
  • Related