Home > Net >  How to extract a category from describe.()
How to extract a category from describe.()

Time:01-11

import pandas as pd
df = {'fish': [38,10,23,45],'eggs': [24.0,10,12,8],'fruit': [80.5,60,12,32],'sugar': [0.234,0.8,0.34,0.76],
                  'category':['Price A', 'Price B','Price B','Price A']}
df = pd.DataFrame(data=df)
xy=['fish',
   'eggs',
   'fruit',
   'sugar',
   'category']
         
df=(df[xy].groupby(['category']).describe()).T
df['Ratio'] = df['Price A']/df['Price B']
df

I am trying to extract the category with only the mean with a ratio smaller than 0.5 and bigger than 2.

I tried to use

df.reset_index()

and making two separate df with the respective conditions, then turning the unwanted values to null, removing the null values(but I don't know how to specifically only choose mean's ratio), and then concatenating both of the df.

Maybe there are some better ways to do this?

CodePudding user response:

Use DataFrame.loc for select groups by categories:

m = df['Ratio'].xs('mean', level=1).between(0.5,2)
out = df.loc[m.index[~m]]

print(out)
category      Price A    Price B     Ratio
fish count   2.000000   2.000000  1.000000
     mean   41.500000  16.500000  2.515152
     std     4.949747   9.192388  0.538462
     min    38.000000  10.000000  3.800000
     25%    39.750000  13.250000  3.000000
     50%    41.500000  16.500000  2.515152
     75%    43.250000  19.750000  2.189873
     max    45.000000  23.000000  1.956522

Explanation:

Select levels mean by Series.xs:

print(df['Ratio'].xs('mean', level=1))
fish     2.515152
eggs     1.454545
fruit    1.562500
sugar    0.871930
Name: Ratio, dtype: float64

Create mask between 0.5 and 2 by Series.between:

print(df['Ratio'].xs('mean', level=1).between(0.5,2))
fish     False
eggs      True
fruit     True
sugar     True
Name: Ratio, dtype: bool

Get indices if Falses - invert mask and filter m.index:

print (m.index[~m])
Index(['fish'], dtype='object')
  • Related