import pandas as pd
df = {'fish': [38,10,23,45],'eggs': [24.0,10,12,8],'fruit': [80.5,60,12,32],'sugar': [0.234,0.8,0.34,0.76],
'category':['Price A', 'Price B','Price B','Price A']}
df = pd.DataFrame(data=df)
xy=['fish',
'eggs',
'fruit',
'sugar',
'category']
df=(df[xy].groupby(['category']).describe()).T
df['Ratio'] = df['Price A']/df['Price B']
df
I am trying to extract the category with only the mean with a ratio smaller than 0.5 and bigger than 2.
I tried to use
df.reset_index()
and making two separate df with the respective conditions, then turning the unwanted values to null, removing the null values(but I don't know how to specifically only choose mean's ratio), and then concatenating both of the df.
Maybe there are some better ways to do this?
CodePudding user response:
Use DataFrame.loc
for select groups by categories:
m = df['Ratio'].xs('mean', level=1).between(0.5,2)
out = df.loc[m.index[~m]]
print(out)
category Price A Price B Ratio
fish count 2.000000 2.000000 1.000000
mean 41.500000 16.500000 2.515152
std 4.949747 9.192388 0.538462
min 38.000000 10.000000 3.800000
25% 39.750000 13.250000 3.000000
50% 41.500000 16.500000 2.515152
75% 43.250000 19.750000 2.189873
max 45.000000 23.000000 1.956522
Explanation:
Select levels mean
by Series.xs
:
print(df['Ratio'].xs('mean', level=1))
fish 2.515152
eggs 1.454545
fruit 1.562500
sugar 0.871930
Name: Ratio, dtype: float64
Create mask between 0.5
and 2
by Series.between
:
print(df['Ratio'].xs('mean', level=1).between(0.5,2))
fish False
eggs True
fruit True
sugar True
Name: Ratio, dtype: bool
Get indices if False
s - invert mask and filter m.index
:
print (m.index[~m])
Index(['fish'], dtype='object')