I have a dataframe that I need to filter using values obtained by grouping it.
For example, in the dataset below, grouping gives two True values and one False value. I then need to filter the original dataset so that only the Mercedes and Audi rows (the True groups) are selected. Of course, I could specify the rows I need by hand, but another dataset has many more values, so that would be tedious.
import pandas as pd
dff = pd.read_csv('https://raw.githubusercontent.com/codebasics/py/master/ML/5_one_hot_encoding/Exercise/carprices.csv')
dff.groupby('Car Model').sum()['Age(yrs)']>20
CodePudding user response:
The expression dff.groupby('Car Model').sum()['Age(yrs)']>20
gives you a boolean Series indexed by 'Car Model', which you can use as a mask on the grouped dataframe. Example of usage:
>>> dff_grouped = dff.groupby('Car Model').sum()
>>> dff_grouped[dff_grouped['Age(yrs)']>20]
                       Mileage  Sell Price($)  Age(yrs)
Car Model
Audi A5                 274000          92700        24
Mercedez Benz C class   288000          96000        25
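Note that the result above is the grouped (summed) frame. If you want the matching rows of the original dff instead, as the question asks, one option is to broadcast each group's total back onto its rows with transform, or to take the group labels that passed the test and use isin. A minimal sketch, assuming a small made-up frame with the same column names as carprices.csv (the values here are hypothetical, not taken from the real file):

```python
import pandas as pd

# Hypothetical stand-in for carprices.csv: same column names, made-up values.
dff = pd.DataFrame({
    "Car Model": ["BMW X5", "BMW X5",
                  "Audi A5", "Audi A5",
                  "Mercedez Benz C class", "Mercedez Benz C class"],
    "Mileage": [69000, 35000, 57000, 22500, 46000, 59000],
    "Sell Price($)": [18000, 34000, 26100, 40000, 31500, 29400],
    "Age(yrs)": [6, 3, 12, 13, 11, 15],
})

# Option 1: transform("sum") returns one value per original row
# (the total of that row's group), so the mask lines up with dff itself.
age_total = dff.groupby("Car Model")["Age(yrs)"].transform("sum")
filtered = dff[age_total > 20]

# Option 2: reuse the boolean Series from the question, keep the labels
# that are True, and select the original rows whose model is among them.
mask = dff.groupby("Car Model")["Age(yrs)"].sum() > 20
keep = mask[mask].index
filtered_isin = dff[dff["Car Model"].isin(keep)]
```

Both variants keep every original row of the groups whose summed age exceeds 20, so nothing has to be listed by hand. The transform version avoids the intermediate list of labels, while the isin version is handy if you already have the boolean Series.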