How to filter dataframe by grouped value?

Time:04-04

I have a dataframe, and I need to filter its rows by values that I computed by grouping it.

For example, in the dataset below, grouping gives two True values and one False value. I then need to filter the original dataset so that only the Mercedes and Audi rows (the groups that are True) are selected. Of course, I could list the rows I need by hand, but another dataset of mine has many more values, so that would be tedious.

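import pandas as pd

dff = pd.read_csv('https://raw.githubusercontent.com/codebasics/py/master/ML/5_one_hot_encoding/Exercise/carprices.csv')

# Total age per car model compared against 20: a boolean Series indexed by 'Car Model'
dff.groupby('Car Model')['Age(yrs)'].sum() > 20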

CodePudding user response:

The expression dff.groupby('Car Model').sum()['Age(yrs)']>20 returns a boolean Series indexed by car model. Use it as a mask on the grouped dataframe to keep only the groups whose total age exceeds 20. Example of usage:

>>> dff_grouped = dff.groupby('Car Model').sum()
>>> dff_grouped[dff_grouped['Age(yrs)']>20]
                       Mileage  Sell Price($)  Age(yrs)
Car Model
Audi A5                 274000          92700        24
Mercedez Benz C class   288000          96000        25
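
The result above contains one aggregated row per model. If the goal is to keep the original, ungrouped rows of the matching models, as the question asks, one possible approach is sketched below. It is a minimal sketch, not the only way to do it: the sample rows are made up for illustration (only the column names come from the question), and the variable names total_age, filtered, filtered_alt, selected and filtered_isin are hypothetical.

```python
import pandas as pd

# Made-up sample rows for illustration; only the column names match the question.
dff = pd.DataFrame({
    "Car Model": [
        "BMW X5", "BMW X5",
        "Audi A5", "Audi A5",
        "Mercedez Benz C class", "Mercedez Benz C class",
    ],
    "Age(yrs)": [6, 3, 12, 10, 15, 9],
})

# Option 1: transform broadcasts each group's total back onto its own rows,
# so the comparison yields a mask aligned with the original dataframe.
total_age = dff.groupby("Car Model")["Age(yrs)"].transform("sum")
filtered = dff[total_age > 20]

# Option 2: groupby(...).filter keeps every row of the groups for which
# the function returns True.
filtered_alt = dff.groupby("Car Model").filter(lambda g: g["Age(yrs)"].sum() > 20)

# Option 3: reuse the boolean Series from the question, take the model names
# where it is True, and select the matching rows with isin.
mask = dff.groupby("Car Model")["Age(yrs)"].sum() > 20
selected = mask[mask].index
filtered_isin = dff[dff["Car Model"].isin(selected)]

print(filtered)
```

All three options select the same rows here. The transform variant is usually the most direct when the condition is a simple aggregate, while filter accepts an arbitrary per-group function.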