Pandas dataframe - Groupby and drop groups based on multiple conditions in df


I have a dataframe as seen below (w/ more columns but these are the only relevant columns)

order_id    product_id    purchase_value
1234        23546.0       50
1234        23546.0       20
5678        43244.0       25
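
For reference, a minimal sketch that rebuilds the sample above (assuming order_id and purchase_value are plain integers and the trailing dots in the listing are just formatting artifacts):

import pandas as pd

# sample data from the question
df = pd.DataFrame({
    'order_id': [1234, 1234, 5678],
    'product_id': [23546.0, 23546.0, 43244.0],
    'purchase_value': [50, 20, 25],
})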

I am trying to group by order_id but keep only the orders where the purchase value for a specific product_id equals a set amount.

Something like this: groupby[order_id] where [product_id] = 23546, and [purchase_value] = 50

I've tried

df = df[df['product_id'].eq(23546).groupby(df['order_id']).transform('any')]

This works to filter on one column, but I can't figure out how to apply it to multiple columns.

CodePudding user response:

In your solution, chain the conditions into one mask with & (bitwise AND), then use GroupBy.transform('any') to keep every group that has at least one matching row:

mask = df['product_id'].eq(23546) & df['purchase_value'].eq(50)
df1 = df[mask.groupby(df['order_id']).transform('any')]

Or get the order_id values of all matching rows and then filter the original order_id column with Series.isin:

mask = df['product_id'].eq(23546) & df['purchase_value'].eq(50)
groups = df.loc[mask, 'order_id']
df1 = df[df['order_id'].isin(groups)]
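
On the sample frame from the question, either variant keeps both rows of order 1234 (it contains a row with product_id 23546 and purchase_value 50) and drops order 5678. A quick check, assuming the df built earlier:

print(df1)
#    order_id  product_id  purchase_value
# 0      1234     23546.0              50
# 1      1234     23546.0              20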

CodePudding user response:

You can apply multiple filters to your data frame and then use groupby in the same line of code:

df1 = df[((df.product_id==23546) & (df.purchase_value==50)).groupby(df.order_id).transform('any')]
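
The combined condition is already a boolean Series, so grouping it by order_id and calling transform('any') broadcasts the per-order result back to every row; whole orders are then kept or dropped together rather than individual rows.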