I want to filter all rows by a condition on Type: the Type values should
contain, or be a subset of, 'NCO - ETD', grouped by Date and Id.
I wrote this code:
cond = 'NCO - ETD'
mask = data.groupby(['Date','Id'])['Type'].agg(set).apply(lambda x: any(x.issubset(cond)))
but it raises TypeError: 'bool' object is not iterable
CodePudding user response:
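If you need a subset test, wrap cond in a list and remove the apply with any (issubset already returns a single boolean, so there is nothing for any to iterate over):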
mask = data.groupby(['Date','Id'])['Type'].agg(lambda x: set(x).issubset([cond]))
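To make the error and the fix concrete, here is a minimal sketch. The sample frame is invented for illustration and is not the asker's data; it only assumes the Date, Id and Type columns from the question.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
data = pd.DataFrame({
    'Date': ['2022-01-01', '2022-01-01', '2022-01-02', '2022-01-02'],
    'Id':   [1, 1, 2, 2],
    'Type': ['NCO - ETD', 'NCO - ETD', 'NCO - ETD', 'OTHER'],
})
cond = 'NCO - ETD'

# issubset() returns a single bool, so wrapping it in any() fails
try:
    any({'NCO - ETD'}.issubset([cond]))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# One boolean per (Date, Id) group:
# True only if every Type value in the group equals cond
mask = data.groupby(['Date', 'Id'])['Type'].agg(lambda x: set(x).issubset([cond]))
```

Note that this mask is indexed by the (Date, Id) groups, not by the original rows, so it has one entry per group.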
Or, if you need to test for a substring, create a helper column and then test whether at least one value is True per group with any:
cond = 'NCO - ETD'
mask = (data.assign(new=data['Type'].str.contains(cond))
            .groupby(['Date', 'Id'])['new']
            .transform('any'))
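As a rough usage sketch, the row-aligned mask from transform can be used directly for boolean indexing. The sample frame is again made up, and passing regex=False to treat cond as a literal string is my assumption, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
data = pd.DataFrame({
    'Date': ['2022-01-01', '2022-01-01', '2022-01-02', '2022-01-03'],
    'Id':   [1, 1, 2, 3],
    'Type': ['NCO - ETD (late)', 'OTHER', 'OTHER', 'NCO - ETD'],
})
cond = 'NCO - ETD'

# Helper column: does this row's Type contain cond as a literal substring?
# transform('any') then broadcasts the per-group result back to every row.
mask = (data.assign(new=data['Type'].str.contains(cond, regex=False))
            .groupby(['Date', 'Id'])['new']
            .transform('any'))

# Keep every row of each group that has at least one matching Type
filtered = data[mask]
```

Because transform returns a Series with the same index as data, no merge back onto the original frame is needed.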