I'm working with a DataFrame that contains columns like ['Product Name', 'Sales', 'ID', ...]. I'm grouping by 'Product Name' and then computing the population standard deviation of the 'Sales' column so that I can filter the records using that value.
I tried this solution:
groupedDF.filter(lambda x: x['Sales'].agg(np.std, ddof=0) != 0)
But it returns this error:
TypeError: filter function returned a Series, but expected a scalar bool
CodePudding user response:
import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 2, 2, 3, 3],
                   'YEAR': [2011, 2012, 2012, 2013, 2013, 2014],
                   'V': [0, 1, 1, 0, 1, 0],
                   'C': [0, 11, 22, 33, 44, 55]})
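# With the extra ddof argument, Series.agg can apply np.std to each value
# separately (this depends on the pandas version), giving one result per row
print(df.groupby('ID').apply(lambda x: x['C'].agg(np.std, ddof=0) != 0))
# Calling np.std on the whole column returns a single std for every group
print(df.groupby('ID').apply(lambda x: np.std(x['C'], ddof=0) != 0))
# This is probably what you want: keep only the groups whose std is non-zero
print(df.groupby('ID').filter(lambda x: np.std(x['C'], ddof=0) != 0))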
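Applied to the column names from the question, the same idea might look like the sketch below. The sample data is hypothetical (the question does not show any rows), and it uses Series.std(ddof=0) in place of np.std, which should give the same population standard deviation as a single scalar per group. The transform variant is an optional alternative, not something the question asked for.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'Product Name': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Sales': [10, 10, 5, 7, 3, 3],
    'ID': [1, 2, 3, 4, 5, 6],
})

grouped = df.groupby('Product Name')

# Series.std(ddof=0) reduces each group to one scalar, so the lambda
# returns a single bool per group, which is what filter expects.
varying = grouped.filter(lambda g: g['Sales'].std(ddof=0) != 0)

# Alternative: broadcast each group's std back onto its rows and
# select with an ordinary boolean mask.
group_std = grouped['Sales'].transform(lambda s: s.std(ddof=0))
varying_mask = df[group_std != 0]

print(varying)
```

With this sample data only the rows for product 'B' remain, because 'A' and 'C' have identical sales values within each group and therefore a standard deviation of 0.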