Can I use groupby and elegantly check whether a condition is true (e.g. the group contains a given value) in a single expression?
Currently, with this DataFrame:
import pandas as pd
df = pd.DataFrame([['1', 'A'], ['1', 'A'], ['1', 'B'], ['2', 'B']], columns=['col_1', 'col_2'])
I use apply() to get what I want:
df.groupby('col_1').col_2.apply(lambda s: 'A' in s.values) # or other conditions
But this is neither efficient nor elegant. Other ideas?
Answer:
First compare the values of col_2 with 'A' to get a boolean mask, then aggregate it per group with GroupBy.any:
s = df.col_2.eq('A').groupby(df['col_1']).any()
print(s)
col_1
1     True
2    False
Name: col_2, dtype: bool
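The question mentions "other conditions", so here is a minimal sketch of how the same mask-then-aggregate pattern might be extended. It assumes the example DataFrame from the question. The variable names (has_a_or_c, only_b, via_apply, via_mask) are illustrative only and do not come from the original post.

```python
import pandas as pd

df = pd.DataFrame([['1', 'A'], ['1', 'A'], ['1', 'B'], ['2', 'B']],
                  columns=['col_1', 'col_2'])

# Any boolean mask can be grouped by the original key column.
# Here: does the group contain at least one of several values?
has_a_or_c = df['col_2'].isin(['A', 'C']).groupby(df['col_1']).any()

# all() instead of any(): does every row in the group satisfy the condition?
only_b = df['col_2'].eq('B').groupby(df['col_1']).all()

# The mask-based result should agree with the apply() version from the question.
via_apply = df.groupby('col_1')['col_2'].apply(lambda s: 'A' in s.values)
via_mask = df['col_2'].eq('A').groupby(df['col_1']).any()
print(via_apply.equals(via_mask))
```

Building the mask once on the whole column keeps the comparison vectorized, which is the likely reason this tends to be faster than calling a Python lambda per group.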
Or create a helper column new:
s = df.assign(new=df.col_2.eq('A')).groupby('col_1')['new'].any()
print(s)
col_1
1     True
2    False
Name: new, dtype: bool
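If several conditions need checking at once, the helper-column idea could be extended as in the sketch below. This is one possible variation, not part of the original answer. The column names has_a and has_b are hypothetical.

```python
import pandas as pd

df = pd.DataFrame([['1', 'A'], ['1', 'A'], ['1', 'B'], ['2', 'B']],
                  columns=['col_1', 'col_2'])

# One helper column per condition, then a single grouped any() over all of them.
flags = (
    df.assign(has_a=df['col_2'].eq('A'), has_b=df['col_2'].eq('B'))
      .groupby('col_1')[['has_a', 'has_b']]
      .any()
)
print(flags)
```

The result is a DataFrame indexed by col_1 with one boolean column per condition.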