I have the following dataframe
I would like to group by id and add a flag column that contains Y if Y has ever occurred for that id. The resulting dataframe would look like this
Here is my approach, which is too slow, and I am not sure it is correct:
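import numpy as np
import pandas as pd

j = 'flag'
parts = []
for i in df['id'].unique():
    test = df[df['id'] == i].copy()  # copy so the assignment does not hit a view of df
    # if any row for this id is 'Y', set the whole group to 'Y'
    test[j] = np.where((test[j] == 'Y').any(), 'Y', test[j])
    parts.append(test)
temp = pd.concat(parts)  # DataFrame.append was removed in pandas 2.0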
CodePudding user response:
Compare flag to 'Y', group by id, and use any:
new_df = (df['flag'] == 'Y').groupby(df['id']).any().map({True:'Y', False:'N'}).reset_index()
Output:
>>> new_df
   id flag
0   1    Y
1   2    Y
2   3    N
3   4    N
4   5    Y
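If the goal is to keep every original row instead of collapsing to one row per id, the same comparison can be broadcast back onto the rows with transform. This is only a minimal sketch: the question's dataframe is not shown, so the sample values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the asker's real dataframe is not shown.
df = pd.DataFrame({
    'id':   [1, 1, 2, 2, 3, 4, 5, 5],
    'flag': ['N', 'Y', 'Y', 'Y', 'N', 'N', 'Y', 'N'],
})

# True on every row whose id has at least one 'Y'
has_y = (df['flag'] == 'Y').groupby(df['id']).transform('any')

# Overwrite flag in place, keeping all original rows
df['flag'] = np.where(has_y, 'Y', 'N')
```

This avoids the Python-level loop over unique ids, which is usually what makes the original approach slow.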
CodePudding user response:
You can use groupby with max, since 'Y' > 'N' when strings are compared:
df.groupby('id', as_index=False)['flag'].max()
   id flag
0   1    Y
1   2    Y
2   3    N
3   4    N
4   5    Y
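Note that the max trick relies on flag holding only the strings 'Y' and 'N'. As a quick sanity check, the sketch below compares it against an explicit any-based version on made-up data (the sample values are hypothetical, not the asker's dataframe):

```python
import pandas as pd

# Hypothetical sample data; the asker's real dataframe is not shown.
df = pd.DataFrame({
    'id':   [1, 1, 2, 3, 3, 4, 5, 5],
    'flag': ['N', 'Y', 'Y', 'N', 'N', 'N', 'Y', 'N'],
})

# Approach 1: string maximum per id ('Y' sorts after 'N')
by_max = df.groupby('id', as_index=False)['flag'].max()

# Approach 2: explicit "any row equals 'Y'" per id
by_any = (
    (df['flag'] == 'Y')
    .groupby(df['id'])
    .any()
    .map({True: 'Y', False: 'N'})
    .reset_index()
)

# Both approaches should give the same flag for each id
same = by_max['flag'].tolist() == by_any['flag'].tolist()
```

If flag could ever contain other values, the any-based version states the intent more directly.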