I need to create new column with True / False values. If condition is met based on other column, than True should be in every row associate with acc_no.
Original df:
trans_id acc_no trans_amount
1 123 15
2 456 10
3 123 50
4 456 60
5 789 35
6 123 20
7 789 18
8 456 73
9 789 37
output should be True for 123 acc_no only:
trans_id acc_no trans_amount ANY_50$_TRANSACTION ?
1 123 15 True
2 456 10 False
3 123 50 True
4 456 60 False
5 789 35 False
6 123 20 True
7 789 18 False
8 456 73 False
9 789 37 False
My code:
any_50$_transaction = []
for i in df["acc_no"].unique():
for ind in df[df["acc_no"] == i].index:
any_50$_transaction.append(np.where(df['trans_amount'][ind] == 50, True, False).any())
df["ANY_50$_TRANSACTION ?"] = any_50$_transaction
df
So far my output is:
trans_id acc_no trans_amount ANY_50$_TRANSACTION ?
1 123 15 False
2 456 10 False
3 123 50 True
4 456 60 False
5 789 35 False
6 123 20 False
7 789 18 False
8 456 73 False
9 789 37 False
CodePudding user response:
Use Series.isin
with filtered acc_no
by condition:
df['ANY_50$_TRANSACTION'] = df['acc_no'].isin(df.loc[df['trans_amount'].eq(50), 'acc_no'])
Or GroupBy.transform
with GroupBy.any
:
df['ANY_50$_TRANSACTION'] = df['trans_amount'].eq(50).groupby(df['acc_no']).transform('any')
print (df)
trans_id acc_no trans_amount ANY_50$_TRANSACTION
0 1 123 15 True
1 2 456 10 False
2 3 123 50 True
3 4 456 60 False
4 5 789 35 False
5 6 123 20 True
6 7 789 18 False
7 8 456 73 False
8 9 789 37 False