I have a pandas column like this where amount is a string column:
id amount possible_amount
0 1.00 ['1.00', '2.00', '3.00']
1 45.00 ['100.00', '45.00']
2 37.00 ['29.00', '38.00']
I want to create a new column called 'match' whose value will be True
if amount
is in the possible_amount
list and False
otherwise. So expected results for example above is:
id amount possible_amount match
0 1.00 ['1.00', '2.00', '3.00'] True
1 45.00 ['100.00', '45.00'] True
2 37.00 ['29.00', '38.00'] False
I've tried couple different ways, below being one of them. Also tried using str.contains()
to no avail.
df['match'] = np.where(df['amount'].isin(df['possible_amount']), True, False)
But this only returns all False in match
.
CodePudding user response:
Convert values to floats and compare in list comprehension:
df['match'] = [a in list(map(float, b)) for a, b in zip(df['amount'],df['possible_amount'])]
print (df)
id amount possible_amount match
0 0 1.0 [1.00, 2.00, 3.00] True
1 1 45.0 [100.00, 45.00] True
2 2 37.0 [29.00, 38.00] False
Another idea, obviously slowier:
df['match'] = (df.explode('possible_amount')
.assign(possible_amount = lambda x: x['possible_amount'].astype(float),
new = lambda x: x['possible_amount'].eq(x['amount']))
.groupby(level=0)['new']
.any()
)
print (df)
id amount possible_amount match
0 0 1.0 [1.00, 2.00, 3.00] True
1 1 45.0 [100.00, 45.00] True
2 2 37.0 [29.00, 38.00] False