I have a problem with my data. I want to check whether the value in column A appears in column B, which contains several values separated by commas. If the value exists, column C should be filled with True; otherwise it should be filled with False.
A sample table looks like this:
Column_A | Column_B | Column_C |
---|---|---|
A | A,B,C,AA,BB,CC | True |
B | A,AA,BB,CC | False |
C | A,B,C | True |
I already tried something like this: .apply(lambda x: x.Column_A in x.Column_B, axis=1)
but it marks the second row as True because it finds B inside BB. Basically, my script doesn't treat the comma as a separator between different values.
Any solution for my problem?
CodePudding user response:
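Use split with the same separator that appears in your data (a bare comma in the sample, so ',' rather than ', '):
df['Column_C'] = df.apply(lambda x: x.Column_A in x.Column_B.split(','), axis=1)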
If performance matters, use a list comprehension instead:
df['Column_C'] = [a in b.split(',') for a, b in zip(df.Column_A, df.Column_B)]
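To see the difference on the sample data from the question, here is a minimal, self-contained sketch. The DataFrame is rebuilt by hand from the table above, and the name `substring_match` is only illustrative; it is not part of the original code.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Column_A": ["A", "B", "C"],
    "Column_B": ["A,B,C,AA,BB,CC", "A,AA,BB,CC", "A,B,C"],
})

# Substring test on the raw string: "B" is found inside "BB".
substring_match = [a in b for a, b in zip(df.Column_A, df.Column_B)]

# Token test: split on the comma first, then check list membership.
df["Column_C"] = [a in b.split(",") for a, b in zip(df.Column_A, df.Column_B)]

print(substring_match)          # [True, True, True]
print(df["Column_C"].tolist())  # [True, False, True]
```

The second row only comes out as False once the string is split into separate values.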
CodePudding user response:
df['Column_C'] = df.apply(lambda x: x.Column_A in x.Column_B.split(','), axis=1)
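If the real data sometimes has spaces around the commas, a plain split(',') leaves them attached to the values. A possible sketch that strips them first is shown below; `contains_token` is a hypothetical helper name, and the assumption that surrounding whitespace should be ignored is mine, not something stated in the question.

```python
def contains_token(value, csv_text):
    """Return True if value is one of the comma-separated items in csv_text."""
    # Split on commas and drop whitespace around each item (assumed harmless).
    tokens = [token.strip() for token in csv_text.split(",")]
    return value.strip() in tokens


print(contains_token("B", "A, AA ,BB,CC"))  # False
print(contains_token("B", "A, B ,C"))       # True
```

It can be applied the same way as above, for example df['Column_C'] = [contains_token(a, b) for a, b in zip(df.Column_A, df.Column_B)].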