Check if any word in DataFrame column A exists in column B

Time:04-04

Is there an elegant way to check whether a word in column A is found in column B? Desired outcome:

ID | A    | B      | C
---------------------------
1  | A b  | a b    | True
2  | bc c | bC acc | True
3  | c    | ba     | False

I'm currently splitting each string into a list of words with split(), converting everything to upper case, and running any().
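For reference, the approach just described might look roughly like the sketch below. It assumes a DataFrame named df whose columns A and B hold strings, as in the table; the helper name shares_word is hypothetical and only used for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({"ID": [1, 2, 3],
                   "A": ["A b", "bc c", "c"],
                   "B": ["a b", "bC acc", "ba"]})

def shares_word(a, b):
    # Hypothetical helper: True if any upper-cased word of `a`
    # is also one of the upper-cased words of `b`.
    words_b = b.upper().split()
    return any(word in words_b for word in a.upper().split())

# Compare the two columns row by row.
df["C"] = [shares_word(a, b) for a, b in zip(df["A"], df["B"])]
print(df)
```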

CodePudding user response:

If you need to test the space-separated words in pairs (the first word of A against the first word of B, the second word against the second word, and so on), use a nested list comprehension:

df['test'] = [any(x in y for x, y in zip(a.lower().split(), b.lower().split()))
              for a, b in zip(df.A, df.B)]
print(df)
   ID     A       B      C   test
0   1   A b     a b   True   True
1   2  bc c  bC acc   True   True
2   3     c      ba  False  False

If you need to check each word of column A against every word of column B, use:

df['test'] = [any(any(x in y for x in a.lower().split()) for y in b.lower().split())
              for a, b in zip(df.A, df.B)]
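One thing to keep in mind: `x in y` on two strings is a substring test, so a word such as "c" also matches inside "acc". If only whole-word matches should count, a set-based comparison may be closer to what is wanted. The sketch below is an assumption about that variant rather than part of the original answer; the two-row sample data and the column names substring and whole_word are made up for illustration.

```python
import pandas as pd

# Small made-up sample: the second row only matches as a substring.
df = pd.DataFrame({"A": ["A b", "c"], "B": ["a b", "acc"]})

# Substring test, as in the comprehensions above: "c" is found inside "acc".
df["substring"] = [any(x in y
                       for x in a.lower().split()
                       for y in b.lower().split())
                   for a, b in zip(df.A, df.B)]

# Whole-word test: the lower-cased word sets must share at least one element.
df["whole_word"] = [not set(a.lower().split()).isdisjoint(b.lower().split())
                    for a, b in zip(df.A, df.B)]
print(df)
```

On the three rows from the question both tests give the same result, so the choice only matters when one word can appear inside another.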