Check whether any word from string2 appears in string1

Time:08-19

I have two string columns in a dataframe. I need to check whether any word from string2 is included in string1.

I am using this split-based expression, and it does not work:

 ((df['string1'].split()).eq(df['string2'].split())).any()
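A likely reason the expression above fails, sketched on made-up sample data: a pandas Series has no `.split()` method of its own, so the call raises `AttributeError`. The per-row version lives on the `.str` accessor. Even then, `.eq` would compare each row's value as a whole rather than test word membership, so it would probably not give the wanted result either.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({'string1': ['HELLO WORLD, I AM TESTING', 'GOOD MORNING'],
                   'string2': ['HELLO TESTING', 'BAD NIGHT']})

# A Series has no .split() method, so the original expression raises.
try:
    df['string1'].split()
    raised = False
except AttributeError:
    raised = True

# The .str accessor applies split() to every row and yields one list per row.
words1 = df['string1'].str.split()
words2 = df['string2'].str.split()
```

Each element of `words1` and `words2` is now a list of words, which still has to be checked row by row for a shared word.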

The code should return True or False for each row.

I tried a simple method on plain strings, and it works:

TXT = 'HELLO WORLD, I AM TESTING'
TXT2 = 'HELLO TESTING'
X = TXT.split()
Y = TXT2.split()
any(i in X for i in Y)

--> Python returns True

I don't know how to do this in a dataframe and write an additional column for the result.

CodePudding user response:

If your dataframe isn't too big, you can try df.iterrows():

import pandas as pd

df = pd.DataFrame({'string1': ['Spam Ham Egg', 'Spam Bacon Spam'],
                   'string2': ['Beans Bacon Toast', 'Ham Egg Spam']})
result_list = []
# iterrows() yields (index, row) pairs; access the columns by label
for _, row in df.iterrows():
    string1, string2 = row['string1'].split(), row['string2'].split()
    # True if any word of string1 also occurs in string2
    result_list.append(any(element in string2 for element in string1))
df['string_contained'] = result_list
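If the dataframe is larger, one possible alternative to iterrows() is to zip the two columns and build the result with a list comprehension. This is only a sketch on the same sample data; the helper name `has_common_word` is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'string1': ['Spam Ham Egg', 'Spam Bacon Spam'],
                   'string2': ['Beans Bacon Toast', 'Ham Egg Spam']})

def has_common_word(s1, s2):
    """Return True if any whitespace-separated word of s1 occurs in s2."""
    words2 = set(s2.split())  # build the lookup set once per row
    return any(word in words2 for word in s1.split())

# zip() walks both columns in parallel without creating a Series per row.
df['string_contained'] = [has_common_word(s1, s2)
                          for s1, s2 in zip(df['string1'], df['string2'])]
```

On this sample the new column should hold False for the first row and True for the second, the same as the loop above.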

CodePudding user response:

df['result'] = df.apply(lambda row: bool(set(row['string1'].split()) 
                                         & set(row['string2'].split())),
                        axis=1)

Example:

df = pd.DataFrame({'string1': ['a b c', 'd e f', 'd e f'], 
                   'string2': ['aa mn', 'a b c e', 'a b c ee']})

Result:

  string1   string2  result
0   a b c     aa mn   False
1   d e f   a b c e    True
2   d e f  a b c ee   False
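Note that split() only cuts on whitespace, so the match is on whole tokens and is case-sensitive: 'e' does not match 'ee', and a token like 'WORLD,' keeps its comma. If punctuation and case should be ignored (an assumption, the question does not ask for it), a sketch along these lines may help. The helper names `tokens` and `shares_word` are hypothetical.

```python
import re
import pandas as pd

def tokens(text):
    """Lower-case the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def shares_word(a, b):
    """Return True if a and b have at least one normalized word in common."""
    return not tokens(a).isdisjoint(tokens(b))

# Made-up sample data for illustration only.
df = pd.DataFrame({'string1': ['HELLO WORLD, I AM TESTING', 'd e f'],
                   'string2': ['world', 'a b c ee']})

# Plain split(): 'WORLD,' and 'world' are different tokens.
df['plain'] = df.apply(lambda row: bool(set(row['string1'].split())
                                        & set(row['string2'].split())),
                       axis=1)

# Normalized comparison: punctuation and case are ignored.
df['result'] = df.apply(lambda row: shares_word(row['string1'], row['string2']),
                        axis=1)
```

Here the first row only matches after normalization, while 'e' and 'ee' stay distinct in both variants.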