How can I classify a string with a partial string and make a boolean column

Time:11-22

Say I have a first DataFrame with the following strings:

a
abcd
dabcd
qwerty
oppoupou

Then I have a second DataFrame with the following substrings:

column
abc
qw
qaz

I've been looking for code that checks each row of the first DataFrame against all the elements in the second DataFrame and returns true or false. For example, the first element, abcd, contains abc, so it is true. The second element, dabcd, is also true because it contains abc. The third element, qwerty, is true because it contains qw. The last element, oppoupou, contains none of the substrings, so it is false.

The result would be a new column in the first DataFrame containing: true, true, true, false

I found this code, but it only handles a single substring, not a whole DataFrame of them:

df["b"] = df["a"].str.contains("abc")

Any suggestions for checking one DataFrame of strings against another to get a boolean column?

CodePudding user response:

You need to join the values of the column named column in the second DataFrame with | to form a regex OR pattern:

df["b"] = df["a"].str.contains('|'.join(df2['column']))
print(df)
          a      b
0      abcd   True
1     dabcd   True
2    qwerty   True
3  oppoupou  False
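
For completeness, here is a self-contained sketch of the same approach that can be run as-is. The sample frames are rebuilt from the question, and the names df, df2 and pattern are only illustrative. One addition that is not in the answer above: each substring is passed through re.escape, on the assumption that the substrings should be matched literally even if they contain regex metacharacters such as . or +.

```python
import re

import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({"a": ["abcd", "dabcd", "qwerty", "oppoupou"]})
df2 = pd.DataFrame({"column": ["abc", "qw", "qaz"]})

# Escape each substring so it is matched literally (an assumption about
# the intent), then join with "|" so the regex means "abc OR qw OR qaz".
pattern = "|".join(re.escape(s) for s in df2["column"])

# True where the string in column "a" contains at least one substring.
df["b"] = df["a"].str.contains(pattern, regex=True)
print(df)
```

Note that if df2["column"] is empty, the joined pattern is an empty string, which matches every row, so that case may need a separate check.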