Check if word contains substrings

Time:10-05

Question

Consider the following:

import pandas as pd

word = 'analphabetic'
df = pd.DataFrame({'substring': list('abcdefgh') + ['ab', 'phobic']})

substring is not necessarily a single letter!

I want to add a column named after word, where each row shows True or False depending on whether the substring in that row occurs in word. Can I do this with a built-in pandas method?

Desired output:

  substring  analphabetic
0         a          True
1         b          True
2         c          True
3         d         False
4         e          True
5         f         False
6         g         False
7         h          True
8        ab          True
9    phobic         False

pandas.Series.str.contains

The reverse check, whether each substring contains word, can be done with something like df.substring.str.contains(word). For my case, I guess you could do something like:

df[word] = [i in word for i in df.substring]

But by the same logic, what the built-in str.contains() does could also be written as:

string = 'a'
df = pd.DataFrame({'words': ['these', 'are', 'some', 'random', 'words']})
df[string] = [string in i for i in df.words]

So I suspect there is also a built-in method that does my trick.
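
To make the direction of the two checks concrete, here is a minimal sketch using the word and df from the question. The variable names cell_contains_word and word_contains_cell are made up for illustration, and regex=False is added so that the pattern is treated as a literal string rather than a regular expression.

```python
import pandas as pd

word = 'analphabetic'
df = pd.DataFrame({'substring': list('abcdefgh') + ['ab', 'phobic']})

# str.contains asks: does each cell contain `word`?
cell_contains_word = df['substring'].str.contains(word, regex=False)

# The question asks the opposite: does `word` contain each cell?
word_contains_cell = pd.Series(
    [s in word for s in df['substring']], index=df.index
)

print(cell_contains_word.tolist())
print(word_contains_cell.tolist())
```

The first series answers a different question from the one being asked, which is why str.contains cannot simply be pointed at word here.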

CodePudding user response:

A possible solution (which should work for substrings longer than a single letter):

df['analphabetic'] = df['substring'].map(lambda x: x in word)

Output:

  substring  analphabetic
0         a          True
1         b          True
2         c          True
3         d         False
4         e          True
5         f         False
6         g         False
7         h          True

Using list comprehension:

df['analphabetic'] = [x in word for x in df.substring]

Using apply:

df['analphabetic'] = df['substring'].apply(lambda x: x in word)
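
A quick way to check that the three variants are interchangeable is to run them side by side on the full example frame, including the multi-character substrings. This is only a sketch built from the question's data; the column names via_map, via_listcomp and via_apply are made up for the comparison.

```python
import pandas as pd

word = 'analphabetic'
df = pd.DataFrame({'substring': list('abcdefgh') + ['ab', 'phobic']})

# The three approaches from this answer, each written to its own column.
df['via_map'] = df['substring'].map(lambda x: x in word)
df['via_listcomp'] = [x in word for x in df['substring']]
df['via_apply'] = df['substring'].apply(lambda x: x in word)

# Compare the columns pairwise; `same` is True only if all three agree.
same = (
    df['via_map'].equals(df['via_listcomp'])
    and df['via_map'].equals(df['via_apply'])
)
print(same)
print(df)
```

All three run the Python `in` operator once per row, so the choice between them is mostly a matter of style.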