.str.contains returning actual found value instead of True or False

Time:12-20

I am using str.contains in my dataframe to see if a certain value is inside the values of a Series.

Instead of getting True or False, I want the output to show the actual substring that matched, i.e. the value I pass to contains.

A     B
1   Fer
2   Ger
3   Tir    

My expected output:

A     B    C
1   Fer   er
2   Ger   er
3   Tir  NaN

Is there a built-in way to do this with pandas?

CodePudding user response:

Series.str.extract is perfect for this:

df['C'] = df['B'].str.extract('(er)')

Output:

>>> df
   A    B    C
0  1  Fer   er
1  2  Ger   er
2  3  Tir  NaN

The parentheses in (er) are important: they define a capture group. For each row, the text matched by the capture group is copied into the output column (only the first match in each string is used); rows where the pattern does not match get NaN. By default, .str.extract returns a DataFrame with one column per capture group, so (er)(abc)(def) would return a DataFrame with 3 columns.
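As a minimal, self-contained sketch of the behavior described above: the DataFrame is rebuilt from the question's sample data, and the two-group pattern `([A-Z])(er)` and the name `parts` are made up purely for illustration. Passing `expand=False` is optional; it makes a single-group pattern return a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"A": [1, 2, 3], "B": ["Fer", "Ger", "Tir"]})

# One capture group: expand=False returns a Series that can be assigned directly
df["C"] = df["B"].str.extract(r"(er)", expand=False)

# Two capture groups (illustrative pattern): the result is a DataFrame
# with one column per group, labelled 0 and 1
parts = df["B"].str.extract(r"([A-Z])(er)")

print(df)
print(parts)
```

Rows where the whole pattern fails to match, such as "Tir", get NaN in every extracted column.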
