Multiple substring check in Python dataframe

Time:04-04

I want to search for multiple substrings in a particular column of a pandas DataFrame. The condition is that both substrings must be present in the same cell, not just one of them,

i.e. it should return a new DataFrame containing only the rows whose 'Abstract' value contains both 'ecosystem' and 'service'.

df[df['Abstract'].str.contains('ecosystem', na=False)] & str.contains('service', na=False)]

I tried the line above, but it doesn't return the intersection; it returns both sets, including the intersection.

My requirement is the intersection only.

CodePudding user response:

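Try this only if each cell contains exactly 'ecosystem' or exactly 'service' and nothing else. A single cell can never equal both values at once, so the two equality masks have to be combined with | (OR) rather than &. For substring matching, use str.contains instead.

df[(df['Abstract'] == 'ecosystem') | (df['Abstract'] == 'service')]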

CodePudding user response:

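If you need to check that both substrings exist in the same cell of the column, build one boolean mask per substring and chain the masks with & for bitwise AND. Note the parentheses: the whole combined mask goes inside a single df[...]: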

df[(df['Abstract'].str.contains('ecosystem', na=False) & 
    df['Abstract'].str.contains('service', na=False))]
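
A minimal, self-contained sketch of the same idea, extended to a list of substrings. The sample rows and the names substrings, mask, both and two_masks are made up for illustration; only the column name 'Abstract' and the two search terms come from the question. regex=False is an assumption that the terms should be matched literally rather than as regular expressions.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'Abstract' is from the question.
df = pd.DataFrame({
    'Abstract': [
        'ecosystem service valuation',        # contains both substrings
        'ecosystem dynamics',                 # only 'ecosystem'
        'customer service report',            # only 'service'
        None,                                 # missing value, treated as no match
        'the service an ecosystem provides',  # both, in the opposite order
    ]
})

substrings = ['ecosystem', 'service']

# Start from an all-True mask and AND in one mask per substring,
# so a row survives only if every substring is present.
mask = pd.Series(True, index=df.index)
for s in substrings:
    mask &= df['Abstract'].str.contains(s, na=False, regex=False)

both = df[mask]

# The explicit two-mask form from the answer above, for comparison.
two_masks = df[df['Abstract'].str.contains('ecosystem', na=False) &
               df['Abstract'].str.contains('service', na=False)]

print(both)
```

With this sample data, only rows 0 and 4 are kept, and both forms select the same rows. The match is case-sensitive by default; passing case=False to str.contains would relax that.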