I want to search for multiple substrings in a particular column of a pandas DataFrame. Both substrings must be present, not just one of them,
i.e. the result should be a new DataFrame containing only the rows whose value includes both 'ecosystem' and 'service'.
df[df['Abstract'].str.contains('ecosystem', na=False)] & str.contains('service', na=False)]
I have tried this, but it doesn't return the intersection; it returns both sets, including the intersection.
I need only the intersection.
CodePudding user response:
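Try this if your cells contain only the exact value 'ecosystem' or 'service'. A cell can never equal both values at once, so exact-match masks have to be combined with | (OR) rather than &:
df[(df['Abstract'] == 'ecosystem') | (df['Abstract'] == 'service')]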
CodePudding user response:
To check that both substrings are present in each value of the column, chain the two masks with & (bitwise AND):
df[(df['Abstract'].str.contains('ecosystem', na=False) &
df['Abstract'].str.contains('service', na=False))]
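For a runnable illustration of the chained masks, here is a minimal sketch. The sample rows are made up for the example (only the 'Abstract' column name comes from the question), and regex=False is an assumption that the search terms should be treated as literal substrings rather than regular expressions.

```python
import pandas as pd

# Hypothetical sample data; only the 'Abstract' column name comes from the question.
df = pd.DataFrame({
    "Abstract": [
        "ecosystem service valuation",
        "ecosystem dynamics",
        "customer service quality",
        None,
        "service provision by a forest ecosystem",
    ]
})

# regex=False matches each term literally; na=False turns missing values into False.
has_ecosystem = df["Abstract"].str.contains("ecosystem", na=False, regex=False)
has_service = df["Abstract"].str.contains("service", na=False, regex=False)

# Element-wise AND keeps only the rows where both masks are True.
both = df[has_ecosystem & has_service]
print(both)

# The same idea for an arbitrary list of terms: start from an all-True mask
# and AND in one substring check per term.
terms = ["ecosystem", "service"]
mask = pd.Series(True, index=df.index)
for term in terms:
    mask &= df["Abstract"].str.contains(term, na=False, regex=False)
both_from_list = df[mask]
```

Building each mask separately also makes it easy to inspect how many rows each substring matches before combining them.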