Dataframe: take rows that have exactly one occurrence of a value from a list

Time:05-26

Code:

import pandas as pd 

df = pd.DataFrame({'data': ['hey how r u', 'hello', 'hey abc d e f hey f', 'g h i i j k', 'hello how r u hello']})
vals = ['hey', 'hello']

I want to keep all the rows whose text contains exactly one occurrence of a word from the list vals. In this case, these would be 'hey how r u' and 'hello'.

What I tried:

def exactly_one(text):
    for v in vals:
        if text.count(v) > 1:
            return False
    return True


df = df[df['data'].contains('|'.join(vals)) & (exactly_one(df['data'].str))]

This breaks with an error.

CodePudding user response:

You can use str.count with a regex:

df[df['data'].str.count('|'.join(vals)).eq(1)]

Output:

          data
0  hey how r u
1        hello
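
One caveat: '|'.join(vals) matches substrings, so 'hey' would also be counted inside a word like 'they'. If whole-word matching is what you want, a minimal sketch is to wrap the alternation in \b word boundaries and escape each value. The extra row 'they said hello' and the names demo, pattern and result are added here purely for illustration and are not part of the original data:

```python
import re
import pandas as pd

# Original rows plus one hypothetical row ('they said hello') to show the difference
demo = pd.DataFrame({'data': ['hey how r u', 'hello', 'hey abc d e f hey f',
                              'g h i i j k', 'hello how r u hello',
                              'they said hello']})
vals = ['hey', 'hello']

# \b restricts each alternative to whole words;
# re.escape protects against regex metacharacters in vals
pattern = r'\b(?:' + '|'.join(map(re.escape, vals)) + r')\b'

# Keep rows where the total number of whole-word matches is exactly 1
result = demo[demo['data'].str.count(pattern).eq(1)]
print(result)
```

With the plain 'hey|hello' pattern the extra row counts 2 (the 'hey' inside 'they' plus 'hello') and is dropped; with word boundaries it counts 1 and is kept.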

Intermediate result (number of matches per row):

df['data'].str.count('|'.join(vals))

0    1
1    1
2    2
3    0
4    2
Name: data, dtype: int64
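
As for the attempt in the question, it most likely fails for two reasons: a Series has no .contains method (it lives under the .str accessor), and exactly_one is handed the whole .str accessor instead of one string at a time. If you prefer to keep a plain Python function, one possible sketch is to apply it row by row. Splitting on whitespace is an assumption about what counts as a "word" here, and mask is just an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({'data': ['hey how r u', 'hello', 'hey abc d e f hey f',
                            'g h i i j k', 'hello how r u hello']})
vals = ['hey', 'hello']

def exactly_one(text):
    # Split on whitespace and count how many words are in vals;
    # the row qualifies only when that total is exactly 1
    return sum(word in vals for word in text.split()) == 1

# apply calls exactly_one once per string, yielding a boolean mask
mask = df['data'].apply(exactly_one)
print(df[mask])
```

This is usually slower than str.count on large frames, but it makes the per-row logic explicit and is easy to adapt.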