Identify instances where a string exists more than once in a row: Python, Pandas, DataFrame - CodePudding

I want to write a script that will identify instances where a word (string) appears more than once in a row of a pandas DataFrame.

Using a lambda function I can identify the existence of a string in a row, but I can't find any information on how to identify two or more instances of the string. This is an example of what I currently have:

import pandas as pd

df = pd.DataFrame({'ID':[1,2,3],'Ans1':['Yes','Yes','Yes'],'Ans2':['No','Yes','No'],'Ans3':['No','No','No']})
df['Result'] = df.apply(lambda row: row.astype(str).str.contains('Yes').any(), axis=1)

df

Pseudocode for what I'm trying to get:

if 'Yes' isin row > 1:
   df['Results'] == True

Desired result:

ID  Ans1    Ans2    Ans3    Result
1   Yes     No      No      False
2   Yes     Yes     No      True
3   Yes     No      No      False

CodePudding user response：

Try this. If you don't want to check the entire DataFrame for 'Yes', you can filter the columns first. Otherwise, use eq (equal to) to compare each cell with 'Yes', sum with axis=1 to count the matches along each row, and gt (greater than) to check whether that count exceeds 1:

df['Result'] = df.eq('Yes').sum(1).gt(1)

Output:

   ID Ans1 Ans2 Ans3  Result
0   1  Yes   No   No   False
1   2  Yes  Yes   No    True
2   3  Yes   No   No   False
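
The column filtering mentioned above is not shown in the one-liner, so here is a minimal sketch of it. It assumes the answer columns are the ones whose names contain 'Ans' (true for the example data, but an assumption for any real dataset), and the name answer_cols is purely illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Ans1': ['Yes', 'Yes', 'Yes'],
    'Ans2': ['No', 'Yes', 'No'],
    'Ans3': ['No', 'No', 'No'],
})

# Assumption: only columns whose names contain 'Ans' should be checked.
answer_cols = df.filter(like='Ans').columns

# Compare each selected cell with 'Yes', count matches per row,
# then flag the rows where the count is greater than 1.
df['Result'] = df[answer_cols].eq('Yes').sum(axis=1).gt(1)

print(df)
```

Restricting the comparison to the answer columns keeps unrelated columns (such as ID, or any free-text column that might happen to contain 'Yes') out of the count.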

CodePudding user response：

You could also do:

df['Result'] = df[df == 'Yes'].count(axis=1).gt(1)

CodePudding user response：

This code should do the trick for your specific case. It quite literally applies your pseudocode to every row.

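def check_row(row):
    # Count how many cells in this row equal 'Yes'.
    count = 0
    for i in row:
        if i == 'Yes':
            count += 1
    # True only when 'Yes' appears more than once.
    return count > 1

# Apply the check to every row (axis=1).
df['Results'] = df.apply(check_row, axis=1)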
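
For comparison, here is a rough sketch that writes the same per-row count more compactly with apply and checks it against the vectorized eq/sum/gt expression from the earlier answer. It assumes the example DataFrame from the question. The names row_wise and vectorized are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Ans1': ['Yes', 'Yes', 'Yes'],
    'Ans2': ['No', 'Yes', 'No'],
    'Ans3': ['No', 'No', 'No'],
})

# Row-wise: count the cells equal to 'Yes' in each row, then test count > 1.
row_wise = df.apply(lambda row: (row == 'Yes').sum() > 1, axis=1)

# Vectorized: the same count computed for the whole frame at once.
vectorized = df.eq('Yes').sum(axis=1).gt(1)

# Both approaches should flag the same rows.
print(row_wise.tolist())
print(vectorized.tolist())
```

On a small frame either form is fine. On larger data the vectorized version is generally the faster of the two, since apply with axis=1 calls a Python function once per row.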