Loop through OR condition in Pandas dataframe

Time:03-30

I have written the following chunk of code which does what it needs to do (retain the row as long as the string "Hello" appears in col1, col2, col3 or col4):

hello_mask = (
    df["col1"].str.contains("Hello", na=False)
    | df["col2"].str.contains("Hello", na=False)
    | df["col3"].str.contains("Hello", na=False)
    | df["col4"].str.contains("Hello", na=False)
)

df_hello = df[hello_mask]

However, it is ugly, and I think a for loop would be a much more elegant solution (I might be wrong). The problem is that I am not the most proficient coder and can't quite put it together.

Thank you for your patience - any advice is appreciated!

CodePudding user response:

If you want to check whether a cell exactly matches 'Hello', use:

cols_to_check = ['col1', 'col2', 'col3', 'col4']
hello_mask = df[cols_to_check].eq('Hello').any(axis=1)

If you want to check for a partial match (substring):

cols_to_check = ['col1', 'col2', 'col3', 'col4']
# na=False treats missing values as non-matches, as in the question's code
hello_mask = df[cols_to_check].apply(lambda x: x.str.contains('Hello', na=False)).any(axis=1)
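
Since the question asks specifically about a for loop, here is a minimal sketch of that approach as well. The sample DataFrame is made up for illustration only; the column names and the "Hello" search string come from the question. The idea is to start from an all-False mask and OR in each column's result one at a time:

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "col1": ["Hello world", "foo", None, "bar"],
    "col2": ["x", "say Hello", "y", "z"],
    "col3": ["a", "b", "c", None],
    "col4": ["d", "e", "Hello", "f"],
})

cols_to_check = ["col1", "col2", "col3", "col4"]

# Start with a mask that is False for every row
hello_mask = pd.Series(False, index=df.index)

# OR each column's substring match into the running mask;
# na=False counts missing values as non-matches
for col in cols_to_check:
    hello_mask |= df[col].str.contains("Hello", na=False)

# Keep only the rows where at least one column contained "Hello"
df_hello = df[hello_mask]
```

This should give the same rows as the chained `|` version in the question. Whether it reads better than the `apply(...).any(axis=1)` one-liner is largely a matter of taste.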