I have written the following chunk of code which does what it needs to do (retain the row as long as the string "Hello" appears in col1, col2, col3 or col4):
hello_mask = df["col1"].str.contains("Hello",na=False) | df["col2"].str.contains("Hello",na=False) | df["col3"].str.contains("Hello",na=False) | df["col4"].str.contains("Hello",na=False)
df_hello = df[hello_mask]
However, it is ugly, and I think a for loop would be a much more elegant solution (I might be wrong). The problem is that I am not very proficient at coding and can't quite put it together.
Thank you for your patience - any advice is appreciated!
CodePudding user response:
If you want to check whether a cell exactly matches 'Hello', use:
cols_to_check = ['col1', 'col2', 'col3', 'col4']
hello_mask = df[cols_to_check].eq('Hello').any(axis=1)
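As a quick illustration of what the exact-match mask selects, here is a minimal sketch on made-up data (the sample values are assumptions, only the column names come from the question). Note that a cell such as 'Hello world' does not count as a match here:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "col1": ["Hello world", "foo", None],
    "col2": ["x", "Hello", "y"],
    "col3": ["a", "b", "c"],
    "col4": ["d", "e", "f"],
})

cols_to_check = ["col1", "col2", "col3", "col4"]

# True for rows where at least one cell is exactly equal to 'Hello'.
exact_mask = df[cols_to_check].eq("Hello").any(axis=1)

# Keep only the matching rows.
df_exact = df[exact_mask]
```

Only the second row is kept, because 'Hello world' is not equal to 'Hello' and missing values never compare equal.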
If you want to check for a partial match (the cell contains 'Hello' as a substring):
cols_to_check = ['col1', 'col2', 'col3', 'col4']
hello_mask = df[cols_to_check].apply(lambda x: x.str.contains('Hello', na=False)).any(axis=1)
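Since the question asks about a for loop, here is a rough sketch of that variant next to the apply-based one, again on made-up data (the sample values and the names `loop_mask` and `apply_mask` are assumptions for illustration). Both should produce the same mask; the loop simply ORs the per-column results together one column at a time:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "col1": ["Hello world", "foo", None, "bar"],
    "col2": ["x", "Hello", "y", "z"],
    "col3": ["a", "b", "say Hello", None],
    "col4": ["c", "d", "e", "f"],
})

cols_to_check = ["col1", "col2", "col3", "col4"]

# Loop version: start from an all-False mask and OR in each column's result.
loop_mask = pd.Series(False, index=df.index)
for col in cols_to_check:
    # na=False treats missing values as "no match", as in the question.
    loop_mask |= df[col].str.contains("Hello", na=False)

# Apply version: the same check applied to every column at once.
apply_mask = df[cols_to_check].apply(
    lambda s: s.str.contains("Hello", na=False)
).any(axis=1)

# Keep only the rows where 'Hello' appears in at least one checked column.
df_hello = df[loop_mask]
```

The loop is arguably easier to read for newcomers, while the apply version is shorter; either way, adding a column only means extending `cols_to_check`.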