Let's take this sample DataFrame:
import pandas as pd
df = pd.DataFrame({'ID':[1,2,3],'Col1':[True,False,False],'Col2':[False,False,False], 'Col3':[True,False,True]})
ID Col1 Col2 Col3
0 1 True False True
1 2 False False False
2 3 False False True
I would like to select the rows of df that contain at least one True. I can of course do the following:
df[df["Col1"] | df["Col2"] | df["Col3"]]
But my real DataFrame has many columns, and I don't know their names in advance. How can I do this?
Expected output:
ID Col1 Col2 Col3
0 1 True False True
2 3 False False True
CodePudding user response:
There is an any() method for exactly this purpose:
df = df.set_index('ID', drop=True)
print(df[df.any(axis='columns')])
# output:
Col1 Col2 Col3
ID
1 True False True
3 False False True
or, on the original DataFrame (without setting ID as the index):
print(df[df[['Col1', 'Col2', 'Col3']].any(axis='columns')])
# output:
ID Col1 Col2 Col3
0 1 True False True
2 3 False False True
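Since the question says the column names are not known in advance, a variant that avoids listing them may be useful. The following is only a sketch, assuming ID is the single non-flag column and every other column holds booleans; the names mask and result are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Col1': [True, False, False],
    'Col2': [False, False, False],
    'Col3': [True, False, True],
})

# Assumption: 'ID' is the only column that should not be tested,
# so every remaining column is treated as a boolean flag.
mask = df.drop(columns='ID').any(axis='columns')

# Keep the rows where at least one flag is True; the index is untouched.
result = df[mask]
print(result)
```

This keeps ID as a regular column, so the result matches the expected output in the question (rows with ID 1 and 3).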
CodePudding user response:
You can use .select_dtypes to pick out the boolean columns, then filter the rows with a boolean mask in .loc:
df.loc[df.select_dtypes('bool').sum(axis=1).ge(1)]
ID Col1 Col2 Col3
0 1 True False True
2 3 False False True
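The two answers can be combined: select the boolean columns by dtype, then test them with any() instead of counting with sum(). The sketch below assumes the flag columns really have dtype bool (object columns holding True/False would not be selected); bool_cols, mask_any and mask_sum are names introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Col1': [True, False, False],
    'Col2': [False, False, False],
    'Col3': [True, False, True],
})

# Pick only the columns whose dtype is bool; 'ID' (int64) is left out.
bool_cols = df.select_dtypes('bool')

# Two equivalent row masks: "any flag is True" vs. "count of True >= 1".
mask_any = bool_cols.any(axis=1)
mask_sum = bool_cols.sum(axis=1).ge(1)

# Filter the original frame, so 'ID' is still present in the result.
result = df.loc[mask_any]
print(result)
```

Both masks select the same rows here; any() states the intent more directly and does not need to add up every column.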