I have a dataframe like the one shown in the image.
I need to select the rows that have more than 2 columns with non-zero values, so only the rows highlighted in yellow should be selected. I tried
df[(df.iloc[:, 1:13] != 0.0).any(axis=1)]
but that only checks whether any one column is non-zero. How can I do it for more than one column?
CodePudding user response:
My suggestion would be to generate a True/False filter as:
index_filter = df.apply(lambda x: (x != 0).sum() > 1, axis=1)
index_filter will be True at the indices where at least two of the dataframe columns are not 0.
Now you can filter the dataframe as df[index_filter].
Another alternative would be:
num_cols_not_zero = df.ne(0).sum(axis=1)
df[num_cols_not_zero.gt(1)]
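To make the counting approach concrete, here is a minimal sketch on made-up data. The column names, the values, and the helper rows_with_nonzero are hypothetical and only for illustration. It assumes the count should cover the numeric columns only (as with the iloc slice in the question), since a string column would always compare as not equal to 0 and inflate the count.

```python
import pandas as pd

# Hypothetical toy data; the column names and values are made up for illustration.
df = pd.DataFrame(
    {
        "id": ["a", "b", "c", "d"],
        "m1": [0.0, 1.5, 0.0, 2.0],
        "m2": [0.0, 0.0, 3.0, 1.0],
        "m3": [0.0, 0.0, 4.0, 5.0],
    }
)


def rows_with_nonzero(frame, min_count, cols=None):
    """Return rows of `frame` with at least `min_count` non-zero values in `cols`."""
    # Restrict the comparison to the chosen columns (all columns if none given).
    subset = frame if cols is None else frame[cols]
    # ne(0) gives a boolean frame; summing across axis=1 counts non-zero cells per row.
    nonzero_per_row = subset.ne(0).sum(axis=1)
    # Boolean indexing keeps only the rows that meet the threshold.
    return frame[nonzero_per_row.ge(min_count)]


value_cols = ["m1", "m2", "m3"]
selected = rows_with_nonzero(df, min_count=2, cols=value_cols)
print(selected)
```

Changing min_count to 3 would correspond to "more than 2 columns" rather than "more than 1".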
CodePudding user response:
Try this: main_df.loc[main_df.ne(0).sum(axis=1) > 1]