I have a dataframe that looks like this:
      Cnt_A  Cnt_B  Cnt_C  Cnt_D
ID_1      0      1      3      0
ID_2      1      0      0      0
ID_3      5      2      0      8
...
I'd like to count, for each row, the columns that are not zero and put the result into a new column, like this:
      Total_Not_Zero_Cols  Cnt_A  Cnt_B  Cnt_C  Cnt_D
ID_1                    2      0      1      3      0
ID_2                    1      1      0      0      0
ID_3                    3      5      2      0      8
...
I used a loop to get the result, but it took a very long time (of course).
I can't figure out the most efficient way to do a conditional calculation across columns :(
Thank you in advance.
CodePudding user response:
Check whether each value is not equal to 0, then sum along the columns axis:
df['Total_Not_Zero_Cols'] = df.ne(0).sum(axis=1)
print(df)
# Output
      Cnt_A  Cnt_B  Cnt_C  Cnt_D  Total_Not_Zero_Cols
ID_1      0      1      3      0                    2
ID_2      1      0      0      0                    1
ID_3      5      2      0      8                    3
CodePudding user response:
Use ne to generate a DataFrame of booleans with True for non-zero values, then aggregate the rows as integers using sum:
df['Total_Not_Zero_Cols'] = df.ne(0).sum(axis=1)
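For reference, here is a minimal self-contained sketch of the same idea using the sample data from the question. It makes two assumptions that are not in the original answer: it restricts the count to columns whose name contains "Cnt_" (in case the real dataframe has other columns), and it uses insert to place the result as the first column, as in the desired output. The names cnt_cols and totals are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "Cnt_A": [0, 1, 5],
        "Cnt_B": [1, 0, 2],
        "Cnt_C": [3, 0, 0],
        "Cnt_D": [0, 0, 8],
    },
    index=["ID_1", "ID_2", "ID_3"],
)

# Assumption: only columns whose name contains "Cnt_" should be counted
cnt_cols = df.filter(like="Cnt_")

# ne(0) gives a boolean DataFrame; summing along axis=1 counts the True values per row
totals = cnt_cols.ne(0).sum(axis=1)

# Put the new column first, matching the layout asked for in the question
df.insert(0, "Total_Not_Zero_Cols", totals)

print(df)
```

Both ne and sum are vectorized, so this avoids the Python-level loop over rows.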
CodePudding user response:
A NumPy-based alternative:
import numpy as np
np.sum(df != 0, axis=1)
Output:
ID_1 2
ID_2 1
ID_3 3
dtype: int64
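Another NumPy option that might be worth trying is np.count_nonzero on the underlying array. The sketch below rebuilds the sample data from the question; the names counts and result are just illustrative, and I have not benchmarked it against the versions above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "Cnt_A": [0, 1, 5],
        "Cnt_B": [1, 0, 2],
        "Cnt_C": [3, 0, 0],
        "Cnt_D": [0, 0, 8],
    },
    index=["ID_1", "ID_2", "ID_3"],
)

# Count the non-zero entries in each row of the underlying NumPy array
counts = np.count_nonzero(df.to_numpy(), axis=1)

# Wrap the plain array in a Series so the row labels are kept
result = pd.Series(counts, index=df.index)

print(result)
```

Because count_nonzero returns a plain array, the Series wrapper is only needed if you want the ID labels back.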