Home > Blockchain >  count the "True" values in dataframe
count the "True" values in dataframe

Time:04-22

how to check if the selected columns in a dataframe return True and how to compare the count of True values with the column name count that is handling the count number of cells that are NOT NULL.

WHERE these selected columns can have a False as value.

for this i used the below statement but i do not know how to compare it with the column count

df=pd.DataFrame({'dept':['dept1','dept2','dept3','dept4','dept5'],
                 'subd':['hndf','nbf','asx','qwe','def'],
                 'jju':['0','1','1','NA','1'],
                 'rob':['1','0','NA','1','1'],
                 'ans':['0','0','1','NA','1'],
                 'zsd':['1','NA','1','1','1'],
                 'count':['4','3','3','2','4']}


        dept    subd      jju     rob  ans  zsd count
0      dept1    hndf       0      1     0      1    4
1      dept2     nbf       1      0     0      NA   3
2      dept3     asx       1      NA    1      1    3
3      dept4     qwe       NA     1    NA      1    2
4      dept5     def       1      1    1       1    4 



df = df[df.loc[:,'jju':'zsd'].eq("1")]

how to continue from this ???

expected result :

0      dept3     asx       1      NA    1      1    3
1      dept4     qwe       NA     1    NA      1    2
2      dept5     def       1      1    1       1    4 

CodePudding user response:

You can replace "NA" by NaN values, then sum on axis, then use eq to compare to count column value to create a boolean mask. Then use the mask to filter df:

msk = df['count'].astype(int).eq(df.loc[:, 'jju':'zsd'].replace('NA', float('nan')).astype(float).eq(1).sum(axis=1))
out = df[msk]

Output:

    dept subd jju rob ans zsd count
2  dept3  asx   1  NA   1   1     3
3  dept4  qwe  NA   1  NA   1     2
4  dept5  def   1   1   1   1     4

CodePudding user response:

This does the job,

df.replace("NA", np.nan, inplace = True)
df.iloc[:, 2:] = df.iloc[:, 2:].apply(pd.to_numeric, errors = "coerce", axis = 1)

Output -

dept subd jju rob ans zsd count
0 dept1 hndf 0.0 1.0 0.0 1.0 4.0
1 dept2 nbf 1.0 0.0 0.0 nan 3.0
2 dept3 asx 1.0 nan 1.0 1.0 3.0
3 dept4 qwe nan 1.0 nan 1.0 2.0
4 dept5 def 1.0 1.0 1.0 1.0 4.0
  • Related