create a dataframe mask and sum columns

Time:05-27

Suppose I have the following dataframe

import pandas as pd

# dictionary with list objects as values
details = {
    'A1' : [1,3,4,5],
    'A2' : [2,3,5,6],
    'A3' : [4,3,2,6],
}
  
# creating a Dataframe object 
df = pd.DataFrame(details)

I want to test each column against the following conditions to obtain a boolean mask, and then sum that mask along axis=1:

  • A1 >= 3
  • A2 >= 3
  • A3 >= 4

I would like to end up with the following dataframe:

details = {
    'A1' : [1,3,4,5],
    'A2' : [2,3,5,6],
    'A3' : [4,3,2,6],
    'score' : [1,2,2,3]
}
  
# creating a Dataframe object 
df = pd.DataFrame(details)

How would you do it?

CodePudding user response:

Try this

# dictionary with list object in values
details = {
    'A1' : [1,3,4,5],
    'A2' : [2,3,5,6],
    'A3' : [4,3,2,6],
}
  
# creating a Dataframe object 
df = pd.DataFrame(details)
df

# check boolean (convert to int) and add
df['score'] = df.A1.ge(3)*1 + df.A2.ge(3)*1 + df.A3.ge(4)*1
df
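
If it helps to see the two steps from the question separately (build the boolean mask, then sum along axis=1), here is a minimal sketch. The names `mask` and `score` are only illustrative and do not come from the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'A1': [1, 3, 4, 5],
    'A2': [2, 3, 5, 6],
    'A3': [4, 3, 2, 6],
})

# Step 1: one boolean column per condition
mask = pd.DataFrame({
    'A1': df['A1'] >= 3,
    'A2': df['A2'] >= 3,
    'A3': df['A3'] >= 4,
})

# Step 2: True counts as 1 when summed, so a row-wise sum
# gives the number of conditions each row satisfies
score = mask.sum(axis=1)

print(mask)
print(score.tolist())  # [1, 2, 2, 3]
```

Because booleans already sum as 0/1, the `*1` conversion is not strictly needed when you go through `sum`.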


CodePudding user response:

Since every condition uses the same operator (>=), you can use NumPy broadcasting:

import numpy as np

df['score'] = (df.T >= np.array([3,3,4])[:, None]).sum()
print(df)

   A1  A2  A3  score
0   1   2   4      1
1   3   3   3      2
2   4   5   2      2
3   5   6   6      3
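
As a variation on the same idea, the transpose can probably be skipped: a threshold array of shape (3,) already lines up with the last axis of the (4, 3) value array. This is a rough sketch, assuming the columns are in the order A1, A2, A3; `thresholds`, `mask` and `score` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A1': [1, 3, 4, 5],
    'A2': [2, 3, 5, 6],
    'A3': [4, 3, 2, 6],
})

# One threshold per column, in column order (assumed: A1, A2, A3)
thresholds = np.array([3, 3, 4])

# (4, 3) >= (3,) broadcasts the thresholds across every row
mask = df.to_numpy() >= thresholds

# Count the True values in each row
score = mask.sum(axis=1)

print(mask.shape)      # (4, 3)
print(score.tolist())  # [1, 2, 2, 3]
```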

CodePudding user response:

You could also do:

df.assign(score=(df >= [3, 3, 4]).sum(axis=1))
 
   A1  A2  A3  score
0   1   2   4      1
1   3   3   3      2
2   4   5   2      2
3   5   6   6      3
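
A plain list depends on the column order. If you would rather tie each threshold to its column name, one possible sketch is to put the thresholds in a Series and let `DataFrame.ge` align them on the column labels. The names `thresholds`, `cols` and `result` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'A1': [1, 3, 4, 5],
    'A2': [2, 3, 5, 6],
    'A3': [4, 3, 2, 6],
})

# Thresholds keyed by column name, so column order no longer matters
thresholds = pd.Series({'A1': 3, 'A2': 3, 'A3': 4})
cols = thresholds.index

# ge() matches the Series index against the column labels;
# restricting to cols keeps any other columns out of the comparison
result = df.assign(score=df[cols].ge(thresholds).sum(axis=1))

print(result)
```

Like `assign` in the answer above, this returns a new dataframe and leaves `df` unchanged.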