Suppose I have the following DataFrame:
import pandas as pd
# dictionary with list object in values
details = {
'A1' : [1,3,4,5],
'A2' : [2,3,5,6],
'A3' : [4,3,2,6],
}
# creating a Dataframe object
df = pd.DataFrame(details)
I want to test each column against the following conditions to obtain a boolean mask, and then sum that mask along axis=1:
- A1 >= 3
- A2 >= 3
- A3 >= 4
I would like to end up with the following DataFrame:
details = {
'A1' : [1,3,4,5],
'A2' : [2,3,5,6],
'A3' : [4,3,2,6],
'score' : [1,2,2,3]
}
# creating a Dataframe object
df = pd.DataFrame(details)
How would you do it?
CodePudding user response:
Try this:
# dictionary with list object in values
details = {
'A1' : [1,3,4,5],
'A2' : [2,3,5,6],
'A3' : [4,3,2,6],
}
# creating a Dataframe object
df = pd.DataFrame(details)
df
# check boolean (convert to int) and add
df['score'] = df.A1.ge(3)*1 + df.A2.ge(3)*1 + df.A3.ge(4)*1
df
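If you prefer to follow the two steps from the question literally (build the boolean mask first, then sum along axis=1), something like the sketch below should work. It assumes the same data as above; `mask` is just an illustrative name. Multiplying by 1 is not needed here, because `True` counts as 1 when summed.

```python
import pandas as pd

df = pd.DataFrame({
    'A1': [1, 3, 4, 5],
    'A2': [2, 3, 5, 6],
    'A3': [4, 3, 2, 6],
})

# One boolean column per condition, side by side.
mask = pd.concat([df['A1'].ge(3), df['A2'].ge(3), df['A3'].ge(4)], axis=1)

# Summing booleans row-wise counts how many conditions each row meets.
df['score'] = mask.sum(axis=1)
print(df)
```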
CodePudding user response:
Since your operators are the same, you can try NumPy broadcasting:
import numpy as np
df['score'] = (df.T >= np.array([3,3,4])[:, None]).sum()
print(df)
   A1  A2  A3  score
0   1   2   4      1
1   3   3   3      2
2   4   5   2      2
3   5   6   6      3
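The same broadcasting idea can also be written without the transpose, by comparing the underlying values with a row vector of thresholds. This is only a sketch under the assumption that the thresholds are listed in column order; `thresholds` and `mask` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A1': [1, 3, 4, 5],
    'A2': [2, 3, 5, 6],
    'A3': [4, 3, 2, 6],
})

# One threshold per column, in column order (A1, A2, A3).
thresholds = np.array([3, 3, 4])

# A (4, 3) array compared with a (3,) vector: the vector is
# broadcast across every row, giving a (4, 3) boolean mask.
mask = df.to_numpy() >= thresholds

# Count the True values in each row.
df['score'] = mask.sum(axis=1)
print(df)
```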
CodePudding user response:
You could also do:
df.assign(score=(df >= [3, 3, 4]).sum(axis=1))
   A1  A2  A3  score
0   1   2   4      1
1   3   3   3      2
2   4   5   2      2
3   5   6   6      3
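A positional list depends on the column order, so a variant keyed by column name may be safer. A minimal sketch, assuming the thresholds are kept in a Series called `thresholds` (a name made up for this example):

```python
import pandas as pd

df = pd.DataFrame({
    'A1': [1, 3, 4, 5],
    'A2': [2, 3, 5, 6],
    'A3': [4, 3, 2, 6],
})

# Thresholds keyed by column name instead of by position.
thresholds = pd.Series({'A1': 3, 'A2': 3, 'A3': 4})

# Select only the thresholded columns, compare each one with its own
# threshold (aligned on column labels), then count the hits per row.
cols = list(thresholds.index)
result = df.assign(score=df[cols].ge(thresholds, axis='columns').sum(axis=1))
print(result)
```

Because only the columns named in `thresholds` are compared, this still works if `df` already contains other columns, such as a previously computed `score`.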