I have a dataframe like this
ID Q001 Q002 Q003 Q004 Q005 Q006 Q007 Q008 Win
A 1 1 1 1 1 1 1 1 Yes
B 0 1 0 1 0 1 0 1 No
C 0 1 0 1 0 1 0 1 No
D 1 1 0 1 1 1 1 1 Yes
E 1 1 0 1 1 1 1 1 Yes
F 1 1 0 1 1 1 1 1 Yes
G 0 0 1 0 0 0 0 0 No
H 0 0 0 0 0 0 0 0 No
I 1 0 1 0 1 0 1 0 No
In the above dataframe, I want to create the colum 'Win' and assign the values 'Yes' if the sum of Q001 and Q002 is equal or higher than 2 and 'No', if lower than 2. How can I do this in Python?
CodePudding user response:
Use np.where()
to return a value conditional on other columns.
df['Win'] = np.where(df['Q001'] df['Q002'] >= 2, 'Yes', 'No')
CodePudding user response:
Check
df['Win'] = np.where(df[['Q001','Q002']].sum(1)>=2,'Yes','No')
df
Out[680]:
ID Q001 Q002 Q003 Q004 Q005 Q006 Q007 Q008 Win
0 A 1 1 1 1 1 1 1 1 Yes
1 B 0 1 0 1 0 1 0 1 No
2 C 0 1 0 1 0 1 0 1 No
3 D 1 1 0 1 1 1 1 1 Yes
4 E 1 1 0 1 1 1 1 1 Yes
5 F 1 1 0 1 1 1 1 1 Yes
6 G 0 0 1 0 0 0 0 0 No
7 H 0 0 0 0 0 0 0 0 No
8 I 1 0 1 0 1 0 1 0 No
CodePudding user response:
Simply use:
import numpy as np
cols = ['Q001', 'Q002']
df['Win'] = np.where(df[cols].sum(axis=1).ge(2),
'Yes', 'No')
You can scale this up to any number of columns.
Output:
ID Q001 Q002 Q003 Q004 Q005 Q006 Q007 Q008 Win
0 A 1 1 1 1 1 1 1 1 Yes
1 B 0 1 0 1 0 1 0 1 No
2 C 0 1 0 1 0 1 0 1 No
3 D 1 1 0 1 1 1 1 1 Yes
4 E 1 1 0 1 1 1 1 1 Yes
5 F 1 1 0 1 1 1 1 1 Yes
6 G 0 0 1 0 0 0 0 0 No
7 H 0 0 0 0 0 0 0 0 No
8 I 1 0 1 0 1 0 1 0 No
CodePudding user response:
This works
df = pd.DataFrame({'ID': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
'Q001': [1, 0, 0, 1, 1, 1, 0, 0, 1],
'Q002': [1, 1, 1, 1, 1, 1, 0, 0, 0],
'Q003': [1, 0, 0, 0, 0, 0, 1, 0, 1],
'Q004': [1, 1, 1, 1, 1, 1, 0, 0, 0],
'Q005': [1, 0, 0, 1, 1, 1, 0, 0, 1],
'Q006': [1, 1, 1, 1, 1, 1, 0, 0, 0],
'Q007': [1, 0, 0, 1, 1, 1, 0, 0, 1],
'Q008': [1, 1, 1, 1, 1, 1, 0, 0, 0]})
# initialize Win column
df['Win'] = 'No'
# assign Yes to Win column rows where Q001 Q002>=2
df.loc[df['Q001'] df['Q002'] >= 2, 'Win'] = 'Yes'
print(df)
ID Q001 Q002 Q003 Q004 Q005 Q006 Q007 Q008 Win
0 A 1 1 1 1 1 1 1 1 Yes
1 B 0 1 0 1 0 1 0 1 No
2 C 0 1 0 1 0 1 0 1 No
3 D 1 1 0 1 1 1 1 1 Yes
4 E 1 1 0 1 1 1 1 1 Yes
5 F 1 1 0 1 1 1 1 1 Yes
6 G 0 0 1 0 0 0 0 0 No
7 H 0 0 0 0 0 0 0 0 No
8 I 1 0 1 0 1 0 1 0 No
CodePudding user response:
You could calculate the column as a Boolean series and replace the values with Yes
and No
(if you must):
df['Win'] = (df['Q001'] df['Q002'] >= 2).replace({False: 'No', True: 'Yes'})