This is simple, but I'm not sure where I'm going wrong. I'm looking to create a third column that has a value of 1 if the two other columns both equal 1.
Here's the code I'm using. Advice is appreciated - I'm clearly very new to Python. Thanks in advance!
conditions = [
(db3['voted_2018'] == 1) & (db3['voted_2020'] == 1)]
db3['voted_both_elections'] = np.select(conditions, 1,0)
CodePudding user response:
import numpy as np
import pandas as pd
db3 = pd.DataFrame({'voted_2018': [1, 0, 1, 1], 'voted_2020': [1, 1, 0, 1]})
conditions = db3['voted_2018'].eq(1) & db3['voted_2020'].eq(1)
db3['voted_both_elections'] = np.where(conditions, 1, 0)
print(db3)
Shorter:
db3['voted_both_elections'] = (db3['voted_2018'].eq(1) & db3['voted_2020'].eq(1)).astype(int)
Output (the same for either version):
   voted_2018  voted_2020  voted_both_elections
0           1           1                     1
1           0           1                     0
2           1           0                     0
3           1           1                     1
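To make the equivalence concrete, here is a minimal sketch that runs both versions on the same sample data and compares the values. The names `both`, `via_where` and `via_astype` are just illustrative and are not from the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the answer above.
db3 = pd.DataFrame({'voted_2018': [1, 0, 1, 1], 'voted_2020': [1, 1, 0, 1]})

# Boolean mask: True only where both columns equal 1.
both = db3['voted_2018'].eq(1) & db3['voted_2020'].eq(1)

via_where = np.where(both, 1, 0)   # NumPy array of 1s and 0s
via_astype = both.astype(int)      # pandas Series of 1s and 0s

# Both approaches produce the same values.
print(via_where.tolist())   # [1, 0, 0, 1]
print(via_astype.tolist())  # [1, 0, 0, 1]
```

The `astype(int)` form works because the mask is already boolean, so casting it maps True to 1 and False to 0 without a separate lookup step.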
CodePudding user response:
You can first remove the square brackets in the definition of conditions. The square brackets make the variable a list, but in your case using the Series object inside the brackets directly would be easier:
conditions = (db3['voted_2018'] == 1) & (db3['voted_2020'] == 1)
You can then use the DataFrame's .loc indexer with the conditions acting as a boolean mask for row selection:
db3['voted_both_elections'] = 0
db3.loc[conditions, 'voted_both_elections'] = 1
An even simpler one-liner would be:
db3['voted_both_elections'] = conditions.astype(int)
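If you would rather keep np.select, here is a sketch of how the original call could be adjusted. As far as I can tell, the original fails because np.select expects a list of choices matching the list of conditions, and a bare 1 was passed instead. The sample DataFrame and the name `choices` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample data (assumed; substitute your own db3).
db3 = pd.DataFrame({'voted_2018': [1, 0, 1, 1], 'voted_2020': [1, 1, 0, 1]})

# np.select takes a list of conditions and a list of choices of the same length.
conditions = [(db3['voted_2018'] == 1) & (db3['voted_2020'] == 1)]
choices = [1]  # value to use where the matching condition is True

# Rows matching no condition get the default value.
db3['voted_both_elections'] = np.select(conditions, choices, default=0)
print(db3['voted_both_elections'].tolist())  # [1, 0, 0, 1]
```

np.select is mostly worth it when there are several conditions with different outcomes. For a single condition, the .loc or astype(int) versions above are shorter.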