Compare two columns to create a third column in Python

Time:12-05

This is simple, but I'm not sure where I'm going wrong. I'm looking to create a third column that has a value of 1 if the two other columns both equal 1, and 0 otherwise.

Here's the code I'm using. Advice is appreciated - I'm clearly very new to Python. Thanks in advance!

conditions = [
    (db3['voted_2018'] == 1) & (db3['voted_2020'] == 1)]

db3['voted_both_elections'] = np.select(conditions, 1,0)

CodePudding user response:

import numpy as np
import pandas as pd

db3 = pd.DataFrame({'voted_2018': [1, 0, 1, 1], 'voted_2020': [1, 1, 0, 1]})
# Boolean Series: True where both columns equal 1
conditions = db3['voted_2018'].eq(1) & db3['voted_2020'].eq(1)
db3['voted_both_elections'] = np.where(conditions, 1, 0)
print(db3)

Shorter:

db3['voted_both_elections'] = (db3['voted_2018'].eq(1) & db3['voted_2020'].eq(1)).astype(int)

Output (same for both versions):

   voted_2018  voted_2020  voted_both_elections
0           1           1                     1
1           0           1                     0
2           1           0                     0
3           1           1                     1
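
A side note on why the original call errors out: as far as I know, np.select wants the choices as a list with one entry per condition, so np.select(conditions, 1, 0) trips over the bare 1 rather than over the condition itself. A minimal sketch that keeps np.select, assuming the same sample data as above (the real db3 will differ):

```python
import numpy as np
import pandas as pd

# Sample data only; stands in for the asker's real db3.
db3 = pd.DataFrame({'voted_2018': [1, 0, 1, 1], 'voted_2020': [1, 1, 0, 1]})

# One condition in the list ...
conditions = [(db3['voted_2018'] == 1) & (db3['voted_2020'] == 1)]
# ... so the choices must also be a list with exactly one entry.
choices = [1]

# Rows matching no condition fall back to the default value.
db3['voted_both_elections'] = np.select(conditions, choices, default=0)
print(db3)
```

With a single condition this is more machinery than needed, which is probably why np.where or a plain astype(int) reads better here. np.select starts to pay off once there are several conditions with different outcomes.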

CodePudding user response:

First, remove the square brackets in the definition of conditions. The brackets wrap the boolean Series in a one-element list; for the approaches below, it is easier to work with the Series directly:

conditions = (db3['voted_2018'] == 1) & (db3['voted_2020'] == 1)

You can then use the DataFrame's .loc indexer, with conditions acting as a boolean mask for row selection:

db3['voted_both_elections'] = 0
db3.loc[conditions, 'voted_both_elections'] = 1

An even simpler one-liner would be:

db3['voted_both_elections'] = conditions.astype(int)
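
To try both variants side by side, here is a small self-contained sketch. The sample data is borrowed from the other answer, and the column names via_loc and via_astype are made up for the comparison; they are not part of the original question.

```python
import pandas as pd

# Sample data only; the real db3 is assumed to have the same two columns.
db3 = pd.DataFrame({'voted_2018': [1, 0, 1, 1], 'voted_2020': [1, 1, 0, 1]})

# Boolean Series: True where both columns equal 1.
conditions = (db3['voted_2018'] == 1) & (db3['voted_2020'] == 1)

# Variant 1: default every row to 0, then set the masked rows to 1.
db3['via_loc'] = 0
db3.loc[conditions, 'via_loc'] = 1

# Variant 2: cast the boolean mask straight to integers.
db3['via_astype'] = conditions.astype(int)

# Check that both variants agree row by row.
same = (db3['via_loc'] == db3['via_astype']).all()
print(db3)
print(same)
```

The .loc version is handy when the values to assign are not just 0 and 1, while astype(int) is the shortest option for a plain flag column.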