I'm looking for a cleaner alternative to my for loop. I want to create a column (df['result'])
where each row gets the value of the first condition that matches, checked in order from column a to column e, or 0 if none match:
Data:
import numpy as np
import pandas as pd

d = {'a': [1, 5], 'b': [2, 4], 'c': [3, 3], 'd': [4, 2], 'e': [5, 1]}
df = pd.DataFrame(d)
df['result'] = np.nan
for i in range(len(df)):
    # Use .loc to avoid chained assignment; the first matching branch wins.
    if df['a'][i] == 1:
        df.loc[i, 'result'] = 1
    elif df['b'][i] == 2:
        df.loc[i, 'result'] = 2
    elif df['c'][i] == 3:
        df.loc[i, 'result'] = 3
    elif df['d'][i] == 4:
        df.loc[i, 'result'] = 4
    elif df['e'][i] == 5:
        df.loc[i, 'result'] = 5
    else:
        df.loc[i, 'result'] = 0
Is there a cleaner way of creating this hierarchical logic without looping through like this?
CodePudding user response:
Use numpy.select:
import numpy as np
df["result"] = np.select(
    [df["a"].eq(1), df["b"].eq(2), df["c"].eq(3), df["d"].eq(4), df["e"].eq(5)],
    [1, 2, 3, 4, 5],
    0,
)
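For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the question's sample data plus one extra row of zeros (my own addition, not from the question) so that the default of 0 is actually exercised. The conditions are built with a comprehension rather than typed out, which is only a convenience.

```python
import numpy as np
import pandas as pd

# Question's data plus a hypothetical third row where no condition matches.
d = {'a': [1, 5, 0], 'b': [2, 4, 0], 'c': [3, 3, 0], 'd': [4, 2, 0], 'e': [5, 1, 0]}
df = pd.DataFrame(d)

# One boolean Series per column; np.select takes the first True per row.
choices = [1, 2, 3, 4, 5]
conditions = [df[col].eq(val) for col, val in zip(['a', 'b', 'c', 'd', 'e'], choices)]

# Rows where every condition is False fall back to the default.
df["result"] = np.select(conditions, choices, default=0)
print(df["result"].tolist())
```

The order of the condition list matters: it plays the role of the if/elif chain, so an earlier column takes priority over a later one when both match.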
CodePudding user response:
IIUC, try this (in case you have many columns and don't want to hard-code a condition for each one):
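# Compare column k (1-based) with the value k, giving a boolean mask.
m = df.eq(range(1, len(df.columns) + 1))
# Keep only matching cells, back-fill along each row, and take the first column.
df['result'] = df.where(m).bfill(axis=1).iloc[:, 0]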
print(df)
   a  b  c  d  e  result
0  1  2  3  4  5     1.0
1  5  4  3  2  1     3.0
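One thing to watch: with this mask approach, a row where nothing matches ends up as NaN rather than the 0 the original loop assigns. A small sketch of one way to handle that, again assuming an extra all-zero row that is not in the question's data, and using a plain list for the targets:

```python
import pandas as pd

# Question's data plus a hypothetical third row where nothing matches.
df = pd.DataFrame({'a': [1, 5, 0], 'b': [2, 4, 0], 'c': [3, 3, 0],
                   'd': [4, 2, 0], 'e': [5, 1, 0]})

# Column k (1-based) is compared with the value k.
targets = list(range(1, len(df.columns) + 1))
m = df.eq(targets)

# Non-matching cells become NaN; back-filling along the row moves the
# first match into column 0, which stays NaN when the row has no match.
first_hit = df.where(m).bfill(axis=1).iloc[:, 0]

# Replace the no-match NaN with 0 and return to an integer dtype.
df['result'] = first_hit.fillna(0).astype(int)
print(df['result'].tolist())
```

The mask has to be computed before 'result' is added to df, otherwise the new column would be included in the comparison.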