for-loop pandas df alternative

Time:03-31

I'm looking for a cleaner alternative to my for loop. I want to create a column (df['result']) based on the following logic:

Data:

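import numpy as np
import pandas as pd

d = {'a': [1,5], 'b': [2,4], 'c': [3,3], 'd':[4,2], 'e': [5,1]}
df = pd.DataFrame(d)
df['result'] = np.nan  # np.NaN was removed in NumPy 2.0
for i in range(len(df)):
   # .loc writes to df directly; chained df['result'][i] = ... may not
   if df.loc[i, 'a'] == 1:
      df.loc[i, 'result'] = 1
   elif df.loc[i, 'b'] == 2:
      df.loc[i, 'result'] = 2
   elif df.loc[i, 'c'] == 3:
      df.loc[i, 'result'] = 3
   elif df.loc[i, 'd'] == 4:
      df.loc[i, 'result'] = 4
   elif df.loc[i, 'e'] == 5:
      df.loc[i, 'result'] = 5
   else:
      df.loc[i, 'result'] = 0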

Is there a cleaner way of creating this hierarchical logic without looping through like this?

CodePudding user response:

Use numpy.select:

import numpy as np
df["result"] = np.select([df["a"].eq(1), df["b"].eq(2), df["c"].eq(3), df["d"].eq(4), df["e"].eq(5)], 
                         [1,2,3,4,5], 
                         0)
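
To see why this mirrors the if/elif chain: where several conditions are True for a row, np.select takes the first one in the list, and rows where none is True get the default. Below is a minimal self-contained sketch of that behaviour. The third all-zero row is a hypothetical addition (not in the question's data) so the default branch is exercised.

```python
import numpy as np
import pandas as pd

# Question's data plus a hypothetical third row where no condition matches.
df = pd.DataFrame({'a': [1, 5, 0], 'b': [2, 4, 0], 'c': [3, 3, 0],
                   'd': [4, 2, 0], 'e': [5, 1, 0]})

# Conditions are listed in priority order, like the if/elif chain.
conditions = [df['a'].eq(1), df['b'].eq(2), df['c'].eq(3),
              df['d'].eq(4), df['e'].eq(5)]
choices = [1, 2, 3, 4, 5]

# First True condition wins per row; rows with no True condition get the default.
df['result'] = np.select(conditions, choices, default=0)

print(df['result'].tolist())  # [1, 3, 0]
```

Row 0 satisfies every condition but still gets 1, because a == 1 comes first in the list.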

CodePudding user response:

IIUC, try this (in case you have many columns and don't want to hard-code each condition):

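# compare column k (1-based position) against the value k
m = df.eq(range(1, len(df.columns) + 1))
# keep matching values, back-fill along each row, take the first column
df['result'] = df.where(m).bfill(axis=1).iloc[:,0]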

print(df)
   a  b  c  d  e  result
0  1  2  3  4  5     1.0
1  5  4  3  2  1     3.0
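
Note that with the mask approach a row with no match at all comes out as NaN, whereas the else branch in the question gives 0. Here is a possible sketch that fills that gap, assuming the target value for each column is simply its 1-based position. The names cols and targets and the third all-zero row are illustrative, not from the original answer.

```python
import pandas as pd

# Question's data plus a hypothetical third row where no column matches.
df = pd.DataFrame({'a': [1, 5, 0], 'b': [2, 4, 0], 'c': [3, 3, 0],
                   'd': [4, 2, 0], 'e': [5, 1, 0]})

cols = ['a', 'b', 'c', 'd', 'e']             # columns in priority order
targets = list(range(1, len(cols) + 1))      # assumed target: 1-based position

# Boolean mask: True where a column holds its target value.
m = df[cols].eq(targets)

# Keep matches, back-fill along each row so the first match lands in the
# first column, then replace "no match" (NaN) with 0 like the else branch.
df['result'] = (df[cols].where(m)
                        .bfill(axis=1)
                        .iloc[:, 0]
                        .fillna(0)
                        .astype(int))

print(df['result'].tolist())  # [1, 3, 0]
```

Selecting df[cols] explicitly keeps the mask independent of any other columns that may already be in the frame.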