Home > Software engineering >  How to convert excel if else condition in python dataframe column
How to convert excel if else condition in python dataframe column

Time:02-18

I am trying to convert excel if else condition in python dataframe columns, can anyone help me out in this:

Input: df

   Name1  Name2  Name3  Name4  Value1  Value2  Value3  Value4  MaxValue
0  John1  John2  John3  John4      10       3       5       7        10
1  Sony1  Sony2  Sony3  Sony4       2      12       4       8        12
2  Mark1  Mark2  Mark3  Mark4       5      13       0       3        13
3  Biky1  Biky2  Biky3  Biky4       7       7       5      44        44
4  Rose1  Rose2  Rose3  Rose4       7       0       9       7         9

Name values may not be ended with 1/2/3 etc this may have different name also.

Output: How to calculate the Final_Name column

   Name1  Name2  Name3  Name4  Value1  Value2  Value3  Value4  MaxValue Final_Name
0  John1  John2  John3  John4      10       3       5       7        10      John1
1  Sony1  Sony2  Sony3  Sony4       2      12       4       8        12      Sony2
2  Mark1  Mark2  Mark3  Mark4       5      13       0       3        13      Mark2
3  Biky1  Biky2  Biky3  Biky4       7       7       5      44        44      Biky4
4  Rose1  Rose2  Rose3  Rose4       7       0       9       7         9      Rose3

In excel we, can write something like this:

=IF(I2=H2,D2,IF(I2=G2,C2,IF(I2=F2,B2,IF(I2=E2,A2,""))))

enter image description here

CodePudding user response:

You can first filter the df into two parts , then we use the value position locate the Name

v = df.filter(regex = '^Value')
name = df.filter(regex = '^Name')
df['out'] = name.values[df.index, v.columns.get_indexer(v.idxmax(1))]
df
Out[188]: 
   Name1  Name2  Name3  Name4  Value1  Value2  Value3  Value4  MaxValue    out
0  John1  John2  John3  John4      10       3       5       7        10  John1
1  Sony1  Sony2  Sony3  Sony4       2      12       4       8        12  Sony2
2  Mark1  Mark2  Mark3  Mark4       5      13       0       3        13  Mark2
3  Biky1  Biky2  Biky3  Biky4       7       7       5      44        44  Biky4
4  Rose1  Rose2  Rose3  Rose4       7       0       9       7         9  Rose3

CodePudding user response:

You can first create a column that will show you which 'Name' column should you return back using idxmax(). Then you can stack() your 'Name' columns and merge this result with your created column above based on index and 'Name':

# Create a helper column
v_c = [c for c in df if c.startswith('Value')]
df['id_col'] = df[v_c].idxmax(axis=1).str.replace('Value','Name')

# Merge the helper column with your stacked 'Name' columns
n_c = df.filter(like='Name').columns
res = pd.merge(df[n_c].stack().reset_index(),df[['id_col']].reset_index(),left_on=['level_0','level_1'], right_on=['index','id_col'])[0]

# Assign as a column
df['Final_Name'] = res

prints:

   Name1  Name2  Name3  Name4  ...  Value4  MaxValue  id_col  Final_Name
0  John1  John2  John3  John4  ...       7        10   Name1       John1
1  Sony1  Sony2  Sony3  Sony4  ...       8        12   Name2       Sony2
2  Mark1  Mark2  Mark3  Mark4  ...       3        13   Name2       Mark2
3  Biky1  Biky2  Biky3  Biky4  ...      44        44   Name4       Biky4
4  Rose1  Rose2  Rose3  Rose4  ...       7         9   Name3       Rose3

[5 rows x 11 columns]
  • Related