I am trying to convert excel if else condition in python dataframe columns, can anyone help me out in this:
Input: df
Name1 Name2 Name3 Name4 Value1 Value2 Value3 Value4 MaxValue
0 John1 John2 John3 John4 10 3 5 7 10
1 Sony1 Sony2 Sony3 Sony4 2 12 4 8 12
2 Mark1 Mark2 Mark3 Mark4 5 13 0 3 13
3 Biky1 Biky2 Biky3 Biky4 7 7 5 44 44
4 Rose1 Rose2 Rose3 Rose4 7 0 9 7 9
Name values may not be ended with 1/2/3 etc this may have different name also.
Output: How to calculate the Final_Name column
Name1 Name2 Name3 Name4 Value1 Value2 Value3 Value4 MaxValue Final_Name
0 John1 John2 John3 John4 10 3 5 7 10 John1
1 Sony1 Sony2 Sony3 Sony4 2 12 4 8 12 Sony2
2 Mark1 Mark2 Mark3 Mark4 5 13 0 3 13 Mark2
3 Biky1 Biky2 Biky3 Biky4 7 7 5 44 44 Biky4
4 Rose1 Rose2 Rose3 Rose4 7 0 9 7 9 Rose3
In excel we, can write something like this:
=IF(I2=H2,D2,IF(I2=G2,C2,IF(I2=F2,B2,IF(I2=E2,A2,""))))
CodePudding user response:
You can first filter
the df into two parts , then we use the value position locate the Name
v = df.filter(regex = '^Value')
name = df.filter(regex = '^Name')
df['out'] = name.values[df.index, v.columns.get_indexer(v.idxmax(1))]
df
Out[188]:
Name1 Name2 Name3 Name4 Value1 Value2 Value3 Value4 MaxValue out
0 John1 John2 John3 John4 10 3 5 7 10 John1
1 Sony1 Sony2 Sony3 Sony4 2 12 4 8 12 Sony2
2 Mark1 Mark2 Mark3 Mark4 5 13 0 3 13 Mark2
3 Biky1 Biky2 Biky3 Biky4 7 7 5 44 44 Biky4
4 Rose1 Rose2 Rose3 Rose4 7 0 9 7 9 Rose3
CodePudding user response:
You can first create a column that will show you which 'Name' column should you return back using idxmax()
. Then you can stack()
your 'Name' columns and merge this result with your created column above based on index and 'Name':
# Create a helper column
v_c = [c for c in df if c.startswith('Value')]
df['id_col'] = df[v_c].idxmax(axis=1).str.replace('Value','Name')
# Merge the helper column with your stacked 'Name' columns
n_c = df.filter(like='Name').columns
res = pd.merge(df[n_c].stack().reset_index(),df[['id_col']].reset_index(),left_on=['level_0','level_1'], right_on=['index','id_col'])[0]
# Assign as a column
df['Final_Name'] = res
prints:
Name1 Name2 Name3 Name4 ... Value4 MaxValue id_col Final_Name
0 John1 John2 John3 John4 ... 7 10 Name1 John1
1 Sony1 Sony2 Sony3 Sony4 ... 8 12 Name2 Sony2
2 Mark1 Mark2 Mark3 Mark4 ... 3 13 Name2 Mark2
3 Biky1 Biky2 Biky3 Biky4 ... 44 44 Name4 Biky4
4 Rose1 Rose2 Rose3 Rose4 ... 7 9 Name3 Rose3
[5 rows x 11 columns]