how to apply Vlookup inside Vlookup formula using Python-CodePudding

I want to update dataframe 01 with the help of dataframe 02 and try to achieve the result as shown in the Output table.

Right now I am doing this with the help of Excel by applying vlookup inside vlookup [u can see in attached image on formula bar ] but it's a lengthy and time consuming process so I want to achieve this result using python.

step 01 : I apply vlookup inside dataframe 01 within B3 cell and map the value from dataframe 02 from column 3, and then once I got the value then I again apply the vlookup on the same value and map the value from dataframe 02 form column 2, and pull the employee code and then same. process for renaming columns to pull employee ids.

How I can achieve this using python.enter image description here

|Dataframe 01|

DPNO	UNIT MANAGER'S EMP_ID	Sales Manager's EMP_ID
DP - 1183800
DP - 1183800
DP - 1143367
DP - 1171318
DP - 808400
DP - 1143367
DP - 1204783

|Dataframe 02|

DP No	Employee Code	Unit Manager's Dp No	Sales Manager's Dp No
DP - 1080416	INV879
DP - 1115172	INV944		DP - 1080416
DP - 1851410	INV1426		DP - 1080416
DP - 808400		DP - 1851410	DP - 1080416
DP - 1143367		DP - 1115172	DP - 1080416
DP - 1171318		DP - 1115172	DP - 1080416
DP - 1183800		DP - 1115172	DP - 1080416
DP - 1204783		DP - 1115172	DP - 1080416

|I want this type of OUTPUT Result|

DPNO	UNIT MANAGER'S EMP_ID	Sales Manager's EMP_ID
DP - 1183800	INV944	INV879
DP - 1183800	INV944	INV879
DP - 1143367	INV944	INV879
DP - 1171318	INV944	INV879
DP - 808400	INV1426	INV879
DP - 1143367	INV944	INV879
DP - 1204783	INV944	INV879

CodePudding user response：

You can solve this problem by doing a self-merge between dataframe #2 and itself. Specifically, you ask Pandas to match values in the "Unit Manager's Dp No" column with values in the "DP No" column. Once you have this mapping of DP No to managers employee ID, you can merge between dataframe #1 and the mapping.

It could also be done in the other order, but this seemed simpler to me.

Here's what the data looks like when imported into Pandas:

df1:

           DPNO  UNIT MANAGER'S EMP_ID  Sales Manager's EMP_ID
0  DP - 1183800                    NaN                     NaN
1  DP - 1183800                    NaN                     NaN
2  DP - 1143367                    NaN                     NaN
3  DP - 1171318                    NaN                     NaN
4   DP - 808400                    NaN                     NaN
5  DP - 1143367                    NaN                     NaN
6  DP - 1204783                    NaN                     NaN

df2:

          DP No Employee Code Unit Manager's Dp No Sales Manager's Dp No
0  DP - 1080416        INV879                  NaN                   NaN
1  DP - 1115172        INV944                  NaN          DP - 1080416
2  DP - 1851410       INV1426                  NaN          DP - 1080416
3   DP - 808400           NaN         DP - 1851410          DP - 1080416
4  DP - 1143367           NaN         DP - 1115172          DP - 1080416
5  DP - 1171318           NaN         DP - 1115172          DP - 1080416
6  DP - 1183800           NaN         DP - 1115172          DP - 1080416
7  DP - 1204783           NaN         DP - 1115172          DP - 1080416

Here's the code which does the lookup:

def lookup_manager(manager_df, output_col_name, type_="unit"):
    if type_ == "unit":
        lookup_col = "Unit Manager's Dp No"
    elif type_ == "sales":
        lookup_col = "Sales Manager's Dp No"
    else:
        raise Exception(f"Unknown manager type {type_}")
    manager_df = manager_df.merge(manager_df, how="left", left_on=lookup_col, right_on="DP No")
    manager_df = manager_df[["DP No_x", "Employee Code_y"]]
    manager_df.columns = ["DPNO", output_col_name]
    return manager_df

unit_managers = lookup_manager(df2, "UNIT MANAGER'S EMP_ID", "unit")
sales_managers = lookup_manager(df2, "Sales Manager's EMP_ID", "sales")
df_combined = df1[['DPNO']]
df_combined = df_combined.merge(unit_managers, how="left", left_on="DPNO", right_on="DPNO")
df_combined = df_combined.merge(sales_managers, how="left", left_on="DPNO", right_on="DPNO")

Here's the output:

           DPNO UNIT MANAGER'S EMP_ID Sales Manager's EMP_ID
0  DP - 1183800                INV944                 INV879
1  DP - 1183800                INV944                 INV879
2  DP - 1143367                INV944                 INV879
3  DP - 1171318                INV944                 INV879
4   DP - 808400               INV1426                 INV879
5  DP - 1143367                INV944                 INV879
6  DP - 1204783                INV944                 INV879