Home > Blockchain >  VlookUp in python and select certain columns
VlookUp in python and select certain columns

Time:08-22

I want to perform vlookup on two excel files, and select only certain column in second excel file

Here's Book1.xlsx

BILL_ID     APPROVED_BY BILL_DESCRIPTION      BILLED_AMOUNT BILL_CONTROL_NUMBER

163191467   111         Loan Repayment        6792731.25    991110749346 

167840287   111         Loan Repayment        9899385       991110840923 

166618791   111         Loan Repayment        4098285       991110809688 

166618796   111         Loan Repayment        960521.25     991110809690 

164074843   111         Loan Repayment        18415107      991110772113 

173800261   18          LOAN REPAYMENT        93000000      991110852243 

163874933   111         Loan Repayment        5845935       991110767830 

167256844   111         Loan Repayment        4472520       991110827054 

180746273   7           LOAN REPAYMENT        6948435.7     991110852786 

Here's Book2.xlsx

Name    Control_number  Start_date        end_date

Tony    991110749346    1/31/2031 23:59   3/24/2021 15:03

Alex    991110759417    7/23/2030 23:59   7/23/2020 19:52

Jamila  991110849017    9/8/2030 23:59    9/10/2020 9:55

Fred    991110812010    8/19/2030 23:59   8/19/2020 0:26

Rosena  991110809690    6/11/2021 23:59   5/12/2021 17:41

Thobias 991110854526    6/10/2021 23:59   5/11/2021 8:00

George  991110822870    8/23/2030 23:59   8/23/2020 16:35

Foreman 991110831014    7/5/2030 23:59    8/26/2020 9:55

Tyson   991110852552    11/12/2030 23:59  11/14/2020 13:55

Genry   991110852243    8/23/2030 23:59   8/23/2020 20:04

Gastory 991110853244    12/28/2030 23:59  2/18/2021 14:16

Shebby  991110827054    6/6/2021 23:59    5/7/2021 11:11

Here's desired output

BILL_ID     APPROVED_BY BILL_DESCRIPTION    BILLED_AMOUNT   BILL_CONTROL_NUMBER  Name   Control_number

163191467   111         Loan Repayment      6792731.25      991110749346         Tony   991110749346 

166618796   111         Loan Repayment      960521.25       991110809690         Rosena 991110809690 

173800261   18          LOAN REPAYMENT      93000000        991110852243         Genry  991110852243 

167256844   111         Loan Repayment      4472520         991110827054         Shebby 991110827054 

Here's my code

import pandas as pd
df1 = pd.read_excel('Book1.xlsx')
df2 = pd.read_excel('Book2.xlsx')
df1.head()
df2.head()
inner_join = pd.merge(df1,df2,left_on='BILL_CONTROL_NUMBER',right_on='Control_number')

CodePudding user response:

Change the line

inner_join = pd.merge(df1,df2,left_on='BILL_CONTROL_NUMBER',right_on='Control_number')

to this

inner_join = pd.merge(df1,df2[['Name','Control_number']],left_on='BILL_CONTROL_NUMBER',right_on='Control_number')

You can specify columns you want using [['column1','column2','etc..']]

  • Related