I have two dataframes
ClientFileDf:
Date Clearing Member PAN Trading Member PAN CPCode CPPAN ClientPAN
0 1/10/2021 AAACS0581R AAACS0581R NaN NaN NaN
1 1/10/2021 AAACS0581R AAACS0581R NCDXAGR02 AACCG5797M AACPB3698K
2 1/10/2021 AAACS0581R AADCP2756J NaN NaN NaN
3 1/10/2021 AAACS0581R AAACS0581R NCDXAGR04 AACCG5797S AACPB3698K
4 1/10/2021 AAACS0581R AAACS0581R NCDXAGR06 AACCG5797M NaN
and
df_CPCodeDF:
HEDGE_CP_CODE HEDGE_PAN_NO
0 NCDXAGR02 AACCG5797M
1 NCDXAGR04 AACCG5797S
when I do Inner merge with columns I get resultant dataframe as below having index 0 and 1
Date Clearing Member PAN Trading Member PAN CPCode CPPAN ClientPAN
0 1/10/2021 AAACS0581R AAACS0581R NCDXAGR02 AACCG5797M AACPB3698K
1 1/10/2021 AAACS0581R AAACS0581R NCDXAGR04 AACCG5797S AACPB3698K
While I need is index as 1 and 3 respectively same as first dataframe
Required Df:
Date Clearing Member PAN Trading Member PAN CPCode CPPAN ClientPAN
1 1/10/2021 AAACS0581R AAACS0581R NCDXAGR02 AACCG5797M AACPB3698K
3 1/10/2021 AAACS0581R AAACS0581R NCDXAGR04 AACCG5797S AACPB3698K
I know index is ignored when doing merge with columns but is there a workaround or something I am missing ?
MY Code:
df2 = pd.merge(ClientFileDf, df_CPCodeDF, how='inner', left_on=['CPCode','CPPAN'], right_on = ['HEDGE_CP_CODE','HEDGE_PAN_NO'])
df2.drop(['HEDGE_CP_CODE','HEDGE_PAN_NO'],axis=1, inplace=True)
print(df2)
CodePudding user response:
Use reset_index()
to keep the index of ClientFileDf
and then set that index:
df2 = pd.merge(ClientFileDf.reset_index(), df_CPCodeDF, how='inner', \
left_on=['CPCode','CPPAN'], \
right_on = ['HEDGE_CP_CODE','HEDGE_PAN_NO']).set_index('index')