Home > Blockchain >  Pandas. Mergins with duplicates
Pandas. Mergins with duplicates

Time:08-04

I have two DFs.

df1 is smth like:

pins     date       transaction
UANS12   01.02.03   560
UANS12   03.04.03   976
UJJJS9   03.02.04   876

df2 is smth like:

pins     registration_ date 
UANS12   01.02.03
UJJJS9   09.01.01

I need to compare pins in df1 with registration date in df2. And add a new column in df1 with regdate. I tried to use:

new = pd.merge(df1, df2[["registration_date"]], on="pins", how="right")

but got error :

   # Check for duplicates
   KeyError: 'pins'

CodePudding user response:

hi you re trying to merge df1 with a dataframe that not contains 'pins' columns when you use df2[["registration_date"]] you create a dataframe with a single column instead you need to use df2

CodePudding user response:

Should be merge() on how='left'

df1 = df1.merge(df2, on='pins', how='left')
###
     pins      date  transaction registration_date
0  UANS12  01.02.03          560          01.02.03
1  UANS12  03.04.03          976          01.02.03
2  UJJJS9  03.02.04          876          09.01.01

or by join()

df1['registration_date'] = df1.set_index('pins').join(df2.set_index('pins'), on='pins', how='left')['registration_date'].values
###
     pins      date  transaction registration_date
0  UANS12  01.02.03          560          01.02.03
1  UANS12  03.04.03          976          01.02.03
2  UJJJS9  03.02.04          876          09.01.01

or by map()

df1['registration_date'] = df1['pins'].map(dict(zip(df2['pins'], df2['registration_date'])))
df1
###
     pins      date  transaction registration_date
0  UANS12  01.02.03          560          01.02.03
1  UANS12  03.04.03          976          01.02.03
2  UJJJS9  03.02.04          876          09.01.01
  • Related