I am getting a KeyError: 'Cust_id_2'
when I try to merge the following dataframes.
df =
Cust_id year is_sub
0 4 1516 is_sub
1 4 1920 is_sub
2 4 1819 is_sub
3 4 1718 is_sub
4 4 1617 is_sub
df2 =
Cust_id_2 year_freq_score
0 4 9.0
1 5 6.0
2 7 10.0
3 8 2.0
4 10 1.0
Most recently I have tried this:
result = pd.merge(
df,
df2[['year_freq_score']],
how='left',
left_on='Cust_id',
right_on='Cust_id_2'
)
df has 14,000 rows. df2 has 3,000 rows. df2 is a pivot table derived from df.
My first version had the Cust_id as the index of df2 and i tried to use 'right_index=True' which gave a KeyError.
I then reset the index and used 'on' columns having the same name (on='Cust_id
) which gave KeyError: 'Cust_id'
.
I then and changed df2 to ''Cust_id_2'' to isolate where the error was coming from and now receive 'KeyError: 'Cust_id_2''.
I've read through multiple posts on 'KeyError' but have not found (or understood) the solution to this issue.
Any help or pointers in the right direction greatly appreciated.
CodePudding user response:
You slice df2
to only keep year_freq_score
, so there is no more Cust_id_2
column for the merge.
Do instead:
result = pd.merge(
df,
df2,
how='left',
left_on='Cust_id',
right_on='Cust_id_2'
)