Home > Software engineering >  Why would a key double after a merge?
Why would a key double after a merge?

Time:11-01

I merge two datasets using following code:

output1 = classen_gravity.merge(fdi, how= "left", right_on=["Country", "Year", "regime"],
                               left_on=["country", "regime", "year"])

"regime" is supposed to stay as one column but after a merge, it becomes regime_x and regime_y.

Why would it happen?

CodePudding user response:

Is regime in both dataframes? x and y columns are created when the same column is in both dataframes

CodePudding user response:

Make sure that the order of the columns is the same:

output1 = classen_gravity.merge(
   fdi,
   how= "left",
   right_on=["Country", "Year", "regime"],
   left_on=["country", "year", "regime"] # this was ["country", "regime", "year"]
)

To avoid problems with the column naming, it's best to impose uniform naming:

fdi.columns = fdi.columns.str.lower()
output2 = classen_gravity.merge(
   fdi,
   how= "left",
   on=["country", "year", "regime"],
)
  • Related