Merging two dataframes, trying to match a cell value to a column name - Pandas

Time:11-27

I have two dataframes with country data; small examples follow. One is a "vertical" (long) table and the other is a "horizontal" (wide) table.

df1

Code Year
ALB  2000
ALB  2001
ALB  2002
ARG  2000
ARG  2001
ARG  2002

df2 values (I believe the column headers are strings, hence the quotes)

Code '2000' '2001' '2002'
ALB    1m     2m     3m
ARG    2m     4m     6m

I want to "add" the df2 values to df1, so that the resulting dataframe looks like this:

Code Year Value
ALB  2000  1m
ALB  2001  2m
ALB  2002  3m
ARG  2000  2m
ARG  2001  4m
ARG  2002  6m

I tried a merge. This is obviously not going to work as df2 doesn't have a year column.

df_new = df1.merge(df2, on=['Code', 'Year'], how='left')

I also tried a concat. That didn't work either; it just produced NaN in the columns.

df_new = pd.concat([df1, df2])

Answer:

You can use stack to reshape df2 into a long table, then merge it with df1.
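To see what the reshaping step does on its own, here is a minimal sketch. It assumes df2 is rebuilt from the sample in the question with string column headers; the name long_df2 is only illustrative.

```python
import pandas as pd

# Sample df2 rebuilt from the question; the year headers are assumed to be strings.
df2 = pd.DataFrame({
    "Code": ["ALB", "ARG"],
    "2000": ["1m", "2m"],
    "2001": ["2m", "4m"],
    "2002": ["3m", "6m"],
})

# stack() moves the year columns into the inner level of the row index,
# giving one row per (Code, year) pair.
long_df2 = df2.set_index("Code").stack().reset_index()
long_df2.columns = ["Code", "Year", "Value"]  # illustrative column names

print(long_df2)
```

At this point Year still holds strings such as "2000", which is why the full answer converts it with astype(int) before merging.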

Try this:

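out = (
    df1.merge(
        df2.set_index("Code")
           .stack()        # wide to long: one row per (Code, year)
           .reset_index()
           .rename(columns={"level_1": "Year", 0: "Value"})
           # the year headers are strings; convert them so they match df1["Year"]
           .assign(Year=lambda x: x["Year"].astype(int)),
        on=["Code", "Year"],
        how="left",
    )
)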

# Output:

print(out)

  Code  Year Value
0  ALB  2000    1m
1  ALB  2001    2m
2  ALB  2002    3m
3  ARG  2000    2m
4  ARG  2001    4m
5  ARG  2002    6m
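
The same result can probably also be reached with melt instead of stack, which some people find easier to read. This is only a sketch: both frames are rebuilt from the samples in the question, df1's Year is assumed to be an integer column, and long_df2 is an illustrative name.

```python
import pandas as pd

# Sample frames rebuilt from the question (assumption: df1["Year"] holds integers,
# df2's year headers are strings).
df1 = pd.DataFrame({
    "Code": ["ALB"] * 3 + ["ARG"] * 3,
    "Year": [2000, 2001, 2002] * 2,
})
df2 = pd.DataFrame({
    "Code": ["ALB", "ARG"],
    "2000": ["1m", "2m"],
    "2001": ["2m", "4m"],
    "2002": ["3m", "6m"],
})

# melt() turns the year columns into rows, naming the new columns directly.
long_df2 = df2.melt(id_vars="Code", var_name="Year", value_name="Value")
long_df2["Year"] = long_df2["Year"].astype(int)  # strings -> int, to match df1

# A left merge keeps every row of df1, in df1's order.
out = df1.merge(long_df2, on=["Code", "Year"], how="left")
print(out)
```

With melt there is no need to rename "level_1" and 0 afterwards, since var_name and value_name set the column names up front.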