Home > OS >  Merging and Concating using pandas
Merging and Concating using pandas

Time:08-11

Hi I have 2 dataframes as below:

key Apple Banana
abc 1 12
bcd 23 21
key Train Car
abc 11 20
jkn 2 19

I want to merge these 2 dataframes together with my key column so that I can get following table:

key Train Car Banana Apple
abc 11 20 12 1
jkn 2 19 0/NA 0/NA
bcd 0/NA 0/NA 21 23

For columns where I don't have any record like for jkn / Apple either 0 or NA should be printed.

Currently I tried using pd.concat but am not exactly able to figure out how to get my desired result.

CodePudding user response:

Use pd.merge() with how='outer', read further in documentation:

import pandas as pd
import io

data_string = """key    Apple   Banana
abc 1   12
bcd 23  21
"""

df1 = pd.read_csv(io.StringIO(data_string), sep='\s ')

data_string = """key    Train   Car
abc 11  20
jkn 2   19"""
df2 = pd.read_csv(io.StringIO(data_string), sep='\s ')

# Solution
df_result = pd.merge(df1, df2, on=['key'], how='outer')
print(df_result)
   key  Apple  Banana  Train   Car
0  abc    1.0    12.0   11.0  20.0
1  bcd   23.0    21.0    NaN   NaN
2  jkn    NaN     NaN    2.0  19.0

CodePudding user response:

Let's try concat then groupby.sum

out = (pd.concat([df1, df2], ignore_index=True)
       .groupby('key', as_index=False).sum())
print(out)

   key  Apple  Banana  Train   Car
0  abc    1.0    12.0   11.0  20.0
1  bcd   23.0    21.0    0.0   0.0
2  jkn    0.0     0.0    2.0  19.0
  • Related