Home > Software engineering >  The best way to combine 2 dfs
The best way to combine 2 dfs

Time:10-24

I have 2 dataframes with the same structure but with different data.

# DF1:
a b c
8 x x
2 x x
3 x x
1 x x
# DF2:
a b c
6 x x
3 x x
9 x x
8 x x

I need to combine it by "a" value, like this:

# NEW DF:
a_1 b_1 c_1 a_2 b_2 c_2
8   x   x   8   x   x
3   x   x   3   x   x

I can do this by writing a lot of functions and renaming the columns hardcode, but I'm sure it can be done more simply. Thanks!

CodePudding user response:

First you need find intersection between two DataFrame then You can use pandas.DataFrame.merge and set suffixes as you want like below:

>>> df1 = pd.DataFrame({'a': [1, 2, 3, 8], 'b': ['x', 'x', 'x', 'x'], 'c': ['x', 'x', 'x', 'x']}) 
>>> df2 = pd.DataFrame({'a': [8, 7, 3, 21], 'b': ['x', 'x', 'x', 'x'], 'c': ['x', 'x', 'x', 'x']})

>>> lst = set(df1['a']).intersection(df2['a'])
>>> df1 = df1[df1['a'].isin(lst)]
>>> df2 = df2[df2['a'].isin(lst)]

>>> df1.merge(df2, on='a', suffixes=('_1', '_2'))

    a   b_1  c_1    b_2  c_2
0   3   x    x      x    x
1   8   x    x      x    x
  • Related