I have 2 dataframes with the same structure but with different data.
# DF1:
a b c
8 x x
2 x x
3 x x
1 x x
# DF2:
a b c
6 x x
3 x x
9 x x
8 x x
I need to combine it by "a" value, like this:
# NEW DF:
a_1 b_1 c_1 a_2 b_2 c_2
8 x x 8 x x
3 x x 3 x x
I can do this by writing a lot of functions and renaming the columns hardcode, but I'm sure it can be done more simply. Thanks!
CodePudding user response:
First you need find intersection
between two DataFrame then You can use pandas.DataFrame.merge
and set suffixes
as you want like below:
>>> df1 = pd.DataFrame({'a': [1, 2, 3, 8], 'b': ['x', 'x', 'x', 'x'], 'c': ['x', 'x', 'x', 'x']})
>>> df2 = pd.DataFrame({'a': [8, 7, 3, 21], 'b': ['x', 'x', 'x', 'x'], 'c': ['x', 'x', 'x', 'x']})
>>> lst = set(df1['a']).intersection(df2['a'])
>>> df1 = df1[df1['a'].isin(lst)]
>>> df2 = df2[df2['a'].isin(lst)]
>>> df1.merge(df2, on='a', suffixes=('_1', '_2'))
a b_1 c_1 b_2 c_2
0 3 x x x x
1 8 x x x x