I have two data frames with different number of both rows and columns. Some indices and columns are common.
df1
A B C
a 1 0 0
b 2 3 4
c 4 5 6
df2
A C D E
a 1 0 2 3
b 2 4 2 4
d 1 4 5 6
e 1 2 3 4
The result should be
A B C D E
a 1 0 0 2 3
b 2 3 4 2 4
c 4 5 6 NA NA
d 1 NA 4 5 6
e 1 NA 2 3 4
I tried "join", "merge", and "concat" but it wasn't successful. Would you help me out?
Thanks.
CodePudding user response:
You can use combine_first
df = df1.combine_first(df2)
Result
A B C D E
a 1 0.0 0 2.0 3.0
b 2 3.0 4 2.0 4.0
c 4 5.0 6 NaN NaN
d 1 NaN 4 5.0 6.0
e 1 NaN 2 3.0 4.0
CodePudding user response:
You are looking for bind_rows
from dplyr
package :
df=data.frame(A=c(1,2,3),B=c(2,3,6))
df2=data.frame(A=c(1,1,1,2),B=c(4,5,7,4),C=c(1,1,1,1))
dplyr::bind_rows(df,df2)
Result :
A B C
1 1 2 NA
2 2 3 NA
3 3 6 NA
4 1 4 1
5 1 5 1
6 1 7 1
7 2 4 1
CodePudding user response:
Here is a way to do that without using combine_first
pd.concat([df.stack(), df1.stack()]).reset_index().drop_duplicates().pivot("level_0", "level_1", 0)
Results:
level_1 A B C D E
level_0
a 1.0 0.0 0.0 2.0 3.0
b 2.0 3.0 4.0 2.0 4.0
c 4.0 5.0 6.0 NaN NaN
d 1.0 NaN 4.0 5.0 6.0
e 1.0 NaN 2.0 3.0 4.0
CodePudding user response:
Another possible solution:
(pd.merge(df1.reset_index(), df2.reset_index(), how='outer')
.set_index('index').rename_axis(None))
Output:
A B C D E
a 1 0.0 0 2.0 3.0
b 2 3.0 4 2.0 4.0
c 4 5.0 6 NaN NaN
d 1 NaN 4 5.0 6.0
e 1 NaN 2 3.0 4.0