I am relatively new to python and I am wondering how I can merge these two tables and preserve both their values?
Consider these two tables:
df = pd.DataFrame([[1, 3], [2, 4],[2.5,1],[5,6],[7,8]], columns=['A', 'B'])
A B
1 3
2 4
2.5 1
5 6
7 8
df2 = pd.DataFrame([[1],[2],[3],[4],[5],[6],[7],[8]], columns=['A'])
A
1
2
...
8
I want to obtain the following result:
A B
1 3
2 4
2.5 1
3 NaN
4 NaN
5 6
6 NaN
7 8
8 NaN
You can see that column A includes all values from both the first and second dataframe in an ordered manner. I have attempted:
pd.merge(df,df2,how='outer')
pd.merge(df,df2,how='right')
But the former does not result in an ordered dataframe and the latter does not include rows that are unique to df.
CodePudding user response:
Let us do concat
then drop_duplicates
out = pd.concat([df2,df]).drop_duplicates('A',keep='last').sort_values('A')
Out[96]:
A B
0 1.0 3.0
1 2.0 4.0
2 2.5 1.0
2 3.0 NaN
3 4.0 NaN
3 5.0 6.0
5 6.0 NaN
4 7.0 8.0
7 8.0 NaN