I am joining two data frames that have the same columns. I want to update the first data frame with values from the second, but my code creates additional columns instead of updating the existing ones.
My code:
import pandas as pd

left = pd.DataFrame({"key": ["K0", "K1", "K2", "K3"],
                     "A": ["NaN", "NaN", "NaN", "NaN"],
                     "B": ["B0", "B1", "B2", "B3"]})
right = pd.DataFrame({"key": ["K1", "K2", "K3"],
                      "A": ["C1", "C2", "C3"],
                      "B": ["B1", "B2", "B3"]})
result = pd.merge(left, right, on="key", how="left")
Present output:
result =
  key  A_x B_x  A_y  B_y
0  K0  NaN  B0  NaN  NaN
1  K1  NaN  B1   C1   B1
2  K2  NaN  B2   C2   B2
3  K3  NaN  B3   C3   B3
Expected output:
result =
  key   B    A
0  K0  B0  NaN
1  K1  B1   C1
2  K2  B2   C2
3  K3  B3   C3
CodePudding user response:
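Use combine_first, which keeps the values in left and fills only its missing entries with the matching rows from right (aligned on key). It fills real missing values only, so the "NaN" strings in the question's left frame have to be converted to NaN first:
import numpy as np
result = left.replace("NaN", np.nan).set_index("key").combine_first(right.set_index("key")).reset_index()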
print(result)
Output:
  key    A   B
0  K0  NaN  B0
1  K1   C1  B1
2  K2   C2  B2
3  K3   C3  B3
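
For reference, here is a self-contained sketch of the same approach. It assumes the "NaN" entries in left are meant to be real missing values (np.nan) rather than the literal string "NaN", because combine_first only fills entries that are actually missing. The final column selection is only there to match the key, B, A order from the question's expected output.

```python
import numpy as np
import pandas as pd

# Assumption: column A in `left` holds real missing values, not the string "NaN".
left = pd.DataFrame({
    "key": ["K0", "K1", "K2", "K3"],
    "A": [np.nan, np.nan, np.nan, np.nan],
    "B": ["B0", "B1", "B2", "B3"],
})
right = pd.DataFrame({
    "key": ["K1", "K2", "K3"],
    "A": ["C1", "C2", "C3"],
    "B": ["B1", "B2", "B3"],
})

# Align both frames on "key"; values in `left` win, and only its
# missing entries are filled from `right`.
result = (
    left.set_index("key")
    .combine_first(right.set_index("key"))
    .reset_index()
)

# Reorder columns to match the expected output in the question.
result = result[["key", "B", "A"]]
print(result)
```

Unlike pd.merge, this does not produce A_x / A_y suffixes, because the two frames are combined column by column instead of being joined side by side.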