I have df1:
        x             y no.
0   -17.7     -0.785430  y1
1   -15.0  -3820.085000  y4
2   -12.5      2.138833  y3
..   ....      ........  ..
40   15.6      5.486901  y2
41   19.2      1.980686  y3
42   19.6      9.364718  y2
and df2:
     delta y      x
0   0.053884  -17.7
1   0.085000  -15.0
2   0.143237  -12.5
..  ........   ....
40  0.113099   15.6
41  0.102245   19.2
42  0.235282   19.6
They both have 43 rows, and the x column is exactly the same in both.
Somehow, when I merge them on x, I get a df with 123 rows:
         x             y no.   delta y
0    -17.7     -0.785430  y1  0.053884
1    -15.0  -3820.085000  y4  0.085000
2    -12.5      2.138833  y3  0.143237
3    -12.4      1.721205  y3  0.251180
4    -12.1      2.227343  y2  0.127343
..     ...           ...  ..       ...
118   12.1      1.642526  y3  0.143886
119   14.4   2576.435000  y4  0.171000
120   15.6      5.486901  y2  0.113099
121   19.2      1.980686  y3  0.102245
122   19.6      9.364718  y2  0.235282
My code:
final = df1.merge(df2, on="x")
CodePudding user response:
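Try joining on the index instead: df1.join(df2.drop(columns="x")). A plain df1.join(df2) raises a ValueError here, because both frames contain an x column and no suffix is specified. Joining on the index assumes the rows of both frames are in the same order.
join: column-wise, a left join on the index by default
pd.merge: column-wise, an inner join on the shared columns by default
pd.concat: row-wise by default (axis=0), an outer join along the other axis
pd.concat takes an iterable of objects, so the DataFrames cannot be passed as separate arguments; pass a list instead, e.g. pd.concat([df1, df2]). The frames should have matching labels along the other axis; with the default outer join, labels missing from one frame are filled with NaN.
join and pd.merge take DataFrame arguments directly.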
ref: Merge two dataframes by index
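To make the difference concrete, here is a minimal sketch of the three options on small stand-in frames. The data below is hypothetical (only the first three rows of the question's frames, rounded), and it assumes both frames share the same default index and row order.

```python
import pandas as pd

# Hypothetical stand-ins for the question's df1 and df2 (first three rows only).
df1 = pd.DataFrame(
    {
        "x": [-17.7, -15.0, -12.5],
        "y": [-0.785430, -3820.085, 2.138833],
        "no.": ["y1", "y4", "y3"],
    }
)
df2 = pd.DataFrame(
    {
        "delta y": [0.053884, 0.085, 0.143237],
        "x": [-17.7, -15.0, -12.5],
    }
)

# join aligns on the index; drop the duplicated x column first,
# otherwise pandas raises "columns overlap but no suffix specified".
joined = df1.join(df2.drop(columns="x"))

# concat with axis=1 also aligns on the index and places the frames side by side.
side_by_side = pd.concat([df1, df2.drop(columns="x")], axis=1)

# merge matches rows by the values in x rather than by index position.
merged = df1.merge(df2, on="x")

print(joined)
print(side_by_side)
print(merged)
```

With unique x values all three give one row per input row. They only diverge when the key column contains repeats or the row order differs between the frames.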
CodePudding user response:
I think the x values in df1 and df2 might not be 100% identical, perhaps because of the decimals. I also encourage you to read the official pandas documentation on merging thoroughly. Try the following syntax:
import pandas as pd
left = pd.DataFrame(
    {
        "key": ["K0", "K1", "K2", "K3"],
        "A": ["A0", "A1", "A2", "A3"],
        "B": ["B0", "B1", "B2", "B3"],
    }
)
right = pd.DataFrame(
    {
        "key": ["K0", "K1", "K2", "K3"],
        "C": ["C0", "C1", "C2", "C3"],
        "D": ["D0", "D1", "D2", "D3"],
    }
)
result = pd.merge(left, right, on="key")
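Since the merged frame in the question has more rows than either input, it may also be worth checking for repeated x values. An inner merge with keys that fail to match can only drop rows, while repeated keys multiply them: every matching pair of rows is emitted. The sketch below uses small hypothetical frames (not the question's data) to show the effect and two ways to detect it.

```python
import pandas as pd

# Hypothetical data: x == 1.0 appears twice in each frame.
df1 = pd.DataFrame({"x": [1.0, 1.0, 2.0], "y": [10, 11, 12]})
df2 = pd.DataFrame({"x": [1.0, 1.0, 2.0], "delta y": [0.1, 0.2, 0.3]})

# Each of the two x == 1.0 rows on the left pairs with both on the right:
# 2 * 2 rows for x == 1.0 plus 1 row for x == 2.0.
merged = df1.merge(df2, on="x")
print(len(merged))  # 5

# Count how many rows repeat an x value that was already seen.
n_repeats = int(df1["x"].duplicated().sum())
print(n_repeats)  # 1

# validate="one_to_one" makes merge raise instead of silently multiplying rows.
try:
    df1.merge(df2, on="x", validate="one_to_one")
    is_one_to_one = True
except pd.errors.MergeError as exc:
    is_one_to_one = False
    print("merge keys are not unique:", exc)
```

If the rows of the two frames are already in the same order, joining on the index avoids the problem. Otherwise the repeated keys need to be made unique, for example by merging on x together with a second column that identifies each row.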