I have two dataframes. I want to update the first dataframe (df1) by adding the column values from the second dataframe (df2), matched on column A. I have already created the new column (C) at the index (which will be a variable).
df1
A B
100 3454
150 2343
200 7655
250 3454
300 4565
df2
A C
200 4565
250 6647
300 9865
350 4653
400 0776
result df
A    B     C
100  3454
150  2343
200  7655  4565
250  3454  6647
300  4565  9865
350        4653
400        0776
CodePudding user response:
You need an outer join:
import pandas as pd

df1 = pd.DataFrame({'A': {0: 100, 1: 150, 2: 200, 3: 250, 4: 300},
                    'B': {0: 3454, 1: 2343, 2: 7655, 3: 3454, 4: 4565}})
df2 = pd.DataFrame({'A': {0: 200, 1: 250, 2: 300, 3: 350, 4: 400},
                    'C': {0: 4565, 1: 6647, 2: 9865, 3: 4653, 4: 776}})
# how="outer" keeps every value of A that appears in either dataframe
df1.merge(df2, on=["A"], how="outer")
A B C
0 100 3454.0 NaN
1 150 2343.0 NaN
2 200 7655.0 4565.0
3 250 3454.0 6647.0
4 300 4565.0 9865.0
5 350 NaN 4653.0
6 400 NaN 776.0
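B and C come back as floats in the output above because the rows with no match are filled with NaN. If you would rather keep whole numbers, one option is to cast to pandas' nullable Int64 dtype after the merge. This is only a sketch under that assumption; the name result is illustrative and not from the question.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [100, 150, 200, 250, 300],
                    "B": [3454, 2343, 7655, 3454, 4565]})
df2 = pd.DataFrame({"A": [200, 250, 300, 350, 400],
                    "C": [4565, 6647, 9865, 4653, 776]})

# Outer join on A: rows missing from one side get NaN in that side's column.
result = df1.merge(df2, on="A", how="outer")

# Cast the float columns to the nullable integer dtype, so missing
# values show as <NA> and the rest stay integers.
result = result.astype({"B": "Int64", "C": "Int64"})
print(result)
```

Note that the leading zero in 0776 from the question cannot survive in a numeric column; keep C as strings if that formatting matters.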
CodePudding user response:
Although merge can do the job, you can also use join, which aligns on the index and may be more efficient when the dataset is huge.
df1.set_index('A').join(df2.set_index('A'), how='outer').reset_index()
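Here is a self-contained version of the line above, using the same sample data as the question, with the merge result built alongside it so you can compare the two. The names joined and merged are illustrative. Whether join is actually faster depends on your data, so it is worth timing both on your own frames.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [100, 150, 200, 250, 300],
                    "B": [3454, 2343, 7655, 3454, 4565]})
df2 = pd.DataFrame({"A": [200, 250, 300, 350, 400],
                    "C": [4565, 6647, 9865, 4653, 776]})

# Move A into the index on both sides, outer-join on that index,
# then turn A back into an ordinary column.
joined = df1.set_index("A").join(df2.set_index("A"), how="outer").reset_index()

# The merge-based equivalent, for comparison.
merged = df1.merge(df2, on="A", how="outer")

print(joined)
print(merged)
```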