I have two DataFrames
df1
:
col1_a col2_a
MAR1 R1
MAR2 R2
MAR3 R3
df2
:
col1_b col2_b
MAR1 R1
MAR2 R20
MAR3
MAR4 R4
MAR5 R5
and I want
col1_b col2_b
MAR1 R1
MAR2 R2
MAR3 R3
MAR4 R4
MAR5 R5
So in the case col1_a == col1_b, I want to replace col2_b with col2_a
Unfortunately, I do not have a solution that works. Can anyone help?
CodePudding user response:
Assuming
df1 = pd.DataFrame({'col1_a':['MAR1','MAR2','MAR3'], 'col2_a':['R1', 'R2', 'R3']})
df2 = pd.DataFrame({'col1_b':['MAR1','MAR2','MAR3', 'MAR4', 'MAR5'], 'col2_b':['R1', 'R20', None, 'R4', 'R5']})
Then you can achieve this with:
df3 = df2.merge(df1, how='left', left_on='col1_b', right_on='col1_a')
selection_condition = pd.isna(df3["col2_a"])
df3["col2_a"].iloc[selection_condition] = df3[selection_condition]["col2_b"]
And
df3[["col1_b", "col2_a"]]
will have your desired columns.
CodePudding user response:
You could map "col2_a" values to "col1_b". This creates a Series where there is a value if "col1"s match, NaN otherwise. Then fill the NaN values with "col2_b" values:
df2['col2_b'] = df2['col1_b'].map(df1.set_index('col1_a')['col2_a']).fillna(df2['col2_b'])
Output:
col1_b col2_b
0 MAR1 R1
1 MAR2 R2
2 MAR3 R3
3 MAR4 R4
4 MAR5 R5