Home > OS >  Changing value based on another value in pandas
Changing value based on another value in pandas

Time:02-12

I have two DataFrames

df1:

col1_a col2_a
MAR1 R1
MAR2 R2
MAR3 R3

df2:

col1_b col2_b
MAR1 R1
MAR2 R20
MAR3
MAR4 R4
MAR5 R5

and I want

col1_b col2_b
MAR1 R1
MAR2 R2
MAR3 R3
MAR4 R4
MAR5 R5

So in the case col1_a == col1_b, I want to replace col2_b with col2_a

Unfortunately, I do not have a solution that works. Can anyone help?

CodePudding user response:

Assuming

df1 = pd.DataFrame({'col1_a':['MAR1','MAR2','MAR3'], 'col2_a':['R1', 'R2', 'R3']})

df2 = pd.DataFrame({'col1_b':['MAR1','MAR2','MAR3', 'MAR4', 'MAR5'], 'col2_b':['R1', 'R20', None, 'R4', 'R5']})

Then you can achieve this with:

df3 = df2.merge(df1, how='left', left_on='col1_b', right_on='col1_a')
selection_condition = pd.isna(df3["col2_a"])
df3["col2_a"].iloc[selection_condition] = df3[selection_condition]["col2_b"]

And

df3[["col1_b", "col2_a"]]

will have your desired columns.

CodePudding user response:

You could map "col2_a" values to "col1_b". This creates a Series where there is a value if "col1"s match, NaN otherwise. Then fill the NaN values with "col2_b" values:

df2['col2_b'] = df2['col1_b'].map(df1.set_index('col1_a')['col2_a']).fillna(df2['col2_b'])

Output:

  col1_b col2_b
0   MAR1     R1
1   MAR2     R2
2   MAR3     R3
3   MAR4     R4
4   MAR5     R5
  • Related