Home > OS >  Np.where change value in column if another column value is in another dataframe column
Np.where change value in column if another column value is in another dataframe column

Time:11-18

Let me explain the structure of the problem that I'm trying to solve. Let's suppose that we have two dataframes

DF1:

ID Value
AA 2
AB 1
AC 2
AD 1
AE 2

DF2:

ID New Value
AA 1
AC 1

If the ID column row in DF1 is in DF2, then I would like to change the value in the same row in DF1 to the one that it has in DF2, so the end result would be something like this:

DF1:

ID Value
AA 1
AB 1
AC 1
AD 1
AE 2

So far, I have tried attempts with .loc and np.where but none of them where successful, my closest attempt is the following line of code:

DF1['Value'][row] = [DF2['New Value'][row] if ((DF1['ID'][row]).isin(DF2['ID'])) else DF1['Value'][row] for row in DF['ID']]

CodePudding user response:

here is one way to to do it using map

# set index on ID in DF2 and map to DF
# replace failed mapping with the value in DF
df['Value']=df['ID'].map(df2.set_index(['ID'])['New Value']).fillna(df['Value'])
df
    ID  Value
0   AA  1.0
1   AB  1.0
2   AC  1.0
3   AD  1.0
4   AE  2.0

CodePudding user response:

You can go straight with merge then ffill

Data:

df1 = pd.DataFrame({'name':['a','b','c'],
                   'val':[1,2,3]})
df2 = pd.DataFrame({'name':['a','c'],
                   'newval':[10,20]})

Merge df1 and df2

df = pd.merge(df1, df2, on='name', how='left')

Now you ffill (forward fill). This means you take two columns val and newval. Any missing value in newval is filled by value in val. The axis=1 means you fill by rows not by column

df[['val', 'newval']] = df[['val', 'newval']].ffill(axis=1)

CodePudding user response:

Given:

# df1

   ID  Value
0  AA      2
1  AB      1
2  AC      2
3  AD      1
4  AE      2

# df2

   ID  New Value
0  AA          1
1  AC          1

Doing:

# Set Indices
df1, df2 = [df.set_index('ID') for df in (df1, df2)]

# Use loc:
df1.loc[df2.index, 'Value'] = df2['New Value']

print(df1.reset_index())

Output:

   ID  Value
0  AA      1
1  AB      1
2  AC      1
3  AD      1
4  AE      2
  • Related