I have a DataFrame that looks similar to this:
from pandas import DataFrame
df = DataFrame(data={'ID': ['a','b','c','d'], 'col1':[1,2,3,4], 'col2':[5,6,7,8], 'col3':[9,10,11,12]})
I also have a dictionary like this:
mapper = {'a':100,'d':3}
Where a key in the dictionary matches the ID in the DataFrame, I want to replace the values in, say, col1 and col3 with the corresponding dictionary value. Currently I do this with a loop:
for key, val in mapper.items():
    df.loc[df['ID'] == key, 'col1'] = val
    df.loc[df['ID'] == key, 'col3'] = val
Is there a vectorised way to do this without a for loop? My DataFrame is large.
CodePudding user response:
You can use np.where to do this:
import numpy as np
df["col1"] = np.where(df["ID"].isin(mapper.keys()), df["ID"].map(mapper), df["col1"])
df["col3"] = np.where(df["ID"].isin(mapper.keys()), df["ID"].map(mapper), df["col3"])
np.where takes a condition as its first argument. The second argument supplies the values used where the condition is True, and the third supplies the values used where it is False. Looking at each argument separately shows how it works:
df['ID'].isin(mapper.keys()) # argument 1
# returns
0     True
1    False
2    False
3     True
Name: ID, dtype: bool
df["ID"].map(mapper) # argument 2
# returns
0    100.0
1      NaN
2      NaN
3      3.0
Name: ID, dtype: float64
df["col1"] # argument 3 (the original column, before assignment)
# returns
0    1
1    2
2    3
3    4
Name: col1, dtype: int64
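Putting the pieces together, here is a minimal self-contained sketch of the np.where approach, assuming the sample data from the question. The helper name `replace_from_mapper` and its `key` parameter are hypothetical, and the cast back to the original dtype is an addition: because `map` introduces NaN, `np.where` returns floats unless you convert back.

```python
import numpy as np
import pandas as pd

def replace_from_mapper(df, mapper, cols, key="ID"):
    """Hypothetical helper: overwrite `cols` with mapper[key] where the key is in mapper."""
    out = df.copy()
    mask = out[key].isin(list(mapper))  # True where the ID has a replacement
    mapped = out[key].map(mapper)       # replacement value, NaN where there is no match
    for col in cols:                    # loops over columns only, not rows
        # np.where takes `mapped` where mask is True, else the existing value;
        # the result is float because of the NaN, so cast back to the column's dtype
        out[col] = np.where(mask, mapped, out[col]).astype(df[col].dtype)
    return out

df = pd.DataFrame({'ID': ['a', 'b', 'c', 'd'],
                   'col1': [1, 2, 3, 4],
                   'col2': [5, 6, 7, 8],
                   'col3': [9, 10, 11, 12]})
mapper = {'a': 100, 'd': 3}

result = replace_from_mapper(df, mapper, ["col1", "col3"])
print(result)
```

The cast assumes plain NumPy dtypes such as int64; columns with pandas extension dtypes may need different handling.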
CodePudding user response:
df = df.assign(**{col: df["ID"].map(mapper)
                           .fillna(df[col])
                           .astype(int)
                  for col in ["col1", "col3"]})
Output:
print(df)
  ID  col1  col2  col3
0  a   100     5   100
1  b     2     6    10
2  c     3     7    11
3  d     3     8     3
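To see why the map / fillna / astype chain works, it may help to run the steps one at a time for a single column. This is only an illustrative breakdown, assuming the sample data from the question; the intermediate names `mapped`, `filled` and `as_int` are not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['a', 'b', 'c', 'd'],
                   'col1': [1, 2, 3, 4],
                   'col2': [5, 6, 7, 8],
                   'col3': [9, 10, 11, 12]})
mapper = {'a': 100, 'd': 3}

# Step 1: look up each ID in the dictionary; IDs without an entry become NaN
mapped = df["ID"].map(mapper)       # 100.0, NaN, NaN, 3.0

# Step 2: fill the NaN slots with the existing column values (aligned on the index)
filled = mapped.fillna(df["col1"])  # 100.0, 2.0, 3.0, 3.0

# Step 3: the NaN forced a float dtype, so convert back to integers
as_int = filled.astype(int)         # 100, 2, 3, 3
print(as_int)
```

Since `fillna` aligns on the index, this relies on `mapped` and `df[col]` sharing the same index, which holds here because both come from the same DataFrame.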
CodePudding user response:
map the values from the dict, then update the corresponding columns:
s = df['ID'].map(mapper)
df['col1'].update(s)
df['col3'].update(s)
Result
  ID  col1  col2  col3
0  a   100     5   100
1  b     2     6    10
2  c     3     7    11
3  d     3     8     3
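One caveat: calling update through `df['col1']` relies on the selected column sharing memory with `df`, and with pandas' Copy-on-Write behaviour enabled it may leave `df` unchanged. A minimal sketch that avoids this by assigning the result back explicitly, assuming the sample data from the question (the use of `Series.where` here is a swapped-in alternative, not the answer's method):

```python
import pandas as pd

df = pd.DataFrame({'ID': ['a', 'b', 'c', 'd'],
                   'col1': [1, 2, 3, 4],
                   'col2': [5, 6, 7, 8],
                   'col3': [9, 10, 11, 12]})
mapper = {'a': 100, 'd': 3}

s = df['ID'].map(mapper)  # NaN where the ID is not in mapper
for col in ['col1', 'col3']:
    # Series.where keeps s where it is not NaN and takes df[col] elsewhere;
    # assigning back to df[col] does not depend on in-place mutation
    df[col] = s.where(s.notna(), df[col]).astype(df[col].dtype)
print(df)
```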