Let’s say I have the following Pandas dataframe, where the 'key'
column only contains unique strings:
import pandas as pd
df = pd.DataFrame({'key':['b','d','c','a','e','f'], 'value': [0,0,0,0,0,0]})
df
  key  value
0   b      0
1   d      0
2   c      0
3   a      0
4   e      0
5   f      0
Now I have a list of unique keys and a list of corresponding values:
keys = ['a', 'b', 'c', 'd']
values = [1, 2, 3, 4]
I want to update the 'value' column in the same order as the lists, so that each row has a matching 'key' and 'value' ('a' to 1, 'b' to 2, 'c' to 3, 'd' to 4). I am using the following code, but the dataframe seems to update values from top to bottom, which I don't quite understand:
df.loc[df['key'].isin(keys),'value'] = values
df
  key  value
0   b      1
1   d      2
2   c      3
3   a      4
4   e      0
5   f      0
To be clear, I am expecting to get
  key  value
0   b      2
1   d      4
2   c      3
3   a      1
4   e      0
5   f      0
Any suggestions?
CodePudding user response:
Use map:
dd = dict(zip(keys, values))
df['value'] = df['key'].map(dd).fillna(df['value'])
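As for why the original attempt filled values from top to bottom: the boolean mask from isin selects rows in the dataframe's own order (b, d, c, a), and a plain list on the right-hand side is assigned by position, not by key. Below is a minimal sketch of the map approach using the data from the question. The names mapping, mapped, selected and result are only illustrative, and the astype(int) step is an addition that assumes you want to keep an integer column, since map leaves NaN for unmatched keys and the result would otherwise be float.

```python
import pandas as pd

df = pd.DataFrame({'key': ['b', 'd', 'c', 'a', 'e', 'f'],
                   'value': [0, 0, 0, 0, 0, 0]})
keys = ['a', 'b', 'c', 'd']
values = [1, 2, 3, 4]

# The mask picks rows in the frame's order, which is why a positional
# list assignment lands on b, d, c, a rather than a, b, c, d.
selected = df.loc[df['key'].isin(keys), 'key'].tolist()

# Look each key up instead of relying on position.
mapping = dict(zip(keys, values))
mapped = df['key'].map(mapping)  # NaN where the key is not in mapping

# Keep the old value for unmatched keys; cast back to int (assumption:
# an integer column is wanted).
df['value'] = mapped.fillna(df['value']).astype(int)
result = df['value'].tolist()
```

Without the cast, the 'value' column would hold the same numbers as floats.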
CodePudding user response:
keys = ['a', 'b', 'c', 'd']
values = [1, 2, 3, 4]
# form a dictionary from the keys and values lists
d = dict(zip(keys, values))
# update 'value' only where a mapping exists, using loc and map
df.loc[df['key'].map(d).notna(), 'value'] = df['key'].map(d)
df
  key  value
0   b      2
1   d      4
2   c      3
3   a      1
4   e      0
5   f      0
CodePudding user response:
With a temporary dataframe:
import numpy as np
# index by 'key' so rows can be addressed by label, in the order of keys
temp_df = df.set_index('key')
temp_df.loc[keys] = np.array(values).reshape(-1, 1)
df = temp_df.reset_index()
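A possible variant of the same idea, not the answerer's exact code: if the column is addressed by name, the list can be assigned directly and the reshape is not needed. This sketch assumes, as the question states, that the 'key' column is unique, and that every entry of keys exists in the dataframe (a missing label would raise a KeyError).

```python
import pandas as pd

df = pd.DataFrame({'key': ['b', 'd', 'c', 'a', 'e', 'f'],
                   'value': [0, 0, 0, 0, 0, 0]})
keys = ['a', 'b', 'c', 'd']
values = [1, 2, 3, 4]

# Index by 'key' so rows are selected by label rather than by position.
temp_df = df.set_index('key')

# Label-based selection follows the order of keys, so values[i] is
# written to the row labelled keys[i].
temp_df.loc[keys, 'value'] = values

# Restore 'key' as a regular column; the original row order is kept.
df = temp_df.reset_index()
```

The row order of the result is unchanged because only the selection, not the dataframe itself, is reordered by keys.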