I have the following dataframe
index,name,score,attempts,qualify
a,Anastasia,12.5,1,yes
b,Dima,9.0,3,no
c,Katherine,16.5,2,yes
d,James,NaN,3,no
e,Emily,9.0,2,no
I am trying to use the pandas map function to update the name column, setting it to a test value of 99 wherever the name is either James or Emily.
codes = {'James':'99','Emily':'99'}
dff['name'] = dff['name'].map(codes)
dff
I am getting the following output -
index,name,score,attempts,qualify
a,NaN,12.5,1,yes
b,NaN,9.0,3,no
c,NaN,16.5,2,yes
d,99,NaN,3,no
e,99,9.0,2,no
Note that the name values James and Emily have been updated to 99, but the rest of the name values are mapped to NaN. How can we leave the rows that are not intended to be mapped unchanged?
CodePudding user response:
The issue is that map looks up every value in the 'name' column in the dictionary, not just the ones you specified, and any value that is not a key in the dictionary comes back as NaN. To get around this, you can use the replace method instead:
dff['name'] = dff['name'].replace({'James':'99','Emily':'99'})
This will replace only the specified values and leave the others unchanged.
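To make the difference concrete, here is a small side-by-side sketch. It rebuilds only the name column from the question (the variable names `names`, `mapped` and `replaced` are just for illustration) and runs both methods on it.

```python
import pandas as pd

# Rebuild only the name column from the question.
names = pd.Series(
    ["Anastasia", "Dima", "Katherine", "James", "Emily"],
    index=list("abcde"),
    name="name",
)
codes = {"James": "99", "Emily": "99"}

# map: every value is looked up in the dict; values that are not keys become NaN.
mapped = names.map(codes)

# replace: only values that appear as keys are changed.
replaced = names.replace(codes)

print(mapped.isna().tolist())  # which rows lost their value under map
print(replaced.tolist())
```

Rows a, b and c are the ones that turn into NaN under map, while replace keeps them as they were.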
CodePudding user response:
I believe you may be looking for replace instead of map.
import pandas as pd
names = pd.Series([
"Anastasia",
"Dima",
"Katherine",
"James",
"Emily"
])
names.replace({"James": "99", "Emily": "99"})
# 0 Anastasia
# 1 Dima
# 2 Katherine
# 3 99
# 4 99
# dtype: object
If you're really set on using map, then you have to provide a function that knows how to handle every single name it might encounter.
codes = {"James": "99", "Emily": "99"}
# If the lookup into `codes` fails,
# return the name that was used for lookup
names.map(lambda name: codes.get(name, name))
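Another possibility, if you would rather keep passing a dictionary to map: the pandas documentation for Series.map says that a dict subclass defining `__missing__` supplies its own default instead of NaN. The sketch below relies on that behaviour; `KeepMissing` is a hypothetical helper name I made up, not something pandas provides.

```python
import pandas as pd


class KeepMissing(dict):
    """Hypothetical helper: a dict that hands back the key itself when it is absent."""

    def __missing__(self, key):
        return key


names = pd.Series(["Anastasia", "Dima", "Katherine", "James", "Emily"])
codes = KeepMissing({"James": "99", "Emily": "99"})

# Names that are not keys fall through __missing__ and stay as they are.
result = names.map(codes)
print(result.tolist())
```

This does the same thing as the `codes.get(name, name)` lambda, so which one to use is mostly a matter of taste.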
CodePudding user response:
You're getting NaN because you didn't specify what value to give when a name doesn't match! One option is to use .apply() with a lambda that falls back to the original name:
codes = {'James':'99','Emily':'99'}
dff['name'] = dff['name'].apply(lambda x: codes[x] if x in codes else x)
dff
Or you can use .replace(), as given in @daniel451's answer:
codes = {'James':'99','Emily':'99'}
dff['name'] = dff['name'].replace(codes)
dff
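One thing to watch with apply: when a lambda is applied row-wise over the whole DataFrame, the attribute `.name` on each row is the row's index label, not the value in the name column. The sketch below shows that on a cut-down frame (only two of the question's columns, to keep it short) next to the element-wise version on the column itself.

```python
import pandas as pd

# Cut-down version of the question's frame.
dff = pd.DataFrame(
    {
        "name": ["Anastasia", "Dima", "Katherine", "James", "Emily"],
        "attempts": [1, 3, 2, 3, 2],
    },
    index=list("abcde"),
)
codes = {"James": "99", "Emily": "99"}

# Row-wise apply: row.name is the index label ('a', 'b', ...), not the name column.
labels = dff.apply(lambda row: row.name, axis=1)

# Element-wise apply on the column receives each name directly.
updated = dff["name"].apply(lambda n: codes.get(n, n))

print(labels.tolist())
print(updated.tolist())
```

If you do need the whole row, use `row["name"]` inside the lambda rather than `row.name`.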
CodePudding user response:
codes = {'James':'99',
'Emily':'99'}
dff['name'] = dff['name'].replace(codes)
dff
replace() satisfies the requirement -
index,name,score,attempts,qualify
a,Anastasia,12.5,1,yes
b,Dima,9.0,3,no
c,Katherine,16.5,2,yes
d,99,NaN,3,no
e,99,9.0,2,no
CodePudding user response:
One way to achieve it is to map and then fill the unmapped (NaN) entries back in from the original column:
dff['name'] = dff['name'].map(codes).fillna(dff['name'])
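A small caveat with this pattern, shown as a sketch: fillna cannot tell a name that was missing from the dictionary apart from a name that was deliberately mapped to a missing value. The `None` entry below is an assumption added purely to illustrate that edge case; it is not in the question's data.

```python
import pandas as pd

names = pd.Series(["Anastasia", "Dima", "Katherine", "James", "Emily"])

# Assumed edge case: James is intentionally mapped to a missing value.
codes = {"James": None, "Emily": "99"}

result = names.map(codes).fillna(names)

# James is filled back from the original column, so the intended blank is lost.
print(result.tolist())
```

With the question's dictionary, where every key maps to '99', this does not come up.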
CodePudding user response:
codes = {'James':'99','Emily':'99'}
dff['name'] = dff['name'].map(codes).fillna(dff['name'])
dff
index name score attempts qualify
0 a Anastasia 12.5 1 yes
1 b Dima 9.0 3 no
2 c Katherine 16.5 2 yes
3 d 99 NaN 3 no
4 e 99 9.0 2 no