Ignore Pandas Dataframe indexes which are not intended to be mapped using map function

Time:12-13

I have the following dataframe

index,name,score,attempts,qualify
a,Anastasia,12.5,1,yes
b,Dima,9.0,3,no
c,Katherine,16.5,2,yes
d,James,NaN,3,no
e,Emily,9.0,2,no

I am trying to use the pandas map function to update the name column, replacing the names James and Emily with a test value, 99.

codes = {'James':'99','Emily':'99'}
dff['name'] = dff['name'].map(codes)
dff

I am getting the following output -

index,name,score,attempts,qualify
a,NaN,12.5,1,yes
b,NaN,9.0,3,no
c,NaN,16.5,2,yes
d,99,NaN,3,no
e,99,9.0,2,no

Note that the name values James and Emily have been updated to 99, but the rest of the name values are mapped to NaN. How can we leave alone the rows that are not intended to be mapped?

CodePudding user response:

The issue is that map looks up every value in the 'name' column in the dictionary, and any value that is not a key in the dictionary becomes NaN. To get around this, you can use the replace method instead:

dff['name'] = dff['name'].replace({'James':'99','Emily':'99'})

This will replace only the specified values and leave the others unchanged.
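To make the difference concrete, here is a minimal sketch that runs both calls side by side. The construction of `dff` is an assumption (the question does not show how the frame was built), and only two of its columns are reproduced.

```python
import pandas as pd

# Assumed reconstruction of the question's frame; only the columns needed here.
dff = pd.DataFrame(
    {
        "name": ["Anastasia", "Dima", "Katherine", "James", "Emily"],
        "attempts": [1, 3, 2, 3, 2],
    },
    index=list("abcde"),
)
codes = {"James": "99", "Emily": "99"}

mapped = dff["name"].map(codes)        # names that are not keys in codes become NaN
replaced = dff["name"].replace(codes)  # names that are not keys in codes are kept

print(mapped.isna().sum())  # number of names that map() turned into NaN
print(replaced.tolist())
```

With a dictionary argument, map behaves like a lookup table over the whole column, while replace only touches values that match a key.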

CodePudding user response:

I believe you may be looking for replace instead of map.

import pandas as pd
names = pd.Series([
    "Anastasia",
    "Dima",
    "Katherine",
    "James",
    "Emily"
])

names.replace({"James": "99", "Emily": "99"})

 
# 0    Anastasia
# 1         Dima
# 2    Katherine
# 3           99
# 4           99
# dtype: object

If you're really set on using map, then you have to provide a function that knows how to handle every single name it might encounter.

codes = {"James": "99", "Emily": "99"}

# If the lookup into `codes` fails,
# return the name that was used for the lookup
names.map(lambda name: codes.get(name, name))

CodePudding user response:

You're getting NaN because you didn't specify what value to return when a name doesn't match a key. One option is to use .apply() on the column with a lambda that falls back to the original name.


codes = {'James':'99','Emily':'99'}

# Apply to the 'name' column only; keep the original name when it is not a key in codes
dff['name'] = dff['name'].apply(lambda n: codes[n] if n in codes else n)


dff

Or you can use .replace(), as in @daniel451's answer:


codes = {'James':'99','Emily':'99'}

dff['name'] = dff['name'].replace(codes)

dff

CodePudding user response:

codes = {'James':'99',
         'Emily':'99'}
dff['name'] = dff['name'].replace(codes)
dff

replace() satisfies the requirement -

index,name,score,attempts,qualify
a,Anastasia,12.5,1,yes
b,Dima,9.0,3,no
c,Katherine,16.5,2,yes
d,99,NaN,3,no
e,99,9.0,2,no

CodePudding user response:

Another way to achieve it is to map first, then fill the unmapped (NaN) values back in from the original column:

dff['name'] = dff['name'].map(codes).fillna(dff['name'])
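A self-contained sketch of the same idea on a standalone Series follows. The `names` Series and its a to e index are assumptions made to mirror the question's data.

```python
import pandas as pd

# Assumed stand-in for dff['name'], indexed a..e like the question's frame.
names = pd.Series(
    ["Anastasia", "Dima", "Katherine", "James", "Emily"],
    index=list("abcde"),
)
codes = {"James": "99", "Emily": "99"}

# map() leaves NaN wherever a name is not a key in codes;
# fillna() then restores the original name at those positions (aligned on the index).
result = names.map(codes).fillna(names)

print(result.tolist())
```

One caveat with this pattern: a name that was already missing in the original column stays missing, since fillna has nothing to restore there.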

CodePudding user response:

codes = {'James':'99','Emily':'99'}
dff['name'] = dff['name'].map(codes).fillna(dff['name'])

dff

    index   name        score   attempts    qualify
0   a       Anastasia   12.5    1           yes
1   b       Dima        9.0     3           no
2   c       Katherine   16.5    2           yes
3   d       99          NaN     3           no
4   e       99          9.0     2           no