pandas apply a dictionary mapping to each value in a list

Time:03-04

I'm trying to convert the following DataFrame:

    index                   day          items  size
0     0.0                     1         [8, 9]     2
1     1.0                     2            [1]     1
2     2.0                     3            [4]     1
3     3.0                     4            [3]     1
4     4.0                     5            [4]     1
5     5.0                     6            [4]     1
6     6.0                     7            [4]     1
7     7.0                     8            [4]     1
8     8.0                     9         [3, 8]     2
9     9.0                    10         [3, 8]     2
10   10.0                    11         [3, 5]     2

using a mapping:

mapping_dict = {1: "First", 2: "Second", 3: "Third"}  # and so on

so that each value in the lists in the items column gets its replacement value:

    index  day                items  size
0     0.0    1  ["Eighth", "Ninth"]     2
1     1.0    2            ["First"]     1

The remaining rows follow the same pattern. I tried using apply with a lambda, but the lambda works on the entire list in each row rather than on the individual values.

CodePudding user response:

Use a list comprehension inside the lambda function:

# If there is no match, keep the original value
df['items'] = df['items'].apply(lambda x: [mapping_dict.get(y, y) for y in x])

# If there is no match, return None
df['items'] = df['items'].apply(lambda x: [mapping_dict.get(y) for y in x])

# If there is no match, drop the value
df['items'] = df['items'].apply(lambda x: [mapping_dict[y] for y in x if y in mapping_dict])

Sample:

mapping_dict = {1: "First", 2: "Second", 3: "Third"}

df['items1'] = df['items'].apply(lambda x: [mapping_dict.get(y, y) for y in x])
df['items2'] = df['items'].apply(lambda x: [mapping_dict.get(y) for y in x])
df['items3'] = df['items'].apply(lambda x: [mapping_dict[y] for y in x if y in mapping_dict])
print(df)
   index  day   items  size       items1          items2    items3
0    0.0    1  [8, 2]     2  [8, Second]  [None, Second]  [Second]
1    1.0    2     [1]     1      [First]         [First]   [First]
2    2.0    3     [4]     1          [4]          [None]        []
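
The sample above does not show how df was created. A minimal reproducible sketch, assuming the frame is built directly from the three sample rows (the constructor call is an assumption, not part of the original answer), could look like this:

```python
import pandas as pd

# Assumed reconstruction of the three sample rows; the original post
# does not show how df was built.
df = pd.DataFrame({
    "index": [0.0, 1.0, 2.0],
    "day": [1, 2, 3],
    "items": [[8, 2], [1], [4]],
    "size": [2, 1, 1],
})

mapping_dict = {1: "First", 2: "Second", 3: "Third"}

# Keep unmatched values unchanged.
df["items1"] = df["items"].apply(lambda x: [mapping_dict.get(y, y) for y in x])
# Replace unmatched values with None.
df["items2"] = df["items"].apply(lambda x: [mapping_dict.get(y) for y in x])
# Drop unmatched values entirely.
df["items3"] = df["items"].apply(lambda x: [mapping_dict[y] for y in x if y in mapping_dict])

print(df)
```

Running it should print the same three result columns as the table above.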

CodePudding user response:

You can use a nested list comprehension, which avoids the per-row function call of apply and is typically the fastest option here:

df['items'] = [[mapping_dict.get(e, e) for e in l] for l in df['items']]

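or, less efficiently, explode the lists, map the values and regroup (note that map leaves NaN for values missing from mapping_dict):

(df.explode('items')
   .assign(items=lambda d: d['items'].map(mapping_dict))
   .groupby(level=0)
   # keys are listed in the original column order so the result keeps it
   .agg({'index': 'first', 'day': 'first', 'items': list, 'size': 'first'})
)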

Output (with mapping_dict extended to cover every value):

    index  day            items  size
0     0.0    1  [Eighth, Ninth]     2
1     1.0    2          [First]     1
2     2.0    3         [Fourth]     1
3     3.0    4          [Third]     1
4     4.0    5         [Fourth]     1
5     5.0    6         [Fourth]     1
6     6.0    7         [Fourth]     1
7     7.0    8         [Fourth]     1
8     8.0    9  [Third, Eighth]     2
9     9.0   10  [Third, Eighth]     2
10   10.0   11   [Third, Fifth]     2
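
The two approaches only agree when every value has an entry in mapping_dict. The sketch below illustrates the difference on a hypothetical two-row frame (the data and the variable names via_comprehension and via_explode are made up for illustration): the comprehension with .get(e, e) passes unmatched values through, while explode followed by map turns them into NaN.

```python
import pandas as pd

# Hypothetical frame; 99 is deliberately absent from the mapping.
df = pd.DataFrame({"day": [1, 2], "items": [[1, 99], [3]]})
mapping_dict = {1: "First", 2: "Second", 3: "Third"}

# Nested list comprehension: unmatched values are kept as they are.
via_comprehension = [[mapping_dict.get(e, e) for e in l] for l in df["items"]]

# explode + map + regroup: Series.map with a dict gives NaN for missing keys.
via_explode = (
    df["items"]
    .explode()
    .map(mapping_dict)
    .groupby(level=0)
    .agg(list)
)

print(via_comprehension)
print(via_explode.tolist())
```

If unmatched values should survive the explode route as well, one option is to follow the map call with fillna on the exploded original values, at the cost of an extra pass over the data.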