Home > OS >  Apply dictionary to column with multiple values
Apply dictionary to column with multiple values

Time:09-17

Is it possible to apply a dictionary to a column with multiple values?

    Unit    Code
0   243     A
1   106     B
2   678     C
3   852     A, B
4   239     A, C
5   217     A, B, C
6   642     D
7   932     E, F, G, H
8   213     A, C, E, H

If my dictionary is like the following...

    dictex = {
        'A' : 'Alpha',
        'B' : 'Bravo',
        'C' : 'Charlie',
        'D' : 'Delta',
        'E' : 'Echo',
        'F' : 'Foxtrot',
        'G' : 'Golf',
        'H' : 'Hotel',
        None : 'NULL'
    }

After I apply it...

df['Phonetic'] = df['Code'].map(dictex) 

I get the following (which is not what I want).

    Unit    Code         Phonetic
0   243     A            Alpha
1   106     B            Bravo
2   678     C            Charlie
3   852     A, B         NaN
4   239     A, C         NaN
5   217     A, B, C      NaN
6   642     D            Delta
7   932     E, F, G, H   NaN
8   213     A, C, E, H   NaN

What I want is the following.

    Unit    Code        Phonetic
0   243     A           Alpha
1   106     B           Bravo
2   678     C           Charlie
3   852     A, B        Alpha, Bravo
4   239     A, C        Alpha, Charlie
5   217     A, B, C     Alpha, Bravo, Charlie
6   642     D           Delta
7   932     E, F, G, H  Echo, Foxtrot, Golf, Hotel
8   213     A, C, E, H  Alpha, Charlie, Echo, Hotel

How can I map the dictionary onto each value within the column?

CodePudding user response:

Try:

df["Phonetic"] = df["Code"].apply(
    lambda x: ", ".join(dictex.get(v) for v in map(str.strip, x.split(",")))
)
print(df)

Prints:

   Unit        Code                     Phonetic
0   243           A                        Alpha
1   106           B                        Bravo
2   678           C                      Charlie
3   852        A, B                 Alpha, Bravo
4   239        A, C               Alpha, Charlie
5   217     A, B, C        Alpha, Bravo, Charlie
6   642           D                        Delta
7   932  E, F, G, H   Echo, Foxtrot, Golf, Hotel
8   213  A, C, E, H  Alpha, Charlie, Echo, Hotel

CodePudding user response:

You need to split the data before mapping:

df['Phonetic'] = (df['Code'].str.split(', ')
                     .explode().map(dictex)
                     .groupby(level=0).agg(', '.join)
                 )

Or similarly with str.extractall:

df['Phonetic'] = (df['Code'].str.extractall('([^\s,] )')
                  .replace(dictex)
                  .groupby(level=0).agg(', '.join)
                 )

Output:

   Unit        Code                     Phonetic
0   243           A                        Alpha
1   106           B                        Bravo
2   678           C                      Charlie
3   852        A, B                 Alpha, Bravo
4   239        A, C               Alpha, Charlie
5   217     A, B, C        Alpha, Bravo, Charlie
6   642           D                        Delta
7   932  E, F, G, H   Echo, Foxtrot, Golf, Hotel
8   213  A, C, E, H  Alpha, Charlie, Echo, Hotel

CodePudding user response:

Let us do get_dummies

s = df.Code.str.get_dummies(', ').rename(dictex,axis=1)
df['Phonetic'] = s.dot(s.columns ',').str[:-1]
df
Out[68]: 
      Code             Phonetic
0        A                Alpha
1        B                Bravo
2        C              Charlie
3     A, B          Alpha,Bravo
4  A, B, C  Alpha,Bravo,Charlie
  • Related