Is it possible to apply a dictionary to a column with multiple values?
Unit Code
0 243 A
1 106 B
2 678 C
3 852 A, B
4 239 A, C
5 217 A, B, C
6 642 D
7 932 E, F, G, H
8 213 A, C, E, H
If my dictionary is like the following...
dictex = {
'A' : 'Alpha',
'B' : 'Bravo',
'C' : 'Charlie',
'D' : 'Delta',
'E' : 'Echo',
'F' : 'Foxtrot',
'G' : 'Golf',
'H' : 'Hotel',
None : 'NULL'
}
After I apply it...
df['Phonetic'] = df['Code'].map(dictex)
I get the following (which is not what I want).
Unit Code Phonetic
0 243 A Alpha
1 106 B Bravo
2 678 C Charlie
3 852 A, B NaN
4 239 A, C NaN
5 217 A, B, C NaN
6 642 D Delta
7 932 E, F, G, H NaN
8 213 A, C, E, H NaN
What I want is the following.
Unit Code Phonetic
0 243 A Alpha
1 106 B Bravo
2 678 C Charlie
3 852 A, B Alpha, Bravo
4 239 A, C Alpha, Charlie
5 217 A, B, C Alpha, Bravo, Charlie
6 642 D Delta
7 932 E, F, G, H Echo, Foxtrot, Golf, Hotel
8 213 A, C, E, H Alpha, Charlie, Echo, Hotel
How can I map the dictionary onto each value within the column?
CodePudding user response:
Try:
df["Phonetic"] = df["Code"].apply(
lambda x: ", ".join(dictex.get(v) for v in map(str.strip, x.split(",")))
)
print(df)
Prints:
Unit Code Phonetic
0 243 A Alpha
1 106 B Bravo
2 678 C Charlie
3 852 A, B Alpha, Bravo
4 239 A, C Alpha, Charlie
5 217 A, B, C Alpha, Bravo, Charlie
6 642 D Delta
7 932 E, F, G, H Echo, Foxtrot, Golf, Hotel
8 213 A, C, E, H Alpha, Charlie, Echo, Hotel
CodePudding user response:
You need to split the data before mapping:
df['Phonetic'] = (df['Code'].str.split(', ')
.explode().map(dictex)
.groupby(level=0).agg(', '.join)
)
Or similarly with str.extractall
:
df['Phonetic'] = (df['Code'].str.extractall('([^\s,] )')
.replace(dictex)
.groupby(level=0).agg(', '.join)
)
Output:
Unit Code Phonetic
0 243 A Alpha
1 106 B Bravo
2 678 C Charlie
3 852 A, B Alpha, Bravo
4 239 A, C Alpha, Charlie
5 217 A, B, C Alpha, Bravo, Charlie
6 642 D Delta
7 932 E, F, G, H Echo, Foxtrot, Golf, Hotel
8 213 A, C, E, H Alpha, Charlie, Echo, Hotel
CodePudding user response:
Let us do get_dummies
s = df.Code.str.get_dummies(', ').rename(dictex,axis=1)
df['Phonetic'] = s.dot(s.columns ',').str[:-1]
df
Out[68]:
Code Phonetic
0 A Alpha
1 B Bravo
2 C Charlie
3 A, B Alpha,Bravo
4 A, B, C Alpha,Bravo,Charlie