I have a dataset where, whenever I see a certain word that contains specific words, I'd like to match specific values to within a new column.
Data
id status
see-dd-2333 y
see-dd-aaaaa y
sal-led-sss y
sal-led-sss n
dis-dd-red n
Desired
id status pw
see-dd-2333 y 14
see-dd-aaaaa y 14
sal-led-sss y 8
sal-led-sss n 8
dis-dd-red n 5
Doing
I am thinking I can use a dictionary. Whenever I see a pattern of 'see-dd', I'd like to supply the numerical value of 14. When I see a word that contains 'sal-led' I wish to supply the 8 numerical value. Whenever I see 'dis-dd' I would like to match this with the value of 5.
out= {
'see-dd': 14,
'sal-led': 8,
}
Any suggestion is appreciated.
CodePudding user response:
The simplest would be to use the replace
method. As the doc noted:
This method has a lot of options. You are encouraged to experiment and play with this method to gain intuition about how it works.
df['id'].replace(regex=out)
0 14
1 14
2 8
3 8
4 5
Name: id, dtype: int64
with out
as:
out= {
'see-dd': 14,
'sal-led': 8,
'dis-dd': 5
}
df['pw'] = df['id'].replace(regex=out)
df
id status pw
0 see-dd-2333 y 14
1 see-dd-aaaaa y 14
2 sal-led-sss y 8
3 sal-led-sss n 8
4 dis-dd-red n 5
CodePudding user response:
You can also use:
df['pw'] = df['id'].str.rsplit('-', 1).str.get(0).map(out)
Output:
id status pw
0 see-dd-2333 y 14
1 see-dd-aaaaa y 14
2 sal-led-sss y 8
3 sal-led-sss n 8
4 dis-dd-red n 5