I have a dataset with an id column. Whenever an id contains a specific substring, I'd like to put a corresponding value in a new column.
Data
id status
see-dd-23aaaa33_00 y
see-dd-aaaaa_o00 y
sal-led-sss_0 y
sal-led-sss.AA n
dis-dd-red_0 n
Desired
id status pw
see-dd-2333 y 14
see-dd-aaaaa y 14
sal-led-sss y 8
sal-led-sss n 8
dis-dd-red n 5
Doing
I am thinking I can use a dictionary. When an id contains 'see-dd', I'd like to supply the value 14; when it contains 'sal-led', the value 8; and when it contains 'dis-dd', the value 5.
out = {
    'see-dd': 14,
    'sal-led': 8,
    'dis-dd': 5,
}
Any suggestion is appreciated.
CodePudding user response:
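Try replace with regex=True. Because the replacement values are numbers rather than strings, pandas replaces the whole cell whenever a key matches anywhere in the id: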
out = {
    'see-dd': 14,
    'sal-led': 8,
    'dis-dd': 5,
}
df['new'] = df['id'].replace(out, regex=True)
df
                   id status  new
0  see-dd-23aaaa33_00      y   14
1    see-dd-aaaaa_o00      y   14
2       sal-led-sss_0      y    8
3      sal-led-sss.AA      n    8
4        dis-dd-red_0      n    5
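One thing to keep in mind with replace is that ids matching none of the keys are left as their original strings. If you would rather get NaN for those rows, a possible alternative is to extract the prefix and map it through the dictionary. This is only a sketch: the sample frame is rebuilt from the question, and the last id ('xyz-00') is a made-up row added to show the non-matching case.

```python
import re
import pandas as pd

# Sample data from the question, plus one hypothetical non-matching id.
df = pd.DataFrame({
    "id": ["see-dd-23aaaa33_00", "see-dd-aaaaa_o00", "sal-led-sss_0",
           "sal-led-sss.AA", "dis-dd-red_0", "xyz-00"],
    "status": ["y", "y", "y", "n", "n", "n"],
})

out = {"see-dd": 14, "sal-led": 8, "dis-dd": 5}

# One anchored alternation of all keys; re.escape keeps each key literal.
pattern = "^(" + "|".join(re.escape(key) for key in out) + ")"

# str.extract returns the matched prefix (NaN when nothing matches),
# and map looks that prefix up in the dictionary.
df["pw"] = df["id"].str.extract(pattern, expand=False).map(out)
print(df)
```

Because of the NaN in the unmatched row, the new column comes back as floats; you can fill or cast it afterwards if you need integers.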
CodePudding user response:
You can apply a matcher function to the 'id' column of the data frame. It checks each id against the keys of the pattern dictionary 'out' and returns the value of the first key it finds.
out = {'see-dd': 14, 'sal-led': 8, 'dis-dd': 5}

def matcher(row_data):
    # Return the value of the first key found anywhere in the id
    for key, val in out.items():
        if key in row_data:
            return val
    return None  # no key matched

# This will create a new column 'pw' using your 'out' patterns and values
df['pw'] = df['id'].apply(matcher)
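Note that 'key in row_data' matches the key anywhere in the id, not only at the start. If your keys are always prefixes, a stricter variant might look like the sketch below. The names match_prefix and default are my own, not part of the answer above, and the data frame is rebuilt from the question so the snippet runs on its own.

```python
import pandas as pd

out = {"see-dd": 14, "sal-led": 8, "dis-dd": 5}

def match_prefix(value, default=None):
    """Return the value of the first key that `value` starts with."""
    for key, val in out.items():
        if value.startswith(key):
            return val
    return default  # hypothetical fallback when no key matches

# Sample data copied from the question.
df = pd.DataFrame({
    "id": ["see-dd-23aaaa33_00", "see-dd-aaaaa_o00", "sal-led-sss_0",
           "sal-led-sss.AA", "dis-dd-red_0"],
    "status": ["y", "y", "y", "n", "n"],
})

df["pw"] = df["id"].apply(match_prefix)
print(df)
```

With this version an id such as 'old-see-dd-1' is not matched, whereas the substring check would return 14 for it. Pick whichever behaviour fits your ids.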