Add a dot after the 3rd character of a string using regex in pandas

Time:07-16

Here is a sample of the output I got from running this code:

df['formatted_codes']=df['dx_code'].str.replace(r'(^\w{3}(?!$))',r'\1.',regex=True)

dx_id dx_code formatted_codes
1 A00 A00.
2 A000 A00.0
3 A001 A00.1
4 A009 A00.9
5 A01 A01.
6 S92113 S92.113
7 S92113D S92.113D

But I want the '.' added only to codes longer than 3 characters. The output I want looks like this:


dx_id dx_code formatted_codes
1 A00 A00
2 A000 A00.0
3 A001 A00.1
4 A009 A00.9
5 A01 A01
6 S92113 S92.113
7 S92113D S92.113D

If anyone can help me adjust the regex, that would be helpful. If there is another way to add the '.' at the desired position, please tell me.

CodePudding user response:

Use str.rstrip to remove trailing dots from the formatted_codes column:

df["formatted_codes"] = df["formatted_codes"].str.rstrip('.')
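As a minimal sketch of this two-step approach, the snippet below rebuilds the sample column from the question, inserts a dot after the first three characters unconditionally, and then strips any trailing dot. The unconditional pattern `^(\w{3})` is an assumption chosen for illustration, not the asker's original regex. Note that `str.rstrip('.')` removes every trailing dot, so it would also alter codes that legitimately end with one.

```python
import pandas as pd

# Sample codes copied from the question
df = pd.DataFrame(
    {"dx_code": ["A00", "A000", "A001", "A009", "A01", "S92113", "S92113D"]}
)

# Step 1 (illustrative pattern): always add a dot after the first three chars
df["formatted_codes"] = df["dx_code"].str.replace(r"^(\w{3})", r"\1.", regex=True)

# Step 2: drop the dot again where nothing follows it
df["formatted_codes"] = df["formatted_codes"].str.rstrip(".")

print(df)
```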

CodePudding user response:

You need to use

df['formatted_codes']=df['dx_code'].str.replace(r'^\w{3}(?!$)', r'\g<0>.', regex=True)

The ^\w{3}(?!$) regex matches three word chars at the start of the string that are not immediately followed by the end of the string, and replaces the match with the same text plus a dot char. The \g<0> backreference refers to the whole match value, so no extra capturing group around the whole pattern is needed. The ^ anchor matters: without it, later runs of three word chars would match too, and S92113D would become S92.113.D.
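A quick way to check the behaviour is with Python's standard `re` module on the sample codes from the question. This sketch assumes the default pandas string handling, where `str.replace(..., regex=True)` follows Python `re` semantics. The helper `add_dot` and the variable names are hypothetical, introduced only for this comparison of an anchored and an unanchored pattern.

```python
import re

# Sample codes copied from the question
codes = ["A00", "A000", "A001", "A009", "A01", "S92113", "S92113D"]

anchored = re.compile(r"^\w{3}(?!$)")   # only the first three chars can match
unanchored = re.compile(r"\w{3}(?!$)")  # any run of three chars not at the end


def add_dot(pattern, code):
    # \g<0> re-inserts the whole match, then a dot is appended
    return pattern.sub(r"\g<0>.", code)


anchored_result = [add_dot(anchored, c) for c in codes]
unanchored_result = [add_dot(unanchored, c) for c in codes]

print(anchored_result)
print(unanchored_result)
```

If the real data still ends up with a trailing dot on three-character codes, one possible cause is trailing whitespace in dx_code, since the lookahead then no longer sees the end of the string. Applying `.str.strip()` before the replacement would rule that out.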
