Unable to use the upper method with regex sub

Time:11-05

The program should accept a text and replace the actual names of certain people with their initials. It accepts two inputs:

  1. The text to censor, as string1 (example):

Doctor Sara amity met agent Binod Rastogi on 12th March 1999. Agent smike smith and doctor amy wills along with agent Tom edwards were also in the secret meeting.

  2. The title of the people to hide, as cens_element:

Doctor / Agent (case-insensitive)

import re

string1 = input('Enter the text to censor the names in:')
print('=========================')
print('\n whose names do you want to censor?')
cens_element = input()

regobj = re.compile(f'{cens_element} (\w)\w* (\w)\w*', re.IGNORECASE) 
final = regobj.sub('%s ' % (cens_element) + (r'\1.'.upper()) + ' ' + (r'\2.'.upper()), string1)
# \1 and \2 refer to the first and second groups, (\w), which capture the name initials.
print('\n')
print('='*len(cens_element))
print('\n')
print(final)

The upper() calls in the assignment to the 'final' variable don't seem to work: they don't capitalize the initials.

The output (cens_element = doctor):

doctor S. a. met agent Binod Rastogi on 12th March 1999. Agent smike smith and doctor a. w. along with agent Tom edwards were also in the secret meeting.

Expected output:

doctor S. A. met agent Binod Rastogi on 12th March 1999. Agent smike smith and doctor A. W. along with agent Tom edwards were also in the secret meeting.

How do I get the desired output?

CodePudding user response:

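The problem is that .upper() is applied to the replacement template itself (for example r'\1.') before any substitution occurs. The template contains no letters, so .upper() returns it unchanged, and sub later fills in the captured text exactly as it was matched.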
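
A quick way to see this is the minimal sketch below. The sample string 'doctor amy' is made up for illustration and is not taken from the question's input.

```python
import re

# The replacement template is plain text: a backslash, a digit and a dot.
template = r'\1.'

# Upper-casing it changes nothing, because it contains no letters.
assert template.upper() == template

# re.sub expands \1 only afterwards, inserting the captured text unchanged.
result = re.sub(r'doctor (\w)\w*', 'doctor ' + template.upper(), 'doctor amy')
print(result)  # doctor a.
```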

For this kind of formatting, pass a callback function to the sub method instead of a string. The function receives each match object and returns the replacement text, so it can transform the captured groups, for example by upper-casing them.

Example:

import re

text = """
Doctor Sara Amity met agent Binod Rastogi on 12th March 1999. 
Agent smike smith and doctor amy wills along with agent Tom edwards were also in the secret meeting.
"""
title = 'Doctor'

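# Raw f-string so that \w reaches the regex engine without an invalid-escape warning.
ptrn = re.compile(rf'{title} (\w)\w* (\w)\w*', re.IGNORECASE)

def replfunc(matchobj):
    # Called once per match; the groups hold the first letter of each name.
    first, last = matchobj.groups()
    return f'{title} {first.upper()}. {last.upper()}.'

# Passing a function as the replacement lets us upper-case the captured text.
res = ptrn.sub(replfunc, text)
print(res)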

The result:

Doctor S. A. met agent Binod Rastogi on 12th March 1999.
Agent smike smith and Doctor A. W. along with agent Tom edwards were also in the secret meeting.
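
Note that the callback above always writes the title as it was typed in, so 'doctor amy wills' becomes 'Doctor A. W.'. If you would rather keep each title as it appears in the text, one possible variation is to capture the title as well. The sketch below is only an illustration: the function name censor_names and the shortened sample sentence are hypothetical, and re.escape is added on the assumption that the title comes from user input.

```python
import re

def censor_names(text, title):
    """Replace '<title> First Last' with '<title> F. L.', keeping the title as written."""
    # re.escape guards against regex metacharacters in user-supplied input.
    pattern = re.compile(rf'({re.escape(title)}) (\w)\w* (\w)\w*', re.IGNORECASE)

    def to_initials(match):
        # Group 1 is the title as found in the text; groups 2 and 3 are the initials.
        found_title, first, last = match.groups()
        return f'{found_title} {first.upper()}. {last.upper()}.'

    return pattern.sub(to_initials, text)

# Hypothetical, shortened sample text.
sample = 'Doctor Sara amity met agent Binod Rastogi. Agent smike smith and doctor amy wills were there.'
print(censor_names(sample, 'doctor'))
# Doctor S. A. met agent Binod Rastogi. Agent smike smith and doctor A. W. were there.
```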
