How to get continuous repetition of a pattern in a string as a single element

Time:08-05

I am trying to collapse multiple consecutive repetitions of a pattern into a single occurrence. Suppose the following is the text, with user as the pattern:

user user please sort situation time bound manner present officer ha moral courage to interest of public of expect political leadership order always absolutely right thing http

Here I want each such run of user (in this case user user) to be replaced with a single user. The condition is that the repetitions of user must be adjacent to each other.

For example, if the sentence were: "user user user please user user sort situation time bound manner present officer ha moral courage to interest of public of expect political leadership order always absolutely right thing http", I want user user user and user user each to be replaced with user.

What I have come up with so far is this:

re.findall(r'[user\s]+', text)

I know that for the replacement itself we will use re.sub.

The output that I am getting is:

['user user user ',
 'e',
 'se user user s',
 'r',
 ' ',
 'u',
 ' ',
 ' ',
 ' s',
 'u',
 ' ',
 'e ',
 'u',
 ' ',
 'er ',
 'rese',
 ' ',
 'er ',
 ' ',
 'r',
 ' ',
 'ur',
 'e ',
 ' ',
 'eres',
 ' ',
 ' ',
 'u',
 ' ',
 ' ',
 ' e',
 'e',
 ' ',
 ' ',
 'e',
 'ers',
 ' ',
 'r',
 'er ',
 's ',
 's',
 'u',
 'e',
 ' r',
 ' ',
 ' ']

So I only want the first and the third elements to be found, and the third element should be user user instead of se user user s.

When you answer, could you please explain how the expression works? I am very new to regex.

CodePudding user response:

I hope I've understood your question correctly. This will shorten user user to user (and works for longer runs too):

import re

s = "user user user please user user sort situation time bound manner present officer ha moral courage to interest of public of expect political leadership order always absolutely right thing http"

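# Match two or more adjacent copies of " user" (group 1) or "user " (group 2)
# and replace the whole run with one copy, keeping the space on the same side.
s = re.sub(
    r"(?:(\suser\b)|(\buser\s)){2,}",
    lambda g: " user" if g.group(1) else "user ",
    s,
)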
print(s)

Prints:

user please user sort situation time bound manner present officer ha moral courage to interest of public of expect political leadership order always absolutely right thing http
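
Since the question asks how the expression works, here is a short explanation and a sketch. In the pattern above, `(?:...){2,}` means "two or more repetitions in a row", and each repetition is either `\suser\b` (whitespace followed by the whole word user) or `\buser\s` (the whole word user followed by whitespace). The lambda checks which form matched and puts the single space back on that side. The original attempt behaves differently because square brackets define a character class: `[user\s]+` matches any run of the individual characters u, s, e, r, or whitespace, not the word user. The block below demonstrates this on a shortened sample and also shows a backreference-based alternative; that alternative and the variable names are my own illustration rather than part of the answer above.

```python
import re

# Shortened version of the sample sentence from the question.
text = "user user user please user user sort situation time bound manner"

# A character class matches single characters, not the word "user":
# [user\s]+ grabs any run made up of u, s, e, r, or whitespace.
char_class_hits = re.findall(r"[user\s]+", text)

# Word-level alternative (an assumption, not the pattern used above):
#   \b(user)       the whole word "user", captured as group 1
#   (?:\s+\1\b)+   one or more further copies, each preceded by whitespace
# The whole run is replaced with group 1, i.e. a single "user".
collapsed = re.sub(r"\b(user)(?:\s+\1\b)+", r"\1", text)

print(char_class_hits[:3])
print(collapsed)
```

The first print shows the same leading fragments as in the question ('user user user ', 'e', 'se user user s'), and the second shows each adjacent run reduced to one user. Replacing `user` inside the capture group with `\w+` should extend the same idea to any repeated word, though that goes beyond what was asked.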