Regex to match repeating 3 digit groups that aren't made up of 3 identical digits

Time:12-29

I am trying to match repeating 3-digit groups that appear in a UK phone number. I can already match groups whose 3 digits are identical with this pattern: r'(\d)\1{2}'

E.g. when input is "07119777777" I get two matches:

<re.Match object; span=(5, 8), match='777'>
<re.Match object; span=(8, 11), match='777'>

However, when the input is something like "07123123123" I get no matches, because the digits inside each 3-digit group are different. Is there a regex pattern that identifies these groups as matches?

CodePudding user response:

Would you please try the following:

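import re

str = "07590759759"
# (\d{3}) captures a 3-digit group; each \1 requires the same digits again later in the string
m = re.search(r'(\d{3}).*?(\1).*?(\1)', str)
print(m.groups())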

Output:

('759', '759', '759')
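
For the original requirement (groups that are not made up of three identical digits), one option might be a negative lookahead in front of the capturing group. This is a minimal sketch checked only against the inputs in this thread; the helper name find_repeated_group is made up for illustration, and like the pattern above it allows other digits between the repetitions.

```python
import re

# (?!(\d)\1\1) rejects a group such as "777" before it is captured.
# The lookahead contains group 1, so the 3-digit group is group 2 here.
PATTERN = re.compile(r'(?!(\d)\1\1)(\d{3}).*?\2.*?\2')

def find_repeated_group(number):
    """Return the first 3-digit group that occurs three times, or None."""
    m = PATTERN.search(number)
    return m.group(2) if m else None

print(find_repeated_group("07123123123"))
print(find_repeated_group("07119777777"))
```

With the inputs from the question, the first call should give '123' and the second should give None, since every group that repeats in "07119777777" is '777'.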

[Update]
Answering your additional question, I suspect it is not possible with a single regex. (It may be, but I have not figured out how so far.)
Here is an approach:

import re

str = "07590759759012759"
m = re.search(r'(\d{3})((?:.*?\1)+)', str)
m1, m2 = m.groups()                     # m1: first match, m2: remaining sequence of matches
m3 = re.findall(m1, m2)                 # extract multiple m1's out of m2
print([m1] + m3)                        # concatenate them as a list

Output:

['075', '075']

The output may not be what you expect, because the pattern above accepts any group that occurs two or more times, and '075' is the first one that does. If you want to require three or more occurrences, change the re.search() line to:

m = re.search(r'(\d{3})((?:.*?\1){2,})', str)

Then the output will be:

['759', '759', '759', '759']
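
The two steps above could be wrapped into a small function. This is only a sketch: the name repeated_groups and the min_repeats parameter are hypothetical, and it returns an empty list when re.search() finds nothing (the snippets above would raise an AttributeError on m.groups() in that case).

```python
import re

def repeated_groups(number, min_repeats=2):
    """Return every occurrence of the first 3-digit group that is
    followed by at least min_repeats further copies of itself."""
    # min_repeats counts the copies AFTER the first occurrence,
    # so min_repeats=2 means three occurrences in total.
    pattern = r'(\d{3})((?:.*?\1){%d,})' % min_repeats
    m = re.search(pattern, number)
    if m is None:
        return []
    first, rest = m.groups()
    # Count the remaining copies inside the tail captured by group 2.
    return [first] + re.findall(re.escape(first), rest)

print(repeated_groups("07590759759012759"))
print(repeated_groups("07590759759012759", min_repeats=1))
```

The first call corresponds to the {2,} pattern and the second to the + pattern used earlier.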