Regex to get all possible repeated substrings

Time:12-22

Given the string "aaaa", I need to find every repeated substring, including overlapping ones: 'aa', 'aa', 'aa', 'aaa', 'aaa', 'aaaa'.

I tried re.findall(r'(.)\1{1,}', "aaaa"), but all I get back is ['a'].

CodePudding user response:

Not a regular expression, but I think a nested list comprehension like this should do the trick. You can change MIN_LENGTH and MAX_LENGTH if you want substrings of different lengths.

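test_str = "aaaa"
# Shortest and longest substring lengths to collect
MIN_LENGTH, MAX_LENGTH = 2, len(test_str)

# For each length, slide a window of that size across the string
substrings = [test_str[i:i + length] for length in range(MIN_LENGTH, MAX_LENGTH + 1)
                                     for i in range(len(test_str) - length + 1)]
print(substrings)  # ['aa', 'aa', 'aa', 'aaa', 'aaa', 'aaaa']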
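
If you do want a regex, here is a rough sketch rather than a definitive answer. The original attempt returns only 'a' because re.findall returns the contents of the capture group when the pattern has one, and because matches cannot overlap. Putting the whole pattern inside a zero-width lookahead allows overlapping matches, and looping over the length picks up the shorter runs too. The function name repeated_char_runs and its min_length parameter are made up for this example. Note that, unlike the list comprehension above, this only returns runs of a single repeated character.

```python
import re

def repeated_char_runs(text, min_length=2):
    """Return every overlapping run of one repeated character, shortest first.

    Hypothetical helper for illustration; not part of the re module.
    """
    runs = []
    for length in range(min_length, len(text) + 1):
        # (?=...) is a zero-width lookahead, so matches are allowed to overlap.
        # (.) captures one character (group 2); \2{length-1} requires
        # length-1 further copies of it, giving a run of exactly `length`.
        pattern = re.compile(r"(?=((.)\2{%d}))" % (length - 1))
        runs.extend(m.group(1) for m in pattern.finditer(text))
    return runs

print(repeated_char_runs("aaaa"))  # ['aa', 'aa', 'aa', 'aaa', 'aaa', 'aaaa']
```

For a mixed string such as "aabbb" this gives ['aa', 'bb', 'bb', 'bbb'], whereas the list comprehension would also return substrings like 'ab' that contain no repetition.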