Regex with no repeated characters

Time:07-04

I'm trying to write a regex that matches, in one single match, a run of a, b or c characters in which no character is immediately repeated.

I did this: ((a|b|c)(?!\2))

Here is the regex101 example: https://regex101.com/r/yJwHOQ/1

It works fine for:

  • a or b or c (single character)
  • bcabca

Now I want to improve it so that it also matches the first character of a repeated pair. For example, for bcaa I want to match bca instead of only bc (in a single match).

Is there a way to improve my regex so it can match this case also?

Thanks in advance for your help.

PS: This is for PCRE, but it would be nice if it also worked in Python.

Answer:

A lookahead is a zero-length assertion. It does not consume any characters; it only tests a condition at a position, e.g. between b and c.
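To see the zero-length behaviour, here is a small Python sketch using the standard re module. The sample string abc and the variable names are only illustrative and are not taken from the question.

```python
import re

# The lookahead (?=c) checks that a "c" follows, but does not consume it:
# only "b" ends up in the match.
consumed = re.search(r"b(?=c)", "abc")
print(consumed.group(), consumed.span())  # b (1, 2)

# A bare lookahead matches the empty string at a position,
# here the position between "b" and "c".
empty = re.search(r"(?=c)", "abc")
print(repr(empty.group()), empty.span())  # '' (2, 2)
```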

For example, in abb the first b is followed by the same character, b. The negative lookahead rejects that b, so only a gets matched. To also match the character at which the lookahead failed, one idea is to add another [abc] after the repeated group:

^(?:([abc])(?!\1))*[abc]

See this demo at regex101
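Since the question also asks about Python, here is a minimal sketch that runs this pattern with the standard re module and compares it with a lookahead-only version. The names lookahead_only, with_extra_char and leading_run are made up for this illustration.

```python
import re

# Lookahead-only version: stops before the first character of a repeated pair.
lookahead_only = re.compile(r"^(?:([abc])(?!\1))*")

# Pattern from this answer: the trailing [abc] picks up that character as well.
with_extra_char = re.compile(r"^(?:([abc])(?!\1))*[abc]")


def leading_run(text):
    """Return the match at the start of text, or None if there is none."""
    m = with_extra_char.match(text)
    return m.group() if m else None


for s in ["a", "bcabca", "bcaa", "abb"]:
    print(s, "->", repr(lookahead_only.match(s).group()), "vs", repr(leading_run(s)))
# a -> '' vs 'a'          (the lookahead-only version backtracks to an empty match? no: see note)
```

Note that for the single character a the lookahead-only version already matches 'a', because the negative lookahead succeeds at the end of the string; the comment in the loop is replaced by the actual results: a gives 'a' and 'a', bcabca gives 'bcabca' and 'bcabca', bcaa gives 'bc' and 'bca', abb gives 'a' and 'ab'.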

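Note: In PCRE, a subroutine call can be used to run a group's pattern again, e.g. (?1) for ([abc]), instead of writing [abc] a second time. Python's built-in re module does not support this syntax.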
