I need to write a regular expression that defines a language consisting of all strings of 0s and 1s that contains exactly one substring 011.
This is my approach: 1* 0* (10*)* 011(1)* (0 10)*. I do not know how to fix this, also there is probably more efficient way to solve this..
011 Accept
111000101001011 Accept
110011 Accept
1101011 Accept
110011010 Accept
011011 Accept //it is accepting this one, which is wrong
111111 Reject
101011 Accept
0001011 Accept
CodePudding user response:
Attempt without lookarounds:
^(?>[01]*?011)([01]*?011)? [01]*(?(1)(*FAIL)|)$
Explanation:
^ Match the start of the string
(?>[01]*?011) Match everything up to and including the first 011, without allowing backtracking
([01]*?011)? Match everything up to and including the second 011,
[01]* Match remaining 1's and 0's
(?(1)(*FAIL)|) If the second 011 was found, report a failure
$ Match the end of the string
CodePudding user response:
If supported, you can use a negative lookahead asserting not 2 repetitions of 001.
If that is the case, match 011
between optional repetitions of either 0 or 1 using a character class [01]*
^(?![01]*?011[01]*?011)[01]*011[01]*$
See a demo on regex101 for the matches.