Home > Blockchain >  Why don't regular expression engines support all set operations?
Why don't regular expression engines support all set operations?

Time:10-02

Similar to this question on a private forum: FA construction showing closure under intersection Citation:

Scott Aaronson. 6.045J Automata, Computability, and Complexity. Spring 2011. Massachusetts Institute of Technology: MIT OpenCourseWare, https://ocw.mit.edu. License: Creative Commons BY-NC-SA.

CodePudding user response:

Regex engines that support lookarounds let you implement intersection, complement and difference :

  • (?=pattern1)pattern2 will match a string that both pattern1 and pattern2 match
  • (?!pattern).* will match anything that isn't matched by pattern (although more realistically you'd use pattern as a regex and have your higher-level environment reverse the match result)
  • (?!pattern1)pattern2 will match a string that is matched by pattern2 but not by pattern1

However lookarounds are a rather recent feature in the history of regular expressions and still aren't supported by many regex engines. Why is that?
I'm not well versed in regex's history, but if I believe a cursory glance at Wikipedia articles, they originate from mathematician Stephen Cole Kleene's definition of regular languages which is only based on the union, concatenation and Kleene star operations, which might explain why those are the basic operations in regular expressions.

  • Related