re.escape() equivalent in Julia?

Time:10-17

I have a bunch of abbreviations I'd like to use in RegEx matches, but they contain lots of regex reserved characters (like . ? $). In Python you're able to return an escaped (regex safe) string using re.escape. For example:

re.escape("Are U.S. Pythons worth any $$$?") will return 'Are\\ U\\.S\\.\\ Pythons\\ worth\\ any\\ \\$\\$\\$\\?'

From my (little) experience with Julia so far, I can tell there's probably a much more straightforward way of doing this in Julia, but I couldn't find any previous answers on the topic.

CodePudding user response:

Julia uses the PCRE2 library underneath, and uses PCRE2's regex-quoting syntax (\Q ... \E) to automatically escape special characters when you join a Regex with a normal String. For example:

julia> r"\w \s*" * raw"Are U.S. Pythons worth any $$$?"
r"(?:\w \s*)\QAre U.S. Pythons worth any $$$?\E"

Here we've used a raw string to make sure that none of the characters are interpreted as special, including the $s.
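To check that the quoted part really is matched literally, here is a minimal sketch. The `^\s*` prefix and the two test strings are made up for illustration and are not from the example above.

```julia
# Joining a Regex with a String quotes the String part with \Q...\E.
# The prefix r"^\s*" is an arbitrary choice for this sketch.
pattern = r"^\s*" * raw"Are U.S. Pythons worth any $$$?"

# The dots, dollar signs and question mark are matched literally.
hit  = occursin(pattern, raw"  Are U.S. Pythons worth any $$$?")

# If "." were still a wildcard, "UXSX" would match; it does not.
miss = occursin(pattern, raw"  Are UXSX Pythons worth any $$$?")

println(hit)
println(miss)
```

Here `hit` should be true and `miss` should be false, since `.` no longer acts as a wildcard inside the quoted region.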

If we need interpolation, we can use a normal String literal instead. In this case, the interpolation is done first, and the result is then quoted with \Q ... \E.

julia> snake = "Python"
"Python"

julia> r"\w \s*" * "Are U.S. $snake worth any money?"
r"(?:\w \s*)\QAre U.S. Python worth any money?\E"

So you can place the part of the regex you wish to be quoted in a normal String, and it'll be quoted automatically when you join it up with a Regex.
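If you want something closer in spirit to re.escape, one option is to join the string with an empty regex. This is only a sketch: `literal_regex` is a name I made up, not a function from Julia's standard library, and it returns a Regex rather than an escaped string.

```julia
# Hypothetical helper: build a Regex that matches `s` literally.
# Relies on Regex * String concatenation (available since Julia 1.3).
literal_regex(s::AbstractString) = r"" * s

abbrev = "U.S."              # illustrative value, e.g. read at runtime
re = literal_regex(abbrev)

found     = occursin(re, "Shipping within the U.S. is free")
not_found = occursin(re, "Shipping within the UKSA is free")

println(found)
println(not_found)
```

The second check should come out false, because the dots in the abbreviation are matched as literal dots rather than as "any character".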

You can even do it directly within the regex yourself - \Q starts a region where none of the regex-special characters are interpreted as special, and \E ends that region. Everything within such a region is treated literally by the regex engine.
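For the original use case (a bunch of abbreviations), writing \Q ... \E yourself makes it possible to combine several literals into one alternation. A rough sketch follows, with an invented abbreviation list and variable names, and assuming none of the abbreviations contains the sequence \E itself.

```julia
# \Q...\E written directly in a regex literal.
direct = occursin(r"\QU.S.\E", "the U.S. economy")

# Hypothetical list of abbreviations; the values are illustrative only.
abbrevs = ["U.S.", "e.g.", "\$\$\$?"]

# Wrap each abbreviation in \Q...\E, then join the pieces with | to form
# an alternation. Assumes no abbreviation contains the sequence \E.
quoted = ["\\Q" * a * "\\E" for a in abbrevs]
abbrev_re = Regex(join(quoted, "|"))

text = "In the U.S., e.g., it costs \$\$\$?"
found = [String(m.match) for m in eachmatch(abbrev_re, text)]

println(direct)
println(found)
```

The pieces are joined as plain strings here because a "|" joined onto a Regex with * would itself be quoted and matched as a literal bar.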
