As of version 4.0, R supports a special syntax for raw strings, but how can it be used in tandem with string interpolation? That would be very useful for passing raw regular expressions, e.g., 123\b instead of 123\\b. I've tried using glue:
> library(stringr)
> library(glue)
> tmp = "123\b"
> str_detect("123 4", glue(r"[{tmp}]"))
[1] FALSE
Using a raw string directly does work:
> str_detect("123 4", r"[123\b]")
[1] TRUE
CodePudding user response:
The problem here is that after tmp is defined, it is too late to have the \b interpreted as a literal sequence of characters. The character string is stored internally as the byte sequence 31 32 33 08, not the byte sequence 31 32 33 5c 62, which is what you would need for your example to work.
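To see the difference at the byte level, here is a minimal base-R sketch that compares the three ways of writing the literal. It assumes R >= 4.0 for the raw-string syntax, and the variable names are illustrative only:

```r
# Assumes R >= 4.0 (raw-string syntax). Variable names are illustrative.
escaped <- "123\b"      # \b is parsed as a single backspace character
doubled <- "123\\b"     # escaped backslash: a backslash followed by "b"
raw_lit <- r"[123\b]"   # raw string: the backslash is kept as written

charToRaw(escaped)
#> [1] 31 32 33 08
charToRaw(raw_lit)
#> [1] 31 32 33 5c 62
identical(raw_lit, doubled)
#> [1] TRUE
nchar(c(escaped, raw_lit))
#> [1] 4 5
```

The raw string and the doubled-backslash string are the same five characters; only the plain "123\b" literal collapses the escape into one byte when it is parsed.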
If you have existing character strings you wish to use in this way, you need to convert the escape sequences back into literal backslash-character pairs before you use them. One fairly hacky way to do this is to use the console's printing method itself.
As you showed yourself, this doesn't work:
tmp <- "123\b"
charToRaw(tmp)
#> [1] 31 32 33 08
stringr::str_detect("123 4", tmp)
#> [1] FALSE
But if we write a little wrapper around capture.output, we can get the characters that R needs to replicate the original intended string:
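# Print x without quotes, capture the console line (e.g. "[1] 123\b"),
# then drop the 4-character "[1] " prefix. Only suitable for a single string.
f <- function(x) substr(capture.output(noquote(x)), 5, 1e4)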
charToRaw(f(tmp))
#> [1] 31 32 33 5c 62
stringr::str_detect("123 4", f(tmp))
#> [1] TRUE
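As a possible alternative to parsing console output, base R's encodeString() escapes a string the same way printing does, so it may serve the same purpose without the substr step. This is a sketch rather than part of the original answer; f2 is a hypothetical name, and grepl with perl = TRUE stands in for stringr::str_detect to keep the example dependency-free:

```r
# Alternative sketch using base R's encodeString(); f2 is a hypothetical
# name, not part of the original answer.
f2 <- function(x) encodeString(x)

tmp <- "123\b"
charToRaw(f2(tmp))
#> [1] 31 32 33 5c 62
grepl(f2(tmp), "123 4", perl = TRUE)
#> [1] TRUE
```

Like f, this also doubles any literal backslash already present in the string, so it is only appropriate when the input was written with single-backslash escapes that were meant literally.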
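So the function f can be thought of as a way of recovering the literal text of the escape sequences. The new raw string syntax can't really help here, because it only affects how a string literal in source code is parsed, not a string that already exists.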
Created on 2021-10-24 by the reprex package (v2.0.0)
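If you control where the pattern is written, the conversion step can be avoided by defining the regex fragment as a raw literal in the first place and interpolating afterwards. A minimal sketch, assuming R >= 4.0 and using base paste0 in place of glue (the names prefix and pattern are illustrative):

```r
# Assumes R >= 4.0. `prefix` and `pattern` are illustrative names.
prefix  <- "123"
pattern <- paste0(prefix, r"[\b]")   # the raw literal keeps the backslash

charToRaw(pattern)
#> [1] 31 32 33 5c 62
grepl(pattern, "123 4", perl = TRUE)
#> [1] TRUE
grepl(pattern, "1234", perl = TRUE)
#> [1] FALSE
```

Interpolation only inserts the value that is already stored, so the same approach should carry over to glue as long as the interpolated variable was itself created from a raw or doubled-backslash literal.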