I'm relatively new to regex, so bear with me if the question is trivial. I'd like to place a comma between every letter of a string using regex, e.g.:
x <- "ABCD"
I want to get
"A,B,C,D"
It would be nice if I could do that using gsub, sub, or a related function on a vector of strings with an arbitrary number of characters.
I tried
> sub("(\\w)", "\\1,", x)
[1] "A,BCD"
> gsub("(\\w)", "\\1,", x)
[1] "A,B,C,D,"
> gsub("(\\w)(\\w{1})$", "\\1,\\2", x)
[1] "ABC,D"
CodePudding user response:
Try:
x <- 'ABCD'
gsub('\\B', ',', x, perl = TRUE)
Prints:
[1] "A,B,C,D"
I might have misread the question; the OP is looking to add commas between letters only. Therefore try:
gsub('(\\p{L})(?=\\p{L})', '\\1,', x, perl = TRUE)
(\p{L}) - Match any kind of letter from any language and capture it in a 1st group;
(?=\p{L}) - Positive lookahead asserting that another letter follows.
We can use a backreference to this capture group (\1) in the replacement.
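To make the difference between the two patterns concrete, here is a minimal sketch on a vector of strings. The sample vector is made up for illustration; the point is that \B also splits between digits (they count as word characters), while the \p{L} version only splits between two letters.

```r
# Hypothetical sample vector: strings of varying length, one containing digits
x <- c("ABCD", "xy", "Z", "AB12")

# \B matches between any two adjacent word characters, digits included
word_chars <- gsub("\\B", ",", x, perl = TRUE)

# (\p{L})(?=\p{L}) only matches a letter that is followed by another letter
letters_only <- gsub("(\\p{L})(?=\\p{L})", "\\1,", x, perl = TRUE)

print(word_chars)
# [1] "A,B,C,D" "x,y"     "Z"       "A,B,1,2"
print(letters_only)
# [1] "A,B,C,D" "x,y"     "Z"       "A,B12"
```

Both calls are vectorized, so no explicit loop over the strings is needed.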
CodePudding user response:
You can use
> gsub("(.)(?=.)", "\\1,", x, perl=TRUE)
[1] "A,B,C,D"
The (.)(?=.) regex matches any char, capturing it into Group 1 (with (.)), that must be followed by any single char: (?=.) is a positive lookahead that requires a char immediately to the right of the current location.
Variations of the solution:
> gsub("(.)(?!$)", "\\1,", x, perl=TRUE)
## Or with stringr:
## stringr::str_replace_all(x, "(.)(?!$)", "\\1,")
[1] "A,B,C,D"
Here, (?!$) is a negative lookahead that fails the match if the current position is the end of the string.
See the R demo online:
x <- "ABCD"
gsub("(.)(?=.)", "\\1,", x, perl=TRUE)
# => [1] "A,B,C,D"
gsub("(.)(?!$)", "\\1,", x, perl=TRUE)
# => [1] "A,B,C,D"
stringr::str_replace_all(x, "(.)(?!$)", "\\1,")
# => [1] "A,B,C,D"
CodePudding user response:
A regex-free alternative:
paste(strsplit(x, "")[[1]], collapse = ",")
#[1] "A,B,C,D"
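The one-liner above handles a single string because of the [[1]]. Since the question asks about a vector of strings, one possible way to extend it is sketched below (the input vector is a made-up example):

```r
# Hypothetical input vector with strings of different lengths
x <- c("ABCD", "XY", "Z")

# strsplit() returns one character vector per input string;
# vapply() pastes each of them back together with commas
res <- vapply(strsplit(x, ""), paste, character(1), collapse = ",")

print(res)
# [1] "A,B,C,D" "X,Y"     "Z"
```

vapply() is used here rather than sapply() so that the result is guaranteed to be a character vector with one element per input string.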
CodePudding user response:
Another option is to use a positive lookbehind and a positive lookahead to assert that there is both a preceding and a following character:
library(stringr)
str_replace_all(x, "(?<=.)(?=.)", ",")
[1] "A,B,C,D"
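The same lookaround pattern should also work without stringr, assuming perl = TRUE is passed to gsub so that lookbehind is supported. A minimal sketch on a made-up vector:

```r
# Hypothetical input vector with strings of different lengths
x <- c("ABCD", "XY", "Z")

# Zero-width match: a position with a character on both sides.
# Nothing is captured, so the replacement is just the comma.
res <- gsub("(?<=.)(?=.)", ",", x, perl = TRUE)

print(res)
```

Because the match is zero-width, single-character strings such as "Z" are left unchanged.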