Regex to add comma between any character


I'm relatively new to regex, so bear with me if the question is trivial. I'd like to place a comma between every letter of a string using regex, e.g.:

x <- "ABCD"

I want to get

"A,B,C,D"

Ideally this would work with gsub, sub, or a related function on a vector of strings with an arbitrary number of characters.

I tried

> sub("(\\w)", "\\1,", x)
[1] "A,BCD"
> gsub("(\\w)", "\\1,", x)
[1] "A,B,C,D,"
> gsub("(\\w)(\\w{1})$", "\\1,\\2", x)
[1] "ABC,D"

CodePudding user response:


x <- 'ABCD'
gsub('\\B', ',', x, perl = TRUE)


[1] "A,B,C,D"

I might have misread the question; the OP wants to add commas between letters only. In that case, try:

gsub('(\\p{L})(?=\\p{L})', '\\1,', x, perl = TRUE)
  • (\p{L}) - Match any kind of letter from any language and capture it in group 1;
  • (?=\p{L}) - Positive lookahead asserting that another letter immediately follows, without consuming it.

The replacement \1, uses the backreference to write the captured letter back, followed by a comma.
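To see how the two patterns differ, here is a small sketch on made-up inputs (the vector xs is only for illustration and is not from the question). \B also splits around digits and underscores because they count as word characters, while the \p{L} version only splits between two letters.

```r
# Illustrative inputs (hypothetical, not from the question)
xs <- c("ABCD", "AB12", "a_b")

# \B matches, among other places, between two adjacent word characters
# (letters, digits, underscore), so digits and "_" get separated too
by_word <- gsub("\\B", ",", xs, perl = TRUE)
by_word
# [1] "A,B,C,D" "A,B,1,2" "a,_,b"

# \p{L} restricts the insertion to positions between two letters
by_letter <- gsub("(\\p{L})(?=\\p{L})", "\\1,", xs, perl = TRUE)
by_letter
# [1] "A,B,C,D" "A,B12"   "a_b"
```

Which one is right depends on whether "letter" in the question should include digits and underscores.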

CodePudding user response:

You can use

> gsub("(.)(?=.)", "\\1,", x, perl=TRUE)
[1] "A,B,C,D"

The (.)(?=.) regex matches any char other than a line break, capturing it into Group 1 with (.), provided it is followed by another char: (?=.) is a positive lookahead that requires a char immediately to the right of the current location.

Variations of the solution:

> gsub("(.)(?!$)", "\\1,", x, perl=TRUE)
## Or with stringr:
## stringr::str_replace_all(x, "(.)(?!$)", "\\1,")
[1] "A,B,C,D"

Here, (?!$) is a negative lookahead that fails the match if the current position is the end of the string, so no comma is added after the last char.

R demo:

x <- "ABCD"
gsub("(.)(?=.)", "\\1,", x, perl=TRUE)
# => [1] "A,B,C,D"
gsub("(.)(?!$)", "\\1,", x, perl=TRUE)
# => [1] "A,B,C,D"
stringr::str_replace_all(x, "(.)(?!$)", "\\1,")
# => [1] "A,B,C,D"
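Since the question asks about a vector of strings of arbitrary length, here is a quick sketch of the same call on several inputs at once. The vector xs is made up for illustration; gsub is vectorized over its input, so no loop is needed.

```r
# Illustrative inputs of different lengths, including a single char and ""
xs <- c("ABCD", "XY", "Z", "")

# Each element is processed independently; strings with fewer than
# two characters have no char followed by another char, so they are unchanged
res <- gsub("(.)(?=.)", "\\1,", xs, perl = TRUE)
res
# [1] "A,B,C,D" "X,Y"     "Z"       ""
```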

CodePudding user response:

A non-regex answer for a single string:

paste(strsplit(x, "")[[1]], collapse = ",")
#[1] "A,B,C,D"
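The [[1]] above picks out only the first string. For a whole vector, one possible sketch (xs is a made-up example) is to paste each split result separately with vapply:

```r
# Illustrative vector (hypothetical inputs)
xs <- c("ABCD", "XY", "Z")

# strsplit returns one character vector per input string;
# paste each of them back together with "," as the separator
joined <- vapply(strsplit(xs, ""), paste, character(1), collapse = ",")
joined
# [1] "A,B,C,D" "X,Y"     "Z"
```

vapply is used here rather than sapply only so that the result type is fixed to one string per input.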

CodePudding user response:

Another option is to use a positive lookbehind and a positive lookahead to assert that there is both a preceding and a following character, and replace that empty position with a comma:

stringr::str_replace_all(x, "(?<=.)(?=.)", ",")
[1] "A,B,C,D"
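If stringr is not available, the same lookaround pattern should also work in base R, assuming perl = TRUE is set so that lookbehind is supported. A minimal sketch:

```r
x <- "ABCD"

# Zero-width match: a position with a char on both sides; nothing is
# captured, so the replacement is just the comma
out <- gsub("(?<=.)(?=.)", ",", x, perl = TRUE)
out
# [1] "A,B,C,D"
```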