Regex to add comma between any character

Time:11-01

I'm relatively new to regex, so bear with me if the question is trivial. I'd like to place a comma between every letter of a string using regex, e.g.:

x <- "ABCD"

I want to get

"A,B,C,D"

It would be nice if I could do that using gsub, sub, or a related function on a vector of strings of arbitrary length.

I tried

> sub("(\\w)", "\\1,", x)
[1] "A,BCD"
> gsub("(\\w)", "\\1,", x)
[1] "A,B,C,D,"
> gsub("(\\w)(\\w{1})$", "\\1,\\2", x)
[1] "ABC,D"

CodePudding user response:

Try:

x <- 'ABCD'
gsub('\\B', ',', x, perl = TRUE)

Prints:

[1] "A,B,C,D"

I might have misread the query; the OP is looking to add commas between letters only. Therefore, try:

gsub('(\\p{L})(?=\\p{L})', '\\1,', x, perl = TRUE)
  • (\p{L}) - Match any kind of letter from any language and capture it in the 1st group;
  • (?=\p{L}) - Positive lookahead asserting that another letter follows, without consuming it.

We can use the backreference to this capture group in the replacement.
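
As a rough illustration of the letters-only pattern above, the sketch below runs it on a vector that also contains digits and a space. The input vector is a made-up example, not taken from the question.

```r
# Hypothetical input: letters mixed with digits and a space
# (chosen for illustration, not from the original question).
x <- c("ABCD", "AB 12", "a1b")

# Insert a comma only after a letter that is directly followed by another letter.
# \\1 in the replacement puts the captured letter back before the comma.
res <- gsub("(\\p{L})(?=\\p{L})", "\\1,", x, perl = TRUE)

print(res)
# => "A,B,C,D", "A,B 12", "a1b"
```

Digits and spaces are left alone because neither the capture group nor the lookahead accepts them.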

CodePudding user response:

You can use

> gsub("(.)(?=.)", "\\1,", x, perl=TRUE)
[1] "A,B,C,D"

The (.)(?=.) regex matches any char, capturing it into Group 1 (with (.)), that must be followed by any single char: (?=.) is a positive lookahead that requires a char immediately to the right of the current location.
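
Since the question asks about a vector of strings of arbitrary length, here is a small sketch of how the (.)(?=.) pattern behaves on a few edge cases. The input vector is an assumption made up for illustration.

```r
# Hypothetical inputs: different lengths, a single char, an empty string,
# and a string containing a space.
x <- c("ABCD", "xy", "Z", "", "A B")

# gsub is vectorised over x, so every element is processed independently.
res <- gsub("(.)(?=.)", "\\1,", x, perl = TRUE)

print(res)
# => "A,B,C,D", "x,y", "Z", "", "A, ,B"
```

A single char and the empty string come back unchanged because the lookahead never finds a following char, and the space in "A B" is treated like any other char.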

Variations of the solution:

> gsub("(.)(?!$)", "\\1,", x, perl=TRUE)
[1] "A,B,C,D"
## Or with stringr:
## stringr::str_replace_all(x, "(.)(?!$)", "\\1,")

Here, (?!$) is a negative lookahead that fails the match if the captured char is immediately followed by the end of the string.

See the R demo online:

x <- "ABCD"
gsub("(.)(?=.)", "\\1,", x, perl=TRUE)
# => [1] "A,B,C,D"
gsub("(.)(?!$)", "\\1,", x, perl=TRUE)
# => [1] "A,B,C,D"
stringr::str_replace_all(x, "(.)(?!$)", "\\1,")
# => [1] "A,B,C,D"

CodePudding user response:

A friendly non-regex answer:

paste(strsplit(x, "")[[1]], collapse = ",")
#[1] "A,B,C,D"
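
The line above only handles the first element of x because of the [[1]]. A possible way to extend the same idea to a whole vector, sketched here with a made-up input, is to paste each split element separately:

```r
# Hypothetical input vector (not from the original question).
x <- c("ABCD", "xy", "Z")

# strsplit returns one character vector per input string;
# vapply pastes each of them back together with commas.
res <- vapply(strsplit(x, ""), paste, character(1), collapse = ",")

print(res)
# => "A,B,C,D", "x,y", "Z"
```

vapply is used rather than sapply so the result is guaranteed to be a character vector with one element per input string.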

CodePudding user response:

Another option is to use a positive lookbehind and a positive lookahead to assert that there is both a preceding and a following character:

library(stringr)
str_replace_all(x, "(?<=.)(?=.)", ",")
[1] "A,B,C,D"