I'd like to add parentheses around comma-separated groups of text using stringr. If a piece of text is separated by one or more commas, I'd like parentheses around the whole group. There will always be a "=" before this type of string begins, and after it there will be either a space or nothing (the string ends). Is there a generalized way to do this? Here's a sample problem:
Sample:
a <- data.frame(Rule = c("A=0 & B=Grp1,Grp2", "A=0 & B=Grp1,Grp3,Grp4 & C=1"))
a
Rule
1 A=0 & B=Grp1,Grp2
2 A=0 & B=Grp1,Grp3,Grp4 & C=1
Desired Output:
Rule
1 A=0 & B=(Grp1,Grp2)
2 A=0 & B=(Grp1,Grp3,Grp4) & C=1
CodePudding user response:
Here is another potential solution. I have altered the example input to show that it works with multiple "Grp's" per line:
library(stringr)
a <- data.frame(Rule = c("A=0 & B=Grp1,Grp2",
"A=0 & B=Grp1,Grp3,Grp4 & C=1 & D=Grp5,Grp6"))
str_replace_all(a$Rule, "=([^, &]+,[^ $]+)", "=(\\1)")
#> [1] "A=0 & B=(Grp1,Grp2)"
#> [2] "A=0 & B=(Grp1,Grp3,Grp4) & C=1 & D=(Grp5,Grp6)"
Created on 2022-11-23 by the reprex package (v2.0.1)
Explanation:
regex = "=([^, &]+,[^ $]+)", "=(\\1)"
=(
starting with an equals sign, capture a group
[^, &]+,
with one or more characters that aren't ",", " ", or "&", followed by a comma
[^ $]+)
followed by one or more characters that aren't " " or a literal "$" (inside a character class, "$" is not an end-of-line anchor; the match simply stops at the next space or at the end of the string)
=(\\1)
then replace the match with an equals sign followed by the captured group (e.g. Grp1,Grp2) wrapped in parentheses
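As a minimal sketch of how the pattern explained above could be applied back to the data frame from the question, the replacement can be wrapped in a small helper. The name `add_group_parens` is hypothetical and used only for illustration; the pattern itself is the one from this answer.

```r
library(stringr)

# Hypothetical helper: wraps comma-separated values that follow "=" in parentheses.
add_group_parens <- function(x) {
  # "=" , a token without ",", " " or "&", a comma, then everything up to
  # the next space (or the end of the string)
  str_replace_all(x, "=([^, &]+,[^ $]+)", "=(\\1)")
}

a <- data.frame(Rule = c("A=0 & B=Grp1,Grp2",
                         "A=0 & B=Grp1,Grp3,Grp4 & C=1"))

# Overwrite the column in place; single values such as "A=0" or "C=1"
# contain no comma, so they are left unchanged.
a$Rule <- add_group_parens(a$Rule)
a$Rule
```

Because the pattern requires a comma inside the captured group, rules that contain only single values pass through untouched.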
CodePudding user response:
This should work:
Find: (([A-Za-z\d]+,)+[A-Za-z\d]+)
Replace: ($1)
Explanation:
[A-Za-z\d]
is any alphanumeric character.
The inner group looks for one or more runs of alphanumeric characters, each followed by a comma (e.g. Abcd1,Abcd2,).
The outer group then looks for the closing alphanumeric run, which doesn't have a comma after it (e.g. Abcd3).
These are concatenated, and the outer group captures the whole match.
The last step is the replacement, which wraps the captured text in parentheses.
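The find/replace pair in this answer is written in generic regex notation. A rough translation to stringr, assuming the question's sample input, might look like the sketch below. Two details differ in R: the backslash in `\d` has to be doubled inside an R string, and stringr documents backreferences in the replacement as `\\1` rather than `$1`.

```r
library(stringr)

rules <- c("A=0 & B=Grp1,Grp2", "A=0 & B=Grp1,Grp3,Grp4 & C=1")

# One or more "alphanumerics + comma" runs, then a final alphanumeric run.
pattern <- "(([A-Za-z\\d]+,)+[A-Za-z\\d]+)"

# \\1 refers to the outer group, i.e. the whole comma-separated list.
result <- str_replace_all(rules, pattern, "(\\1)")
result
```

Unlike the first answer, this pattern does not anchor on "=", so it would also wrap comma-separated alphanumeric lists that appear elsewhere in the string.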