I am trying to improve the readability of automatically generated text based on a database query.
Is there a neat way to perform these substitutions? Can the following be done in 1 command instead of 6?
x<-c("Te( )st", "Test()", "Test ()", "Test ( )", "Test ,,", "Test,, ", "Test , ")
out<-c("Test", "Test", "Test", "Test", "Test,", "Test, ", "Test,")
x<-gsub(pattern = "( ", replacement = "(", x, fixed = T)
x<-gsub(pattern = " )", replacement = ")", x, fixed = T)
x<-gsub(pattern = " ,", replacement = ",", x, fixed = T)
x<-gsub(pattern = "()", replacement = "", x, fixed = T)
x<-gsub(pattern = ",,", replacement = ",", x, fixed = T)
x<-gsub(pattern = " ,", replacement = ",", x, fixed = T)
CodePudding user response:
You can use
x<-c("Te( )st", "Test()", "Test ()", "Test ( )", "Test ,,", "Test,, ", "Test , ")
gsub("\\(\\s*\\)|\\s+(?=[,)])|(?<=\\()\\s+|(,),+", "\\1", x, perl=TRUE)
# => [1] "Test" "Test" "Test " "Test " "Test," "Test, " "Test, "
Details:
\(\s*\) - a "(", zero or more whitespace characters and then a ")", or
\s+(?=[,)]) - one or more whitespace characters followed by either "," or ")", or
(?<=\()\s+ - one or more whitespace characters immediately preceded by a "(" char, or
(,),+ - a comma captured into Group 1 and then one or more commas.
The replacement is the Group 1 value: if Group 1 matched, the replacement is a single comma; otherwise, it is an empty string.
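To see what each alternative contributes, it can help to run the branches one at a time. The following is only a sketch: the input strings and the variable names (empty_parens, before_punct, after_open, comma_run) are made up for illustration and are not part of the answer above.

```r
# Each alternative of the pattern applied on its own.
# perl = TRUE is needed for the lookahead (?=...) and lookbehind (?<=...).

# 1. Empty parentheses, optionally with whitespace inside, are removed.
empty_parens <- gsub("\\(\\s*\\)", "", "Test ( )", perl = TRUE)

# 2. Whitespace directly before "," or ")" is removed.
before_punct <- gsub("\\s+(?=[,)])", "", "a , b )", perl = TRUE)

# 3. Whitespace directly after "(" is removed.
after_open <- gsub("(?<=\\()\\s+", "", "(  a)", perl = TRUE)

# 4. A run of two or more commas collapses to the captured single comma.
comma_run <- gsub("(,),+", "\\1", "a,,,b", perl = TRUE)

print(c(empty_parens, before_punct, after_open, comma_run))
```

In the combined pattern only the last branch sets Group 1, which is why "\\1" yields an empty string for the first three branches and a comma for the fourth.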
CodePudding user response:
You can use the multigsub function (from the qdap package), which is a wrapper around the gsub function in R.
Here's the code:
multigsub(c("( ", " )", " ,", "()", ",,", " ,"), c("(", ")", ",", "", ",", ","), x, fixed = TRUE)
CodePudding user response:
You can use mgsub::mgsub.
a = c("( ", " )", " ,", "()",",,") #pattern
b = c("(", ")", ",", "",",") #replacement
x<-c("Te( )st", "Test()", "Test ()", "Test ( )", "Test ,,", "Test,, ", "Test , ")
mgsub::mgsub(x, a, b, fixed = T)
#[1] "Te()st" "Test" "Test " "Test ()" "Test,," "Test, " "Test, "
You might want to add other patterns to get the output you want.
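If the six substitutions from the question need to run strictly one after another, a base R alternative is to fold over the pattern/replacement pairs with Reduce. This is a sketch of a different technique from the package-based answers, and apply_fixed_subs is a hypothetical helper name chosen here for illustration.

```r
x <- c("Te( )st", "Test()", "Test ()", "Test ( )", "Test ,,", "Test,, ", "Test , ")

# The same six pattern/replacement pairs as in the question, in the same order.
patterns     <- c("( ", " )", " ,", "()", ",,", " ,")
replacements <- c("(",  ")",  ",",  "",   ",",  ",")

# Hypothetical helper: applies each fixed-string substitution in turn,
# feeding the result of one gsub call into the next.
apply_fixed_subs <- function(text, patterns, replacements) {
  Reduce(
    function(acc, i) gsub(patterns[i], replacements[i], acc, fixed = TRUE),
    seq_along(patterns),
    text
  )
}

result <- apply_fixed_subs(x, patterns, replacements)
print(result)
```

Because each step sees the output of the previous one, the order of the pairs matters, exactly as it does in the six separate gsub calls.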