I would like to remove all words from a string that start with a letter followed by numbers and end with either a semicolon or a space. For example, given the string
x <- "Z1; D49; Pay-What-You-Want; A1; Moods; Weather; Social norms, K20"
The desired output is
Pay-What-You-Want; Moods; Weather; Social norms;
Thank you
CodePudding user response:
So let's make it a vector of strings, because that is easier to work with than a single character value.
# if commas should become semicolons, convert them first with gsub()
x <- gsub("[,]", ";", "Z1; D49; Pay-What-You-Want; A1; Moods; Weather; Social norms, K20")
# make it a vector
x2 <- trimws(scan(text=x, what="", sep=";"))
# If you want it to be one string (which seems odd, but is doable):
(x3 <- paste( x2[!grepl("^[[:alpha:]](\\d)+",x2)] , collapse="; ") )
#[1] "Pay-What-You-Want; Moods; Weather; Social norms"
# Or
(x4 <- x2[!grepl("^[[:alpha:]](\\d)+",x2)] )
#[1] "Pay-What-You-Want" "Moods" "Weather" "Social norms"
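To see what the pattern is doing, here is a small sketch that applies it to the already-split tokens. It assumes the tokens have been trimmed as above; the `$` anchor is my addition and not part of the answer's pattern, and "A1 extra" is a made-up token used only to show the difference.

```r
# Tokens as produced by the split-and-trim step above
tokens <- c("Z1", "D49", "Pay-What-You-Want", "A1", "Moods",
            "Weather", "Social norms", "K20")

# One letter followed by one or more digits, anchored at both ends
is_code <- grepl("^[[:alpha:]][[:digit:]]+$", tokens)

tokens[is_code]    # the code-like tokens that get dropped
tokens[!is_code]   # the tokens that are kept

# Without the trailing "$", a hypothetical token such as "A1 extra"
# would also be flagged, because only its beginning has to match
unanchored <- grepl("^[[:alpha:]][[:digit:]]+", "A1 extra")
anchored   <- grepl("^[[:alpha:]][[:digit:]]+$", "A1 extra")
```

Whether you want the anchored or the unanchored version depends on whether a keyword could legitimately start with something like "A1".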
CodePudding user response:
My understanding of your comments is that you have a character vector, each element of which is a semicolon-delimited string (with some commas). If that’s right, you can use stringr functions within sapply():
library(stringr)
sapply(
str_split(x, "(,|;)\\s+"),
\(.x) str_c(.x[!str_detect(.x, "^\\w\\d+$")], collapse = "; ")
)
# [1] "Pay-What-You-Want; Moods; Weather; Social norms"
Or using base R:
sapply(
strsplit(x, "(,|;)\\s+"),
\(.x) paste(.x[!grepl("^\\w\\d+$", .x)], collapse = "; ")
)
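If x really is a vector with several such strings, the base R version could be wrapped up roughly like this. This is only a sketch: `drop_codes` is a hypothetical helper name, the second element of the example vector is invented for illustration, and I've used `vapply()` and a slightly stricter pattern (`[[:alpha:]]` instead of `\\w`) as my own choices.

```r
# Hypothetical helper: drop "letter + digits" codes from each element of x
drop_codes <- function(x) {
  # Split on a comma or semicolon plus any following whitespace
  parts <- strsplit(x, "[,;]\\s*")
  vapply(
    parts,
    function(p) paste(p[!grepl("^[[:alpha:]][[:digit:]]+$", p)], collapse = "; "),
    character(1)
  )
}

# First element is the string from the question; the second is made up
x <- c(
  "Z1; D49; Pay-What-You-Want; A1; Moods; Weather; Social norms, K20",
  "B2; Trust, C10; Fairness"
)

res <- drop_codes(x)
res
```

`vapply()` with `character(1)` guarantees a character vector of the same length as x, even when an element consists only of codes (that element then becomes an empty string).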