I need a regex for str_remove() that removes " - " and everything after it,
but only for strings that do not start with a number. For example, it should turn
"d_19 - blah"
into
"d_19"
but leave
"1 - blah"
unaffected.
CodePudding user response:
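Try this one. The \\K drops everything matched so far from the match, so only the " - ..." part is replaced:
x <- c("d_19 - blah", "1 - blah")
gsub('^\\D.*\\K\\s\\-.*', '', x, perl=TRUE)
# [1] "d_19" "1 - blah"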
CodePudding user response:
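In base R, capturing the part to keep:
string <- c("d_19 - blah", "1 - blah")
sub("^(\\D.*) -.*", "\\1", string)
# [1] "d_19" "1 - blah"
Using PCRE (perl = TRUE) in base R:
sub("^\\D.*\\K -.*", "", string, perl = TRUE)
# [1] "d_19" "1 - blah"
Using stringr::str_replace():
library(stringr)
str_replace(string, "^(\\D.*) -.*", "\\1")
# [1] "d_19" "1 - blah"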
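One thing to watch with these patterns: `.*` is greedy, so if a string contains " -" more than once, the cut happens at the last occurrence. If you want to cut at the first one instead, a lazy `.*?` should do it. A small sketch (the input strings here are made up for illustration, not from the question):

```r
# Hypothetical inputs containing the separator twice
x <- c("a - b - c", "1 - b - c")

# Greedy: the capture group extends up to the LAST " -"
greedy <- sub("^(\\D.*) -.*", "\\1", x, perl = TRUE)

# Lazy: the capture group stops before the FIRST " -"
lazy <- sub("^(\\D.*?) -.*", "\\1", x, perl = TRUE)

print(greedy)  # "a - b" "1 - b - c"
print(lazy)    # "a"     "1 - b - c"
```

Which one is right depends on what "everything after" means for your data; for the strings in the question both give the same result.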
CodePudding user response:
For me it is sometimes easier to read if the regex is not too complex, so I simplified the task by first excluding the elements that must stay untouched:
x <- c("d_19 - blah", "1 - blah")
# which elements do not start with a digit
no_leading_digit <- !grepl("^[0-9]", x)
# remove " - " and everything after it from those elements only
x[no_leading_digit] <- stringr::str_remove(x[no_leading_digit], " - .*")
x
#> [1] "d_19" "1 - blah"
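If you want the same two-step idea without the stringr dependency, it could be wrapped in a small base R helper. This is only a sketch: the function name `strip_after_dash` and the extra test strings are my own, not from the question. It matches " - " with the surrounding spaces, so a hyphen inside the name itself is left alone.

```r
# Hypothetical helper: remove " - " and everything after it,
# but only for elements that do not start with a digit
strip_after_dash <- function(x) {
  no_leading_digit <- !grepl("^[0-9]", x)
  x[no_leading_digit] <- sub(" - .*", "", x[no_leading_digit])
  x
}

# Made-up inputs: no separator at all, and a hyphen inside the name
x <- c("d_19 - blah", "1 - blah", "abc", "d-19 - blah")
result <- strip_after_dash(x)
print(result)  # "d_19" "1 - blah" "abc" "d-19"
```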