I'm trying to extract "A35-9B004" from "A35-9B004-65g3h" using the function sub in R but keep failing. I've tried regular expressions but can't figure out how to handle the two "-" in the string; I can only extract the first or the last segment.
Thank you!
x<-"A35-9B004-65g3h"
sub(".*-", "",x)
[1] "65g3h"
sub("*-.*", "", x)
[1] "A35"
CodePudding user response:
We could use a pattern that matches the - followed by one or more characters that are not a - ([^-]+) up to the end ($) of the string, and replace the match with blank (""):
sub("-[^-]+$", "", x)
[1] "A35-9B004"
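As a quick check of how this pattern behaves on other inputs, here is a small sketch. Only the first string comes from the question; the others are made up for illustration. The pattern drops just the last "-" segment, and a string without any "-" comes back unchanged.

```r
# Hypothetical inputs for illustration; only the first one is from the question
ids <- c("A35-9B004-65g3h", "A35-9B004", "A35", "A-B-C-D")

# Remove the final "-" and everything after it (i.e. the last segment only)
res <- sub("-[^-]+$", "", ids)
print(res)
```

Note that a two-segment input such as "A35-9B004" is shortened to "A35", so this approach assumes every input has the trailing segment you want to drop.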
Or use trimws, whose whitespace argument takes a regex:
trimws(x, whitespace = "-[^-]+")
[1] "A35-9B004"
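If the goal is instead to always keep the first two segments no matter how many follow, one possible variation (not part of the approaches above, just a sketch) is to capture them with a group and replace the whole string with that group. The second input below is made up for illustration.

```r
# First string is from the question; the second is a hypothetical longer ID
x2 <- c("A35-9B004-65g3h", "A-B-C-D")

# Capture "segment-segment" at the start, then discard whatever follows
first_two <- sub("^([^-]+-[^-]+).*$", "\\1", x2)
print(first_two)
```

Unlike removing the last segment, this leaves an input that already has exactly two segments untouched.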