I want to extract the number after the first underscore (_), but I don't know why only a single digit is selected.
My sample data is:
myvec<-c("increa_0_1-1","increa_9_25-112","increa_25-50-76" )
as.numeric(gsub("(.*_){1}(\\d)_.+", "\\2", myvec))
[1] 0 9 NA
Warning message:
NAs introduced by coercion
I'd like:
[1] 0 9 25
Can anyone help with this?
CodePudding user response:
library(stringr)
str_extract(myvec, "(?<=_)[0-9]+")
[1] "0" "9" "25"
CodePudding user response:
You can use sub (because you only need a single search-and-replace operation) with a pattern like ^[^_]*_(\d+).*:
myvec<-c("increa_0_1-1","increa_9_25-112","increa_25-50-76" )
sub("^[^_]*_(\\d+).*", "\\1", myvec)
# => [1] "0" "9" "25"
Regex details:
^ - start of string
[^_]* - a negated character class that matches zero or more chars other than _
_ - a _ char
(\d+) - Group 1 (\1 in the replacement pattern refers to the value captured into this group): one or more digits
.* - the rest of the string (. in TRE regex matches line break chars by default).
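To illustrate the details above, here is a minimal sketch comparing the anchored pattern with the one from the question. It assumes the question's pattern ended in .+ (the quantifier appears to have been lost in the post), and the names pattern, res and orig are only illustrative. The point it shows is that sub and gsub return the input unchanged when the pattern does not match, which would explain the NA after as.numeric.

```r
myvec <- c("increa_0_1-1", "increa_9_25-112", "increa_25-50-76")

# Pattern from this answer: capture all digits right after the first underscore
pattern <- "^[^_]*_(\\d+).*"
res <- sub(pattern, "\\1", myvec)
print(res)

# Assumed form of the question's pattern: it needs exactly one digit
# followed by a second underscore, so the third element cannot match
orig <- gsub("(.*_){1}(\\d)_.+", "\\2", myvec)

# A non-matching element is returned as-is, so as.numeric() later yields NA
print(orig)
```

Because the third element contains only one underscore and two digits after it, the single \d followed by _ never matches there.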
CodePudding user response:
myvec<-c("increa_0_1-1","increa_9_25-112","increa_25-50-76" )
as.numeric(gsub("[^_]*_(\\d+).*", "\\1", myvec))
[1] 0 9 25
CodePudding user response:
Another possible solution, based on stringr::str_extract:
library(stringr)
myvec<-c("increa_0_1-1","increa_9_25-112","increa_25-50-76" )
as.numeric(str_extract(myvec, "(?<=_)\\d+"))
#> [1] 0 9 25
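If you would rather avoid the stringr dependency, the same lookbehind idea can be sketched in base R with regexpr and regmatches. This is only one possible variant, not part of the answer above; the names m and nums are illustrative, and perl = TRUE is needed because the default TRE engine does not support lookbehind.

```r
myvec <- c("increa_0_1-1", "increa_9_25-112", "increa_25-50-76")

# Locate the first run of digits that is preceded by an underscore (PCRE lookbehind)
m <- regexpr("(?<=_)\\d+", myvec, perl = TRUE)

# Extract the matched substrings and convert them to numbers
nums <- as.numeric(regmatches(myvec, m))
print(nums)
```

Note that regmatches drops elements with no match, so the result can be shorter than the input when some strings contain no underscore followed by digits.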