Extract characters after the last appearance of a certain symbol in a vector

Time:05-02

I have a number of strings like

x <- c("1.22.33.444","11.22.333.4","1e.3e.3444.45", "g.78.in.89")

I would like to extract only the characters/digits that appear after the last ".". As far as I can tell from the data, each string has exactly four segments separated by ".", so there are three "." in each string.

Desired outcome

> x
[1] "444" "4"   "45"  "89" 

CodePudding user response:

A possible solution, using stringr::str_extract:

  • $ means the end of the string.
  • \\d+ means one or more numeric digits.
  • (?<=\\.) looks behind, to check that there is a dot immediately before the digits.

You can learn more at: Lookahead and lookbehind regex tutorial

library(stringr)

x <- c("1.22.33.444","11.22.333.4","1e.3e.3444.45", "g.78.in.89")

str_extract(x, "(?<=\\.)\\d+$")

#> [1] "444" "4"   "45"  "89"
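
Note that the pattern above matches digits only, so a final segment containing letters would come back as NA. If that can happen in your data, a minimal base R sketch is to match "everything that is not a dot" up to the end of the string instead. The vector x2 and the name last_seg are made up for illustration, and this assumes every string has a non-empty final segment (regmatches silently drops elements that do not match).

```r
# Hypothetical input: some final segments contain letters.
x2 <- c("g.78.in.89", "a.b.c.de", "1.2.3.4x")

# "[^.]+$" matches one or more non-dot characters anchored at the end,
# i.e. the whole last segment regardless of what it contains.
last_seg <- regmatches(x2, regexpr("[^.]+$", x2))

print(last_seg)
```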

CodePudding user response:

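We could use trimws from base R. The whitespace argument takes a regular expression and requires R 3.6.0 or later; here the greedy ".*\\." strips everything up to and including the last dot.

trimws(x, whitespace = ".*\\.")

#> [1] "444" "4"   "45"  "89"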
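
For comparison, here are two other base R ways to get the same result, offered as a sketch rather than as part of the answers above. One deletes the prefix with sub, the other splits on a literal dot and keeps the last piece. The names last_by_sub, parts and last_by_split are only illustrative.

```r
x <- c("1.22.33.444", "11.22.333.4", "1e.3e.3444.45", "g.78.in.89")

# Greedy ".*" consumes everything up to and including the last literal dot,
# so only the final segment is left after the substitution.
last_by_sub <- sub(".*\\.", "", x)

# Split each string on a literal dot (fixed = TRUE disables regex matching)
# and keep the final piece of each split.
parts <- strsplit(x, ".", fixed = TRUE)
last_by_split <- vapply(parts, function(p) p[length(p)], character(1))

print(last_by_sub)
print(last_by_split)
```

Since the question states that every string has exactly four segments, the split version could also index the fourth piece directly. Taking the last piece is the safer choice if that assumption ever fails.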