Use gsub to remove all elements before and after numeric characters

Time:10-20

I'd like to use gsub to remove characters from a filename.

In the example below, the desired output is 23:

digs = "filepath/23-00.xlsx"

I can remove everything before 23 as follows:

gsub("^\\D+", "", digs)
[1] "23-00.xlsx"

or everything after:

gsub("\\-\\d+\\.xlsx$", "", digs)
[1] "filepath/23"

How do I do both at the same time?

CodePudding user response:

We could use | (OR): match everything up to and including the last / (.*/), or match the - and everything after it (-.*), and replace each match with an empty string ("").

gsub(".*/|-.*", "", digs)
[1] "23"
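
Because gsub is vectorized, the same pattern can be applied to a whole vector of file names at once. A minimal sketch, assuming every name follows the same "directory/number-suffix" layout as in the question (the paths below are made up for illustration):

```r
# Hypothetical file paths, invented for this example
paths <- c("filepath/23-00.xlsx", "data/7-12.xlsx", "a/b/105-01.xlsx")

# ".*/" removes everything up to and including the last "/";
# "-.*" removes the "-" and everything after it
ids <- gsub(".*/|-.*", "", paths)
ids
# [1] "23"  "7"   "105"

# gsub returns character strings; convert if numbers are needed
as.integer(ids)
# [1]  23   7 105
```

Note that the result is a character vector, so the as.integer step is only needed when the value will be used as a number.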

Or just use parse_number from the readr package, which returns a number rather than a string:

readr::parse_number(digs)
[1] 23

CodePudding user response:

You can use a single sub call with a capturing group:

sub("^\\D+(\\d+).*", "\\1", digs)
# => [1] "23"

Details:

  • ^ - start of string
  • \D+ - one or more non-digit chars
  • (\d+) - Group 1 (\1 refers to this group value): one or more digits
  • .* - zero or more of any character, as many as possible (the rest of the string).
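
One thing to keep in mind with the capture-group approach: sub returns the input unchanged when the pattern does not match. The sketch below illustrates this with a few made-up inputs (only the first comes from the question), and shows a variant using \D* (zero or more non-digits) that may be preferable if a name can start with the digits directly. The variant is a suggestion, not part of the answer above.

```r
# Hypothetical inputs; only the first one is from the question
x <- c("filepath/23-00.xlsx", "23-00.xlsx", "no_digits.xlsx")

# Pattern from the answer: needs at least one non-digit before the digits
strict <- sub("^\\D+(\\d+).*", "\\1", x)
strict
# [1] "23"             "23-00.xlsx"     "no_digits.xlsx"

# Variant with \\D* so that a string starting with digits also matches
loose <- sub("^\\D*(\\d+).*", "\\1", x)
loose
# [1] "23"             "23"             "no_digits.xlsx"
```

In both cases a string without any digits comes back untouched, so it is worth checking the result (for example with grepl("^\\d+$", loose)) before converting it to a number.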