I have a file named "test_result_20210930.xlsx". I would like to extract "20210930" into a new variable, date. How should I do that? I think I can use pattern="[0-9]+".
What if the file name contains more numbers, and I only want the part that stands for the date (8 digits together)?
Any suggestions?
CodePudding user response:
Using gsub with \\D+, which matches one or more non-digits, and specifying blank ("") as the replacement:
gsub("\\D+", "", str1)
[1] "20210930"
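One thing worth noting about this approach: it keeps every digit in the string, so it only gives the date when the date is the sole number in the file name. A small sketch of that behaviour (the second file name is just an illustrative input, not from the question):

```r
# Stripping non-digits keeps ALL digits, so extra numbers get glued on.
single_number <- "test_result_20210930.xlsx"
extra_number  <- "test_result_20210930_01.xlsx"  # illustrative input

only_date   <- gsub("\\D+", "", single_number)
with_suffix <- gsub("\\D+", "", extra_number)

print(only_date)    # "20210930"
print(with_suffix)  # "2021093001" (the date plus the trailing "01")
```

That is why the patterns below target the 8-digit run specifically.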
If the file name also contains other digits and we want to return only the 8 digits:
sub(".*_(\\d{8})_.*", "\\1", "test_result_20210930_01.xlsx")
[1] "20210930"
Or use str_extract
library(stringr)
str_extract("test_result_20210930_01.xlsx", "(?<=_)\\d{8}(?=_)")
[1] "20210930"
If we need to automatically convert to a date-time object:
library(parsedate)
parse_date(str1)
[1] "2021-09-30 UTC"
The data used above:
str1 <- "test_result_20210930.xlsx"
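If you'd rather avoid an extra package, a minimal base R sketch is to extract the 8 digits first and then hand them to as.Date with an explicit format. This assumes the digits are laid out as YYYYMMDD, which matches the example but is an assumption about your files:

```r
str1 <- "test_result_20210930.xlsx"

# Pull out the first run of exactly 8 consecutive digits (base R only).
date_str <- regmatches(str1, regexpr("[0-9]{8}", str1))

# Convert to a Date, assuming the layout is YYYYMMDD.
date <- as.Date(date_str, format = "%Y%m%d")

print(date_str)      # "20210930"
print(format(date))  # "2021-09-30"
```

as.Date returns NA for strings that don't fit the format, so it's worth checking for NA if some file names might not contain a valid date.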
CodePudding user response:
You can also use str_extract from the stringr package to obtain the desired result.
library(stringr)
str_extract("test_result_20210930.xlsx", "[0-9]{8}")
# [1] "20210930"
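A caveat on "[0-9]{8}": it matches the first 8 digits it finds, even when they are part of a longer number. If that can happen in your file names, one option is to require that the 8 digits are not adjacent to another digit. A sketch in base R with Perl-style lookarounds (the file name here is made up for illustration):

```r
# Illustrative file name with a longer number before the date.
fname <- "id1234567890_20210930.xlsx"

# Plain {8} grabs the first 8 digits of the longer run.
loose <- regmatches(fname, regexpr("[0-9]{8}", fname))

# Lookarounds require no digit immediately before or after the 8 digits.
strict <- regmatches(
  fname,
  regexpr("(?<![0-9])[0-9]{8}(?![0-9])", fname, perl = TRUE)
)

print(loose)   # "12345678"
print(strict)  # "20210930"
```

Note that regmatches drops elements with no match when used with regexpr, so on a vector of file names the result can be shorter than the input.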