I need to extract a specific date from a set of filenames. I found that the following code can help with this:
dates <- unique(gsub(pattern = "xxxxxx", replacement = "xxxx", x = filenames))
Example file name: LC08_L1TP_211048_20180705_20180717_01_T1_2018-07-05_B5.TIF
Date to extract: 20180705
Can anyone please tell me what to fill in for pattern and replacement in the above code?
CodePudding user response:
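If the underscores are always in the same place, splitting on them and taking the fourth piece is enough (here filenames is the character vector from the question):
sapply(strsplit(filenames, "_"), "[", 4)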
CodePudding user response:
Assuming you want the 4th underscore-separated field, try these. They also work if x is a vector.
1) This uses read.table. Note that it returns the field as a number rather than a character string.
x <- "LC08_L1TP_211048_20180705_20180717_01_T1_2018-07-05_B5.TIF"
read.table(text = x, sep = "_")[[4]]
## [1] 20180705
2) Or use sub with a regular expression, which also works with a vector x:
sub("^([[:alnum:]]+_){3}(\\d+)_.*", "\\2", x)
## [1] "20180705"
3) If the date always appears in character positions 18 through 25 then:
substring(x, 18, 25)
## [1] "20180705"
4) If instead we assume that the date is the first run of 8 digits following an underscore, then:
sub("^.*?_(\\d{8}).*", "\\1", x)
## [1] "20180705"
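To put this in the form the question asks for, here is a minimal sketch that fills in pattern and replacement for a whole vector of filenames. It assumes the date is always the 4th underscore-separated field and has exactly 8 digits. The first filename is the one from the question; the other two are made up for illustration. Converting to Date at the end is optional.

```r
# Example input: first name is from the question, the other two are hypothetical.
filenames <- c(
  "LC08_L1TP_211048_20180705_20180717_01_T1_2018-07-05_B5.TIF",
  "LC08_L1TP_211048_20180705_20180717_01_T1_2018-07-05_B4.TIF",
  "LC08_L1TP_211048_20180721_20180731_01_T1_2018-07-21_B5.TIF"
)

# Skip three alphanumeric fields, capture the 8-digit fourth field,
# and replace the whole filename with that capture (group 2).
dates <- unique(gsub(pattern = "^([[:alnum:]]+_){3}(\\d{8})_.*$",
                     replacement = "\\2",
                     x = filenames))
dates
## [1] "20180705" "20180721"

# Optional: turn the strings into Date objects.
as.Date(dates, format = "%Y%m%d")
## [1] "2018-07-05" "2018-07-21"
```

A filename that does not match the pattern is returned unchanged by gsub, so it is worth checking that every element of dates really is 8 digits, for example with all(grepl("^\\d{8}$", dates)).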