Extract dates from R script

Time:06-28

I have a long R script from which I need to extract many dates. I'm treating the script as a text file and trying to find the dates with regex lookarounds.

Here is an example of a chunk of code containing dates:

Chamber1_3 <- subset(Chamber1Exp, Chamber1Exp$RealTime>=as.POSIXct("2019-05-21 06:01:45") & 
Chamber1Exp$RealTime<= as.POSIXct("2019-05-21 06:23:58"))

plot(Chamber1_3$RealTime[Chamber1_3$Status=="PRE"],Chamber1_3$N2O_ppm[Chamber1_3$Status=="PRE"],
xlim=as.POSIXct(c("2019-05-21 06:01:45", "2019-05-21 06:23:58")), 
ylim=c(.34,.35))

I want to recover "2019-05-21 06:01:45" and "2019-05-21 06:23:58". Each appears twice in the code, but I only want each one once.

I'm testing regex snippets in the RegExplain add-in for RStudio. I was trying to use a lookaround to capture the dates that follow the common text 'as.POSIXct('.

I thought this should work, but I get no matches.

(?<=Xct\(\")(?=\")

Suggestions?

CodePudding user response:

We can match every string of the form four digits, hyphen, two digits, hyphen, two digits, one or more whitespace characters, then two digits, colon, two digits, colon, two digits. Then we unlist the result and keep only the unique values.

your_vec <- c('Chamber1_3 <- subset(Chamber1Exp, Chamber1Exp$RealTime>=as.POSIXct("2019-05-21 06:01:45") & 
Chamber1Exp$RealTime<= as.POSIXct("2019-05-21 06:23:58"))

plot(Chamber1_3$RealTime[Chamber1_3$Status=="PRE"],Chamber1_3$N2O_ppm[Chamber1_3$Status=="PRE"],
xlim=as.POSIXct(c("2019-05-21 06:01:45", "2019-05-21 06:23:58")), 
ylim=c(.34,.35))')

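library(stringr)  # provides str_extract_all()

# \\s+ matches the one or more whitespace characters between date and time
unique(unlist(str_extract_all(your_vec, '[0-9]{4}-[0-9]{2}-[0-9]{2}\\s+[0-9]{2}:[0-9]{2}:[0-9]{2}')))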

which returns:
[1] "2019-05-21 06:01:45" "2019-05-21 06:23:58"
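
As for the lookaround attempt in the question: (?<=Xct\(\")(?=\") has nothing between the look-behind and the look-ahead, so it can only match an empty position where Xct(" is immediately followed by another ", and that never occurs in the script. Below is a minimal sketch of one way to repair it, using base R's regmatches() and gregexpr() with perl = TRUE. The name script_text and the file name my_script.R are placeholders for illustration, not something from the question. Note that this version only catches dates that directly follow as.POSIXct(", so dates wrapped in c(...) are skipped; in this example those are duplicates anyway.

```r
# Stand-in for the script text. In practice it could be read with
# readLines("my_script.R"), where the file name is hypothetical.
script_text <- c(
  'Chamber1_3 <- subset(Chamber1Exp, Chamber1Exp$RealTime>=as.POSIXct("2019-05-21 06:01:45") &',
  'Chamber1Exp$RealTime<= as.POSIXct("2019-05-21 06:23:58"))',
  'xlim=as.POSIXct(c("2019-05-21 06:01:45", "2019-05-21 06:23:58")),'
)

# Look-behind for Xct(" , then consume everything up to (not including)
# the closing double quote. The [^"]+ part is what the original lacked.
pattern <- '(?<=Xct\\(")[^"]+(?=")'

# perl = TRUE is needed because lookarounds are a PCRE feature.
hits <- regmatches(script_text, gregexpr(pattern, script_text, perl = TRUE))

# Flatten the per-line results and drop duplicates.
dates <- unique(unlist(hits))
print(dates)
```

If the script also passes dates through c(...) that do not appear elsewhere, the date-shaped pattern from the answer above is the more robust choice, since it does not depend on what precedes the date.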