I have a datatable for a time period of 21 days with data measured every 10 seconds which looks like
TimeStamp ActivePower CurrentL1 GeneratorRPM RotorRPM WindSpeed
2017-03-05 00:00:10 2183.650 1201.0 1673.90 NA 10.60
2017-03-05 00:00:20 2216.200 1224.0 1679.70 NA 11.00
2017-03-05 00:00:30 2176.500 1203.5 NA 16.05 11.90
---
2017-03-25 23:59:40 2024.20 1150.0 1687.00 16.15 10.35
2017-03-25 23:59:50 1959.05 1106.0 1661.15 15.90 8.65
2017-03-26 00:00:00 1820.55 1038.0 1665.70 15.80 9.20
I want to divide it into 30 minute blocks and my colleague said I shouldn't use the split function since the data can also have timestamps where there is no data and that I should manually make a 30 minute interval duration. I have done this so far:
library(data.table)
library(dplyr)
library(tidyr)
datei <- file.choose()
data_csv <- fread(datei)
datatable1 <- as.data.table(data_csv)
datatable1 <- datatable1[turbine=="UTHA02",]
datatable1[, TimeStamp:=as.POSIXct(get("_time"), tz="UTC")]
setkey(datatable1, TimeStamp)
startdate <- datatable1[1,TimeStamp]
enddate <- datatable1[nrow(datatable1), TimeStamp]
durationForInterval <- 30*60 #in seconds
curr <- startdate
datatable1[TimeStamp >= curr & TimeStamp < curr durationForInterval]
So I manually made a 30 minute interval duration and got the first interval
time ActivePower CurrentL1 GeneratorRPM RotorRPM WindSpeed
1: 2017-03-05 00:00:10 2183.65 1201.0 1673.90 NA 10.60
2: 2017-03-05 00:00:20 2216.20 1224.0 1679.70 NA 11.00
3: 2017-03-05 00:00:30 2176.50 1203.5 NA 16.05 11.90
4: 2017-03-05 00:00:40 2267.95 1256.5 1685.85 NA 10.60
5: 2017-03-05 00:00:50 2533.15 1408.0 1693.30 16.20 12.40
---
176: 2017-03-05 00:29:20 2750.35 1531.0 1694.40 16.20 11.45
177: 2017-03-05 00:29:30 2930.40 1630.5 1668.25 NA 12.65
178: 2017-03-05 00:29:40 2459.55 1367.0 1680.25 15.90 12.15
179: 2017-03-05 00:29:50 2713.80 1508.5 1681.15 16.20 12.25
180: 2017-03-05 00:30:00 2395.20 1333.0 1667.75 16.00 11.75
But I only could do it for the first interval and I dont know how to do it for the rest. Is there something that I am missing or am I overthinking? Any help is appreciated!
CodePudding user response:
This will create a column interval
with a unique value for every 30 minutes.
datatable1[, interval := as.integer(TimeStamp, units = "secs") %/% (60L*30L)]
You could split
on that column or use it for grouping operations.
split(datatable1, datatable1$interval) # or split(datatable1, by = "interval")