Home > OS >  Dividing table without using split in R
Dividing table without using split in R

Time:07-19

I have a datatable for a time period of 21 days with data measured every 10 seconds which looks like

TimeStamp            ActivePower  CurrentL1  GeneratorRPM  RotorRPM  WindSpeed
2017-03-05 00:00:10  2183.650     1201.0     1673.90       NA        10.60
2017-03-05 00:00:20  2216.200     1224.0     1679.70       NA        11.00
2017-03-05 00:00:30  2176.500     1203.5     NA            16.05     11.90
---
2017-03-25 23:59:40  2024.20      1150.0     1687.00       16.15     10.35
2017-03-25 23:59:50  1959.05      1106.0     1661.15       15.90     8.65
2017-03-26 00:00:00  1820.55      1038.0     1665.70       15.80     9.20

I want to divide it into 30 minute blocks and my colleague said I shouldn't use the split function since the data can also have timestamps where there is no data and that I should manually make a 30 minute interval duration. I have done this so far:

library(data.table)
library(dplyr)
library(tidyr)

datei <- file.choose() 
data_csv <- fread(datei)

datatable1 <- as.data.table(data_csv)
datatable1 <- datatable1[turbine=="UTHA02",]

datatable1[, TimeStamp:=as.POSIXct(get("_time"), tz="UTC")]
setkey(datatable1, TimeStamp)
startdate <- datatable1[1,TimeStamp]
enddate <- datatable1[nrow(datatable1), TimeStamp]
durationForInterval <- 30*60 #in seconds
curr <- startdate
datatable1[TimeStamp >= curr & TimeStamp < curr   durationForInterval]

So I manually made a 30 minute interval duration and got the first interval

      time                    ActivePower CurrentL1  GeneratorRPM   RotorRPM  WindSpeed           
  1:  2017-03-05 00:00:10     2183.65     1201.0     1673.90        NA        10.60 
  2:  2017-03-05 00:00:20     2216.20     1224.0     1679.70        NA        11.00 
  3:  2017-03-05 00:00:30     2176.50     1203.5          NA        16.05     11.90 
  4:  2017-03-05 00:00:40     2267.95     1256.5     1685.85        NA        10.60 
  5:  2017-03-05 00:00:50     2533.15     1408.0     1693.30        16.20     12.40 
 ---                                                                                                      
176:  2017-03-05 00:29:20     2750.35     1531.0     1694.40        16.20     11.45 
177:  2017-03-05 00:29:30     2930.40     1630.5     1668.25        NA        12.65 
178:  2017-03-05 00:29:40     2459.55     1367.0     1680.25        15.90     12.15 
179:  2017-03-05 00:29:50     2713.80     1508.5     1681.15        16.20     12.25 
180:  2017-03-05 00:30:00     2395.20     1333.0     1667.75        16.00     11.75 

But I only could do it for the first interval and I dont know how to do it for the rest. Is there something that I am missing or am I overthinking? Any help is appreciated!

CodePudding user response:

This will create a column interval with a unique value for every 30 minutes.

datatable1[, interval := as.integer(TimeStamp, units = "secs") %/% (60L*30L)]

You could split on that column or use it for grouping operations.

split(datatable1, datatable1$interval) # or split(datatable1, by = "interval")
  •  Tags:  
  • r
  • Related