I have a data.table that looks like this:
tsdata <- data.table(time = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
signal = c(0, 1, 1, 0, 0, 1, 0, 0, 0, 1))
I want to fill the gaps of zeros between the ones with ones, but only where the gap is short. The maximum gap length should be configurable. In this example, a gap is filled only if it contains at most 2 zeros.
The result should look like this:
tsdata <- data.table(time = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
signal = c(0, 1, 1, 1, 1, 1, 0, 0, 0, 1))
My real time series data is much bigger than this, so any help is appreciated.
CodePudding user response:
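Group the rows by rleid(signal), which assigns one id to each run of consecutive equal values. Then set a run to 1 if it consists of zeros, has at most 2 rows, and is neither the first nor the last run of the series.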
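library(data.table)
max_gap <- 2  # longest run of zeros that should still be filled
# The condition is a single TRUE/FALSE per run, so use if/else with &&.
tsdata[, signal2 := if (signal[1] == 0 && .N <= max_gap &&
                        time[1] > min(tsdata$time) &&   # not the leading run
                        time[.N] < max(tsdata$time))    # not the trailing run
                      1 else signal,
       by = rleid(signal)]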
tsdata
giving:
    time signal signal2
 1:    1      0       0
 2:    2      1       1
 3:    3      1       1
 4:    4      0       1
 5:    5      0       1
 6:    6      1       1
 7:    7      0       0
 8:    8      0       0
 9:    9      0       0
10:   10      1       1
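Since the real series is much bigger, the same rule can also be written as one vectorized pass over the run-length encoding instead of a grouped evaluation. The following is only a sketch, not part of the answer above: fill_short_gaps is a hypothetical helper name, and it assumes signal contains only 0 and 1 with no NA values and that the rows are already ordered by time.

```r
library(data.table)

# Hypothetical helper (an assumption, not from the answer above):
# set a run of zeros to 1 when it has at most max_gap elements and
# is neither the first nor the last run of x.
fill_short_gaps <- function(x, max_gap = 2) {
  r <- rle(x)                                    # run lengths and run values
  n <- length(r$values)
  interior <- seq_len(n) > 1 & seq_len(n) < n    # runs with a neighbor on both sides
  fill <- r$values == 0 & r$lengths <= max_gap & interior
  r$values[fill] <- 1                            # overwrite the short zero runs
  inverse.rle(r)                                 # expand back to full length
}

tsdata <- data.table(time = 1:10,
                     signal = c(0, 1, 1, 0, 0, 1, 0, 0, 0, 1))
tsdata[, signal2 := fill_short_gaps(signal, max_gap = 2)]
tsdata
```

On the example data this should give the same signal2 column as the grouped version. Changing max_gap changes which gaps are filled; for instance, max_gap = 3 would also fill rows 7 to 9.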