I have the following dataframe:
time <- c(1,2,3,4,5,6,7,8,9,10,11,12)
threshold <- c(1,1,1,0,0,2,2,1,1,2,2,0)
value <- c(5,3,2,4,6,9,1,10,3,5,2,4)
df <- data.frame(time, threshold, value)
The logic behind the threshold variable is that when there is a 1: store the first value (5) and subtract the number when the threshold changes (4). During the first threshold period of "1", the calculation would be 5 - 4 = 1, which should be stored in a new column at time t=4.
Threshold "0" means no calculation.
Threshold "2" means just the reverse of "1", where the the initial value 9 is subtracted by 10.
The goal would be a table like this:
Is there a way to calculate perform this calculation in R?
CodePudding user response:
If you use the rle
function, you can focus on rows where there is a threshold change. Then, align the values to be subtracted and translate the threshold to a factor controlling the sens of the subtraction.
#Select only rows with a threshold change,
# based on https://stackoverflow.com/a/27482738/14027775
compact_threshold <- rle(df$threshold)
row_ids <- cumsum(c(1, compact_threshold$lengths[-length(compact_threshold$lengths)]))
#Transform thresholds to a factor
# if the row is in a threshold change
df$subFactor <- df$threshold
df$subFactor[-row_ids] <- NA
df$subFactor[df$subFactor==0] <- NA
df$subFactor[df$subFactor==2] <- -1
#Align each value with the one
# corresponding to the next threshold change
df$value2 <- NA
df$value2[row_ids] <- c(df$value[row_ids][-1], NA)
df$calculation <- (df$value-df$value2)*df$subFactor
#Shift results to the next threshold
df$calculation[row_ids] <- c(NA, df$calculation[row_ids][-length(row_ids)])
df <- df[,c("time", "threshold", "value", "calculation")]
df
# time threshold value calculation
#1 1 1 5 NA
#2 2 1 3 NA
#3 3 1 2 NA
#4 4 0 4 1
#5 5 0 6 NA
#6 6 2 9 NA
#7 7 2 1 NA
#8 8 1 10 1
#9 9 1 3 NA
#10 10 2 5 5
#11 11 2 2 NA
#12 12 0 1 -4
Raw data
#Data
time <- c(1,2,3,4,5,6,7,8,9,10,11,12)
threshold <- c(1,1,1,0,0,2,2,1,1,2,2,0)
value <- c(5,3,2,4,6,9,1,10,3,5,2,1)
df <- data.frame(time, threshold, value)
df
# time threshold value
#1 1 1 5
#2 2 1 3
#3 3 1 2
#4 4 0 4
#5 5 0 6
#6 6 2 9
#7 7 2 1
#8 8 1 10
#9 9 1 3
#10 10 2 5
#11 11 2 2
#12 12 0 1