I need to create two new columns which rely on each other for their values. It's about the start and end times of rows. So the first start value is always 0, and its end value would be 0 duration_secs. Then the next rows start would be the last rows end. And that rows end would be its start duration_secs.
The only data that I have for this is the column Duration_secs.
data$Duration_secs <- c(26, 300, 300, 300, 300, 300, 300, 300, 247, 300, 300, 300, 300)
This is an example, as the real data would contain about 300 rows.
I've tried:
rowShift <- function(x, shiftLen = 1L) {
r <- (1L shiftLen):(length(x) shiftLen)
r[r<1] <- NA
return(x[r])
}
data$Start <- 0
data$End <- (data$Start data$Duration_Secs)
data$Start <- rowShift(data$End, -1)
data$End <- (data$Start data$Duration_Secs)
But it only gives me the output for the first two rows. I need it to look like this:
Start | End | Duration_secs |
---|---|---|
0 | 26 | 26 |
26 | 326 | 300 |
326 | 626 | 300 |
626 | 926 | 300 |
926 | 1226 | 300 |
etcetera.
Currently I've been doing it manually in Excel, by just creating the function for each cell like displayed here below
Start | End | Duration_secs |
---|---|---|
0 | A1 A3 |
26 |
A2 |
B1 B3 |
300 |
B2 |
C1 C3 |
300 |
C2 |
D1 D3 |
300 |
D2 |
E1 E3 |
300 |
But I need to create a script to do it for me, as in the future I'll have to do it hundreds of times. I've also puzzled a bit with dplyr, but I wasn't advanced enough myself to figure out the code.
How can I create a script in R that will use the previous row for each new value in both columns?
CodePudding user response:
If you only have "Duration_secs" you could do
library(dplyr)
data.frame <- data.frame(Duration_secs=c(26,300,300,300))
new <- data.frame %>%
mutate(Start = c(0,cumsum(Duration_secs)[-n()]),
End=cumsum(Duration_secs))
Output:
Duration_secs Start End
1 26 0 26
2 300 26 326
3 300 326 626
4 300 626 926
If you want to keep the order given in your desired output you could either do
new <- data.frame %>%
mutate(Start = c(0,cumsum(Duration_secs)[-n()]),
End=cumsum(Duration_secs)) %>%
relocate(Duration_secs,.after=last_col())
or
new <- data.frame %>%
mutate(Start = c(0,cumsum(Duration_secs)[-n()]),
End=cumsum(Duration_secs)) %>%
select(Start,End,Duration_secs)