I'm looking to calculate the gap between each row's "start" number and the previous row's "end" number. In the example data below, the expected result is in df$gap. For the first row, df$gap[1] = df$start[1] - 1. For every later row, df$gap[n] = df$start[n] - df$end[n-1]. I can easily do this in Excel, but I am having difficulty figuring out how to do it in R without a loop.
If anyone could provide a solution, that would be much appreciated!
df <- read.table(text = "start end
172 635
766 1699
1817 1891
2015 2320", header = TRUE)
The expected result:
start end gap
172 635 171
766 1699 131
1817 1891 118
2015 2320 124
CodePudding user response:
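Using dplyr, here is a solution with lag():
library(dplyr)
# lag(end) is NA in the first row, so that row is patched to start - 1 afterwards
df %>% mutate(gap = start - lag(end)) %>%
  mutate(gap = ifelse(row_number() == 1, start - 1, gap))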
Output:
start end gap
1 172 635 171
2 766 1699 131
3 1817 1891 118
4 2015 2320 124
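A slightly shorter variant, offered as a sketch rather than the definitive way: dplyr's lag() takes a default argument that is used where no previous value exists, so the first row can be handled without the second mutate(). This assumes the same df as in the question. The name result is only for illustration.

```r
library(dplyr)

df <- read.table(text = "start end
172 635
766 1699
1817 1891
2015 2320", header = TRUE)

# default = 1 stands in for the missing "previous end" of the first row,
# so the first gap becomes start[1] - 1
result <- df %>% mutate(gap = start - lag(end, default = 1))
result
```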
CodePudding user response:
In base R:
df$gap <- df$start - c(1L, head(df$end, -1))
Gives:
df
start end gap
1 172 635 171
2 766 1699 131
3 1817 1891 118
4 2015 2320 124
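To see why this works, here is a small sketch that spells out the intermediate vector, assuming the df from the question. The names prev_end and gap are only for illustration.

```r
df <- read.table(text = "start end
172 635
766 1699
1817 1891
2015 2320", header = TRUE)

# head(df$end, -1) drops the last end value; prepending 1 shifts the
# remaining values down one row, giving each row its "previous end"
prev_end <- c(1L, head(df$end, -1))

# element-wise subtraction: start[1] - 1, then start[n] - end[n-1]
gap <- df$start - prev_end
gap
```

Because the subtraction is vectorised, no loop is needed.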
CodePudding user response:
dplyr plus a small trick could help with that:
library(dplyr)
df = read.table(text="start end
172 635
766 1699
1817 1891
2015 2320", header = TRUE)
df$temp <- c(1, df$end[-length(df$end)])
mutate(df, gap = start - temp) |> select(-temp)
Output:
start end gap
1 172 635 171
2 766 1699 131
3 1817 1891 118
4 2015 2320 124
CodePudding user response:
One possible solution with the package data.table
Please find the reprex below.
REPREX
library(data.table)
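DT <- setDT(df)  # converts df to a data.table by reference
# shift() lags by default, so end_prev holds the previous row's end (NA in row 1)
DT[, end_prev := shift(end, 1)][, `:=` (gap = start - end_prev, end_prev = NULL)]
# fill the NA gap in the first row with start[1] - 1
setnafill(DT, fill = DT$start[1] - 1)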
DT
#> start end gap
#> 1: 172 635 171
#> 2: 766 1699 131
#> 3: 1817 1891 118
#> 4: 2015 2320 124
Created on 2021-10-13 by the reprex package (v0.3.0)
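As a possible simplification (a sketch, not part of the reprex above): shift() has a fill argument, so the first row can be filled during the shift itself and no temporary column or setnafill() call is needed. The data is rebuilt inline here so the snippet stands on its own.

```r
library(data.table)

DT <- data.table(
  start = c(172L, 766L, 1817L, 2015L),
  end   = c(635L, 1699L, 1891L, 2320L)
)

# shift() lags by one row by default; fill = 1L replaces the NA that would
# otherwise appear in the first row, so the first gap is start[1] - 1
DT[, gap := start - shift(end, n = 1L, fill = 1L)]
DT
```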
CodePudding user response:
If I understand your question, one solution could be the lag function from dplyr. For instance:
# default = 1 makes the first gap start[1] - 1 instead of NA
df[, "gap"] <- df[, "start"] - dplyr::lag(df[, "end"], n = 1, default = 1)