Home > database >  calculate difference based on id and date
calculate difference based on id and date

Time:09-30

I have a data set that have id date and time, now I want to calculate the difference between each available date based on id. I have try to look for similar problem in stack overflow but so far no luck. I have try a few different syntax but still no luck at the moment. any help would be great.

data set:

> dput(mydata)
structure(list(id = c("a", "a", "b", "b", "b", "c"), date = c("2018-04-13", 
"2011-11-12", "2019-05-30", "2014-09-13", "2019-06-21", "1998-01-08"
), time = c("50", "40", "30", "20", "10", "30")), class = "data.frame", row.names = c(NA, 
-6L))

Desire output:

id     date         time    time_diff
a   2018-04-13       50       10
a   2011-11-12       40       NA/0
b   2019-05-30       30        10
b   2014-09-13       20       NA/0
b   2019-06-21       10       -20
c   1998-01-08       30       NA/0

I understand the earliest date won't have anything to calculate the difference so it can be either NA or 0 in this case.

Here is the code that I have try but getting error:

mydata <- mydata %>%
group_by(id,date) %>%
mutate(time_diff = diff(time))

CodePudding user response:

library(dplyr)

df <- structure(list(
    id = c("a", "a", "b", "b", "b", "c"),
    date = ("2018-04-13", "2011-11-12", "2019-05-30", "2014-09-13", "2019-06-21", "1998-01-08"),
    time = c("50", "40", "30", "20", "10", "30")),
  class = "data.frame", row.names = c(NA, -6L))

df %>%
  group_by(id) %>%
  arrange(id, date) %>%
  mutate(
    time = as.numeric(time),
    time_diff = time - lag(time)
  )

CodePudding user response:

For each id you may subtract the time corresponding to minimum date.

library(dplyr)

mydata %>%
  mutate(time = as.numeric(time), 
         date = as.Date(date)) %>%
  group_by(id) %>%
  mutate(time_diff = time - time[which.min(date)]) %>%
  ungroup

#  id    date        time time_diff
#  <chr> <date>     <dbl>     <dbl>
#1 a     2018-04-13    50        10
#2 a     2011-11-12    40         0
#3 b     2019-05-30    30        10
#4 b     2014-09-13    20         0
#5 b     2019-06-21    10       -10
#6 c     1998-01-08    30         0
  •  Tags:  
  • r
  • Related