Home > Blockchain >  Incremental Days between dates in R
Incremental Days between dates in R

Time:09-25

I have a long list that needs the days between dates

ClientID <- c("00932", "00932", "00932")
Visit <- c("2018-11-10", "2018-11-20", "2018-11-25")

ClientID    Visit
00932    2018-11-10
00932    2018-11-20
00932    2018-11-25

I need a new column that says

ClientID    Visit    Days
00932    2018-11-10    0
00932    2018-11-20    10
00932    2018-11-25    15

CodePudding user response:

Change the Visit to date class and for each ClientID subtract the Visit with the minimum Visit date.

library(dplyr)

df %>%
  mutate(Visit = as.Date(Visit, '%m-%d-%Y')) %>%
  group_by(ClientID) %>%
  mutate(Days = as.integer(Visit - min(Visit))) %>%
  ungroup

# ClientID Visit       Days
#  <chr>    <date>     <int>
#1 00932    2018-11-10     0
#2 00932    2018-11-20    10
#3 00932    2018-11-25    15

data

ClientID <- c("00932", "00932", "00932")
Visit <- c("11-10-2018", "11-20-2018", "11-25-2018")
df <- data.frame(ClientID, Visit)

CodePudding user response:

As I assume that you need the day span between dates, not between the date and latest date I propose this:

dn <- as.numeric(as.Date(Visit))

The text with dates is consistently converted to date and to number.

dn2 <- c(dn[1], dn[-length(dn)])

We prepare the second vector for subtraction as it will work fast. It will have the order of memebers like this:

df df2

1st 1st

2nd 1st

3rd 2nd

nth nth-1

Days <- dn - dn2

Find the span.

CodePudding user response:

Considering you have multiple ClientID and want to calculate the Days on that level:

library(lubridate)
library(tidyverse)

ClientID <- c("00932", "00932", "00932")
Visit <- c("11-10-2018", "11-20-2018", "11-25-2018")

df <- data.frame(ClientID, Visit)

df %>%
  group_by(ClientID) %>%
  mutate(Visit= mdy(Visit),
         Days= as.numeric(Visit-lag(Visit)))%>%
  ungroup()%>%
  mutate_if(is.numeric, ~replace_na(., 0))

# A tibble: 3 x 3
  ClientID Visit       Days
  <chr>    <date>     <dbl>
1 00932    2018-11-10     0
2 00932    2018-11-20    10
3 00932    2018-11-25     5

Adding another ClientID and two observations to demonstrate it better:

ClientID <- c("00932", "00932", "00932", "00935", "00935")
Visit <- c("11-10-2018", "11-20-2018", "11-25-2018",  "11-20-2019", "11-25-2019")

df <- data.frame(ClientID, Visit)

df %>%
  group_by(ClientID) %>%
  mutate(Visit= mdy(Visit),
         Days= as.numeric(Visit-lag(Visit)))%>%
  ungroup()%>%
  mutate_if(is.numeric, ~replace_na(., 0))

# A tibble: 5 x 3
  ClientID Visit       Days
  <chr>    <date>     <dbl>
1 00932    2018-11-10     0
2 00932    2018-11-20    10
3 00932    2018-11-25     5
4 00935    2019-11-20     0
5 00935    2019-11-25     5
  • Related