I have a grouped dataframe with multiple IDs with a date and value column.
id <- c("a", "a", "a", "b", "b", "b", "c")
date <- c("2020-01-01", "2020-01-02", "2020-01-03",
"2020-01-01", "2020-01-02", "2020-01-03",
"2020-01-01")
value <- rnorm(n = length(id))
df <- cbind.data.frame(id, date, value)
However, some IDs have less than 3 dates. I want to "stretch" those IDs and add an NA for the value column for the new dates. In this dataframe, the "c" ID would have two new dates added ("2020-01-02" and "2020-01-03").
CodePudding user response:
Perhaps this approach would suit?
library(tidyverse)
id <- c("a", "a", "a", "b", "b", "b", "c")
date <- c("2020-01-01", "2020-01-02", "2020-01-03",
"2020-01-01", "2020-01-02", "2020-01-03",
"2020-01-01")
value <- rnorm(n = length(id))
df <- cbind.data.frame(id, date, value)
df %>%
right_join(df %>% expand(id, date))
#> Joining, by = c("id", "date")
#> id date value
#> 1 a 2020-01-01 -1.5371474
#> 2 a 2020-01-02 0.9001098
#> 3 a 2020-01-03 0.1523491
#> 4 b 2020-01-01 0.8194577
#> 5 b 2020-01-02 1.2005270
#> 6 b 2020-01-03 0.1158812
#> 7 c 2020-01-01 -0.8676445
#> 8 c 2020-01-02 NA
#> 9 c 2020-01-03 NA
Created on 2022-09-05 by the reprex package (v2.0.1)
CodePudding user response:
You could use complete()
from tidyr
.
library(tidyr)
df %>%
complete(id, date)
# # A tibble: 9 × 3
# id date value
# <chr> <chr> <dbl>
# 1 a 2020-01-01 1.12
# 2 a 2020-01-02 1.58
# 3 a 2020-01-03 1.26
# 4 b 2020-01-01 -2.30
# 5 b 2020-01-02 -1.45
# 6 b 2020-01-03 -0.212
# 7 c 2020-01-01 0.344
# 8 c 2020-01-02 NA
# 9 c 2020-01-03 NA
CodePudding user response:
In base R, by
id
you may merge
with a data frame created out of seq
uences of the date range. First of all you want to use proper date format by doing df$date <- as.Date(df$date)
.
by(df, df$id, \(x)
merge(x,
data.frame(id=el(x$id),
date=do.call(seq.Date, c(as.list(range(df$date)), 'day'))),
all=TRUE)) |>
do.call(what=rbind)
# id date value
# a.1 a 2020-01-01 1.3709584
# a.2 a 2020-01-02 -0.5646982
# a.3 a 2020-01-03 0.3631284
# b.1 b 2020-01-01 0.6328626
# b.2 b 2020-01-02 0.4042683
# b.3 b 2020-01-03 -0.1061245
# c.1 c 2020-01-01 1.5115220
# c.2 c 2020-01-02 NA
# c.3 c 2020-01-03 NA