Here is the sample data , i tried to expand the date from
to
with complete
and seq
, but i am unable to get the expected result. could you please help me with how to use the complete with seq
to expand the dates
data
df <- tibble(id = c('x1', 'x2'),
dates_from = c(as.Date('2022-01-31'), as.Date('2022-10-31')),
dates_to = c(as.Date('2022-08-31'), as.Date('2022-12-31')),
value = c(0,1))
# A tibble: 2 × 4
id dates_from dates_to value
<chr> <date> <date> <dbl>
1 x1 2022-01-31 2022-08-31 0
2 x2 2022-10-31 2022-12-31 1
code
that I tried but isn't workingdf2 <- df %>% complete(id, date=seq(from= .$dates_from, to= .$dates_to,by='1 month'))
expected output
# A tibble: 11 × 3
id dates value
<chr> <date>
1 x1 2022-01-31 0
2 x1 2022-02-28 0
3 x1 2022-03-31 0
4 x1 2022-04-30 0
5 x1 2022-05-31 0
6 x1 2022-06-30 0
7 x1 2022-07-31 0
8 x1 2022-08-31 0
9 x2 2022-10-31 1
10 x2 2022-11-30 1
11 x2 2022-12-31 1
CodePudding user response:
You may use seq.Date
in an lapply
over the rows. I made your data slightly smaller.
lapply(seq_len(nrow(df)), \(i)
with(df[i, ], data.frame(id=paste0('x', i),
date=seq.Date(dates_from, dates_to, 'month'),
value=value))) |> do.call(what=rbind)
# id date value
# 1 x1 2022-01-31 0
# 2 x1 2022-03-03 0
# 3 x1 2022-03-31 0
# 4 x2 2022-10-31 1
# 5 x2 2022-12-01 1
# 6 x2 2022-12-31 1
Data:
df <- structure(list(id = c("x1", "x2"), dates_from = structure(c(19023,
19296), class = "Date"), dates_to = structure(c(19082, 19357), class = "Date"),
value = c(0, 1)), class = "data.frame", row.names = c(NA,
-2L))
CodePudding user response:
Here is a tidyverse
version:
library(dplyr)
library(tidyr)
library(lubridate)
df %>%
group_by(id) %>%
pivot_longer(col = starts_with("dates"),
names_to = "name",
values_to = "Date") %>%
tidyr::complete(Date = full_seq(Date, period = 1)) %>%
group_by(month(Date)) %>%
slice(n()) %>%
ungroup() %>%
fill(value, .direction = "down") %>%
select(1:3)
id Date value
<chr> <date> <dbl>
1 x1 2022-01-31 0
2 x1 2022-02-28 0
3 x1 2022-03-31 0
4 x1 2022-04-30 0
5 x1 2022-05-31 0
6 x1 2022-06-30 0
7 x1 2022-07-31 0
8 x1 2022-08-31 0
9 x2 2022-10-31 1
10 x2 2022-11-30 1
11 x2 2022-12-31 1
CodePudding user response:
We assume that when the question says by month it means year and month. To ensure that end of month is achieved add one so that we are the start of the next month and then use seq.Date and subtract one.
library(dplyr)
df %>%
group_by(id) %>%
summarize(date = seq(dates_from 1, dates_to 1, "month") - 1,
value = value,
.groups = "drop") %>%
ungroup
## # A tibble: 11 x 3
## id date value
## <chr> <date> <dbl>
## 1 x1 2022-01-31 0
## 2 x1 2022-02-28 0
## 3 x1 2022-03-31 0
## 4 x1 2022-04-30 0
## 5 x1 2022-05-31 0
## 6 x1 2022-06-30 0
## 7 x1 2022-07-31 0
## 8 x1 2022-08-31 0
## 9 x2 2022-10-31 1
## 10 x2 2022-11-30 1
## 11 x2 2022-12-31 1
Another possibility is to use yearmon class rather than representing months byy their last date as yearmon directly represents a year and month.
library(dplyr)
library(zoo)
df %>%
group_by(id) %>%
summarize(yearmon = seq(as.yearmon(dates_from), as.yearmon(dates_to), 1/12),
value = value,
.groups = "drop") %>%
ungroup
## # A tibble: 11 x 3
## id yearmon value
## <chr> <yearmon> <dbl>
## 1 x1 Jan 2022 0
## 2 x1 Feb 2022 0
## 3 x1 Mar 2022 0
## 4 x1 Apr 2022 0
## 5 x1 May 2022 0
## 6 x1 Jun 2022 0
## 7 x1 Jul 2022 0
## 8 x1 Aug 2022 0
## 9 x2 Oct 2022 1
## 10 x2 Nov 2022 1
## 11 x2 Dec 2022 1