Edit: Unfortunately, I simplified my needs and data too much. I will update the question below.
I have a df similar to the code below. I need to create a new column called first_funding_date
that is equal to the value of fund.date
, where sigma==0
, until the next time sigma==0
. In the example df below, first_fund_date
should be a vector with the first observation equal to "2019/05/22", the following 3 observations equal to "2020/09/05", and the final 4 equal to "2019/11/30".
set.seed(111)
df <- data.frame(id = c(1,1,3,4,5,6,2,7),
fund.date = sample(seq(as.Date('2018/01/01'),
as.Date('2021/01/01'), by="day"), 8),
sigma = c(0,0,1,2,0,1,2,3))
%>% mutate(first_fund_date = ??? )
I also need to create a column called last_funding_date
that is equal to fund.date
, for the rolling max of sigma. The first 4 observations should be "2020/03/03" and the last 4 should be "2020/12/04".
CodePudding user response:
library(dplyr)
df %>%
mutate(first_fund_date = fund.date[sigma==0],
last_funding_date = fund.date[sigma==max(sigma)])
id fund.date sigma first_fund_date last_funding_date
1 1 2019-05-22 1 2020-09-05 2018-03-10
2 2 2020-09-05 0 2020-09-05 2018-03-10
3 3 2018-06-24 1 2020-09-05 2018-03-10
4 4 2020-03-03 2 2020-09-05 2018-03-10
5 5 2019-11-30 3 2020-09-05 2018-03-10
6 6 2018-03-10 4 2020-09-05 2018-03-10
CodePudding user response:
The key here is to to create index variables to group_by
with cumsum(sigma==0)
and cumsum(sigma < lag(sigma))
.
library(dplyr)
df %>%
group_by(index = cumsum(sigma==0))%>%
mutate(first_fund.date = first(fund.date))%>%
group_by(index_2 = cumsum(sigma < lag(sigma, default = Inf)))%>%
mutate(last_fund.date = last(fund.date))%>%
ungroup()%>%
select(-contains('index'))
# A tibble: 8 × 5
id fund.date sigma first_fund.date last_fund.date
<dbl> <date> <dbl> <date> <date>
1 1 2019-05-22 0 2019-05-22 2020-03-03
2 1 2020-09-05 0 2020-09-05 2020-03-03
3 3 2018-06-24 1 2020-09-05 2020-03-03
4 4 2020-03-03 2 2020-09-05 2020-03-03
5 5 2019-11-30 0 2019-11-30 2020-12-04
6 6 2018-03-10 1 2019-11-30 2020-12-04
7 2 2018-11-01 2 2019-11-30 2020-12-04
8 7 2020-12-04 3 2019-11-30 2020-12-04