I need to calculate the number of days that each person in a dataset spends within a fiscal quarter. Here's a dataframe with 4 hypothetical people:
library(lubridate)  # ymd(), month()
library(dplyr)      # mutate(), summarize(), %>%
id <- c('1', '2', '3', '4')
end_date <- c("2009-05-24", "2002-02-04", "2015-09-23", "2011-12-04")
start_date <- c("2004-07-24", "1992-07-04", "2011-03-23", "2001-07-04")
df <- data.frame(id, start = ymd(start_date), end = ymd(end_date))
I can easily calculate their total follow-up per person and overall:
> df %>% mutate(fu_time = end - start)
id start end fu_time
1 1 2004-07-24 2009-05-24 1765 days
2 2 1992-07-04 2002-02-04 3502 days
3 3 2011-03-23 2015-09-23 1645 days
4 4 2001-07-04 2011-12-04 3805 days
> df %>% mutate(fu_time = end - start) %>% summarize(total = sum(fu_time))
total
1 10717 days
UPDATE: GETTING CLOSER - I think I'm on to something, having weird errors though
I wrote the following function, which could calculate how many days within Q1 a patient spent:
q1fun <- function(x,y) {
sum(month(seq(x, y, by = "days")) %in% 1:3)
}
Basically, it expands the date range into a daily sequence, counts the days whose month falls in January through March, then returns that count. So for instance:
> q1fun(ymd("2004-07-24"), ymd("2009-05-24"))
[1] 451
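As a sanity check on that figure, here is a minimal sketch in base R only (no lubridate) that recounts the first-quarter days and compares them with a hand count. The variable names `days`, `q1_days`, and `hand_count` are made up for illustration.

```r
# Expand the same date range into one element per day (both endpoints included)
days <- seq(as.Date("2004-07-24"), as.Date("2009-05-24"), by = "day")

# Count the days whose month number is 1, 2, or 3
q1_days <- sum(as.integer(format(days, "%m")) %in% 1:3)

# Hand count: the range covers Q1 of 2005, 2006, 2007, and 2009 (90 days each)
# plus Q1 of 2008, a leap year (91 days)
hand_count <- 4 * 90 + 91

stopifnot(q1_days == hand_count)
q1_days
```

Both counts agree, so the function itself behaves as intended when given a single start and end date.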
The problem is that it won't work in mutate! I'm sure I'm doing something wrong; if someone could help with this last step I'll have it!
df %>%
mutate(q1 = q1fun(start, end))
Error: Problem with `mutate()` input `q1`.
x 'from' must be of length 1
i Input `q1` is `q1fun(start, end)`.
Run `rlang::last_error()` to see where the error occurred.
CodePudding user response:
Simply group the data by the calendar quarter of each person's end date:
df2 <- df %>% mutate(fu_time = end - start, quarter = lubridate::quarter(end)) %>%
  group_by(quarter) %>% summarise(fu_time = sum(fu_time))
# Bar heights are the summed follow-up days; bars are labelled by quarter
barplot(as.numeric(df2$fu_time), names.arg = df2$quarter)
CodePudding user response:
OK, so I figured it out: I needed to operate rowwise because my function isn't vectorized.
Here are the final functions and what they look like when run. Hope this helps someone else out!
> # Calculate quarter 1/2/3/4 times
> q1fun <- function(x,y) {
sum(month(seq(x, y, by = "days")) %in% 1:3)
}
> q2fun <- function(x,y) {
sum(month(seq(x, y, by = "days")) %in% 4:6)
}
> q3fun <- function(x,y) {
sum(month(seq(x, y, by = "days")) %in% 7:9)
}
> q4fun <- function(x,y) {
sum(month(seq(x, y, by = "days")) %in% 10:12)
}
>
> df %>%
rowwise() %>%
mutate(q1 = q1fun(start, end),
q2 = q2fun(start, end),
q3 = q3fun(start, end),
q4 = q4fun(start,end))
# A tibble: 4 x 7
# Rowwise:
id start end q1 q2 q3 q4
<chr> <date> <date> <int> <int> <int> <int>
1 1 2004-07-24 2009-05-24 451 418 437 460
2 2 1992-07-04 2002-02-04 847 819 917 920
3 3 2011-03-23 2015-09-23 370 455 453 368
4 4 2001-07-04 2011-12-04 902 910 1009 985
The end result is a dataframe with how many days each patient spent within each quarter.
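The four near-identical functions could also be folded into one helper that takes the quarter as an argument and loops over the rows itself, so no rowwise step is needed. The following is a minimal sketch in base R under that assumption; `days_in_quarter` is a hypothetical name, not a function from lubridate or dplyr, and it uses base R's `quarters()` in place of `month()`.

```r
# Hypothetical helper: for each (start, end) pair, count the days (endpoints
# included) that fall in calendar quarter q (1 to 4)
days_in_quarter <- function(start, end, q) {
  vapply(seq_along(start), function(i) {
    days <- seq(start[i], end[i], by = "day")  # seq() needs length-1 inputs
    sum(quarters(days) == paste0("Q", q))      # quarters() returns "Q1".."Q4"
  }, integer(1))
}

# Same four hypothetical people as in the question
df <- data.frame(
  id    = c("1", "2", "3", "4"),
  start = as.Date(c("2004-07-24", "1992-07-04", "2011-03-23", "2001-07-04")),
  end   = as.Date(c("2009-05-24", "2002-02-04", "2015-09-23", "2011-12-04"))
)

# Add one column per quarter: q1, q2, q3, q4
for (q in 1:4) {
  df[[paste0("q", q)]] <- days_in_quarter(df$start, df$end, q)
}
df
```

Because the helper returns one value per row, it should also work directly inside `mutate()`, e.g. `mutate(q1 = days_in_quarter(start, end, 1))`. Note that the four quarter counts for a person sum to one more than `end - start`, since the daily sequence includes both the start and the end date.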