I am trying to write a generalized aggregation function where the user specifies the aggregation level or aggregates the data over all study dates. With the code below, floor_date only converts the first date. Why, and how can I fix this?
library(dplyr)
library(lubridate)
sTerm <- "year" # month, bimonth, quarter, season, halfyear and year, custom
sCustom <- "2023-2025"
dfDatasetOutput <- data.frame(
  valDate = seq(as.Date("2023-01-01"), as.Date("2025-12-01"), by = "month"),
  cat1 = rnorm(36, 3500, 1000),
  cat2 = rnorm(36, 2.5, 5)
)
dfDatasetOutput %>%
  mutate(valDate = ifelse(toupper(sTerm) == "CUSTOM",
                          sCustom,
                          as.character(floor_date(valDate, sTerm))))
# this works just fine
dfDatasetOutput %>%
  mutate(valDate = as.character(floor_date(valDate, sTerm)))
Answer:
The problem does not stem from floor_date but from your use of ifelse. As per its manual:
ifelse(test, yes, no)
ifelse returns a value with the same shape as test which is filled
with elements selected from either yes or no depending on whether
the element of test is TRUE or FALSE.
Your test is toupper(sTerm)=="CUSTOM", which is a single logical element: TRUE or FALSE (or NA). So the output of ifelse will be a single element. If the test is FALSE, ifelse takes this element from as.character(floor_date(valDate, sTerm)). It only needs one, so it takes the first one. mutate then recycles this single value to the length of the column.
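The truncation can be reproduced without dplyr or lubridate. The following is a minimal sketch in base R; the vector name dates and the result names are made up for illustration:

```r
# Hypothetical stand-in for the formatted date column
dates <- c("2023-01-01", "2023-02-01", "2023-03-01")

# A length-1 test yields a length-1 result, taken from the first element of `no`
single <- ifelse(FALSE, "custom", dates)
length(single)   # 1
single           # "2023-01-01"

# A test as long as `dates` yields one result per element
full <- ifelse(rep(FALSE, length(dates)), "custom", dates)
length(full)     # 3
```

The only difference between the two calls is the length of the test, which is what determines the length of the result.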
If you want the output to be the same length as valDate, a workaround would be to repeat your test so you get a vector of the desired length as a test:
dfDatasetOutput %>%
  mutate(valDate = ifelse(rep(toupper(sTerm) == "CUSTOM", nrow(dfDatasetOutput)),
                          sCustom,
                          as.character(floor_date(valDate, sTerm))))
To avoid such unintended use of ifelse, consider using dplyr's if_else, which checks the lengths of its arguments and raises an error on a mismatch instead of silently returning a single value.
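As a sketch of both points, the code below first shows that if_else stops with an error when given a length-1 condition and a full-length vector, and then uses a plain if/else inside mutate, which is one possible alternative when the condition is a single value that is the same for every row. The names df, err and out are made up for illustration, and the if/else form is a suggestion beyond the workaround above, not the only way to do this:

```r
library(dplyr)
library(lubridate)

sTerm <- "year"
sCustom <- "2023-2025"
df <- data.frame(
  valDate = seq(as.Date("2023-01-01"), as.Date("2025-12-01"), by = "month")
)

# if_else() rejects a length-1 condition combined with a length-36 `false`
err <- tryCatch(
  if_else(toupper(sTerm) == "CUSTOM",
          sCustom,
          as.character(floor_date(df$valDate, sTerm))),
  error = function(e) e
)
inherits(err, "error")   # TRUE

# Plain if/else: the scalar condition picks one whole branch,
# so the floor_date branch keeps one value per row
out <- df %>%
  mutate(valDate = if (toupper(sTerm) == "CUSTOM") {
    sCustom
  } else {
    as.character(floor_date(valDate, sTerm))
  })
unique(out$valDate)
```

When sTerm is "custom", the if branch returns the single string sCustom, and mutate recycles it to every row, which is the intended result in that case.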