Is there a way to calculate max, mean, and average by the day with a date column in r
Example input
Date Value
1999/01/01 1.2
1999/01/02 2
2000/01/01 1
2000/01/02 3
2001/01/01 2
2001/01/02 3
Example Output
Day Min Max Mean
01/01 1 2 1.4
01/02 2 3 2.67
CodePudding user response:
We could try
library(dplyr)
df1 %>%
group_by(Day = format(as.Date(Date, "%Y/%m/%d"), "%m/%d")) %>%
summarise(Min = min(Value), Max = max(Value), Mean = mean(Value))
-output
# A tibble: 2 × 4
Day Min Max Mean
<chr> <dbl> <dbl> <dbl>
1 01/01 1 2 1.4
2 01/02 2 3 2.67
If there are multiple columns, use across
df1 %>%
group_by(Day = format(as.Date(Date, "%Y/%m/%d"), "%m/%d")) %>%
summarise(across(Value, list(Min = min, Max = max, Mean = mean)))
data
df1 <- structure(list(Date = c("1999/01/01", "1999/01/02", "2000/01/01",
"2000/01/02", "2001/01/01", "2001/01/02"), Value = c(1.2, 2,
1, 3, 2, 3)), class = "data.frame", row.names = c(NA, -6L))
CodePudding user response:
With summary
tapply
:
with(df, tapply(Value, substr(Date, 6, 10), summary))
output
$`01/01`
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.0 1.1 1.2 1.4 1.6 2.0
$`01/02`
Min. 1st Qu. Median Mean 3rd Qu. Max.
2.000 2.500 3.000 2.667 3.000 3.000
or with more flexibility, you can build your own function:
with(df1, tapply(Value, substr(Date, 6, 10),
\(x) c(min = min(x), mean = mean(x), max = max(x))))
$`01/01`
min mean max
1.0 1.4 2.0
$`01/02`
min mean max
2.000000 2.666667 3.000000