Home > Net >  calculate value by the day over multiple years with a date column in r
calculate value by the day over multiple years with a date column in r

Time:10-04

Is there a way to calculate max, mean, and average by the day with a date column in r

Example input

   Date         Value
   1999/01/01    1.2
   1999/01/02    2
   2000/01/01    1
   2000/01/02    3
   2001/01/01    2
   2001/01/02    3

Example Output

Day    Min    Max  Mean
01/01  1      2    1.4
01/02  2      3    2.67

CodePudding user response:

We could try

library(dplyr)
df1 %>%
   group_by(Day = format(as.Date(Date, "%Y/%m/%d"), "%m/%d")) %>% 
   summarise(Min = min(Value), Max = max(Value), Mean = mean(Value))

-output

# A tibble: 2 × 4
  Day     Min   Max  Mean
  <chr> <dbl> <dbl> <dbl>
1 01/01     1     2  1.4 
2 01/02     2     3  2.67

If there are multiple columns, use across

df1 %>%
   group_by(Day = format(as.Date(Date, "%Y/%m/%d"), "%m/%d")) %>% 
   summarise(across(Value, list(Min = min, Max = max, Mean = mean)))

data

df1 <- structure(list(Date = c("1999/01/01", "1999/01/02", "2000/01/01", 
"2000/01/02", "2001/01/01", "2001/01/02"), Value = c(1.2, 2, 
1, 3, 2, 3)), class = "data.frame", row.names = c(NA, -6L))

CodePudding user response:

With summary tapply:

with(df, tapply(Value, substr(Date, 6, 10), summary))

output

$`01/01`
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    1.0     1.1     1.2     1.4     1.6     2.0 

$`01/02`
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  2.000   2.500   3.000   2.667   3.000   3.000 

or with more flexibility, you can build your own function:

with(df1, tapply(Value, substr(Date, 6, 10), 
                 \(x) c(min = min(x), mean = mean(x), max = max(x))))
$`01/01`
 min mean  max 
 1.0  1.4  2.0 

$`01/02`
     min     mean      max 
2.000000 2.666667 3.000000 
  •  Tags:  
  • r
  • Related