aggregate by date and id variables in R

Time:09-26

I'm struggling to aggregate hourly temperatures into 3-hourly values while keeping the station ID. Here is the df:

ID Date temp
1155 2012-01-01 00:00:00 -0.8
1155 2012-01-01 01:00:00 0.1
1155 2012-01-01 02:00:00 0.5

and I'm trying to get something like:

ID Date temp
1155 2012-01-01 -0.2

I've written this code:

library(dplyr)
Temp_3h <- df %>%
  group_by(ID) %>%
  aggregate(., by = list(Date = cut(as.POSIXct(df$Date), "3 hour")), mean)

but besides the "temp" variable it also tries to aggregate the IDs (categorical), so they become NAs. I don't know how to integrate ID into the "by=" argument. Any help would be appreciated.

CodePudding user response:

You can use floor_date (or ceiling_date) from lubridate to round every timestamp to the start (or end) of its 3-hour window, then group by ID and the rounded timestamp and take the average of the temp values in each group.

library(dplyr)
library(lubridate)

Temp_3h <- df %>%
  group_by(ID, Date = floor_date(ymd_hms(Date), '3 hours')) %>%
  summarise(temp = mean(temp, na.rm = TRUE), .groups = 'drop')

Temp_3h
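
For a quick self-contained check, here is a minimal sketch that rebuilds the three sample rows from the question and runs the same pipeline. The data.frame construction, with Date stored as character, is an assumption about how df looks on your side.

```r
library(dplyr)
library(lubridate)

# Sample rows from the question; storing Date as character is an assumption
df <- data.frame(
  ID   = c(1155, 1155, 1155),
  Date = c("2012-01-01 00:00:00", "2012-01-01 01:00:00", "2012-01-01 02:00:00"),
  temp = c(-0.8, 0.1, 0.5)
)

Temp_3h <- df %>%
  # Parse the timestamp and round it down to the start of its 3-hour window
  group_by(ID, Date = floor_date(ymd_hms(Date), "3 hours")) %>%
  # One row per station and window, holding the mean temperature
  summarise(temp = mean(temp, na.rm = TRUE), .groups = "drop")

Temp_3h
```

Note that the mean of these three values is about -0.067. The -0.2 in the question's expected output is their sum, so use sum(temp) instead if that is what you are after.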

CodePudding user response:

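You could floor the dates and use the group_by and summarise functions. Do not load plyr after dplyr here: plyr::summarise would mask dplyr::summarise and ignore the grouping, collapsing everything into a single row. This version sums temp, which gives the -0.2 shown in the question; swap in mean(temp) if you want the 3-hour average.

library(lubridate)
library(dplyr)

# Group by station and 3-hour window, then sum temp within each group
summarise(group_by(df, ID, Date = floor_date(ymd_hms(Date), '3 hours')), temp = sum(temp), .groups = 'drop')

Output (tibble header and column types omitted):

     ID Date                 temp
1  1155 2012-01-01 00:00:00  -0.2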
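
If you would rather stay with aggregate(), as in the question, ID can go into by= next to the cut dates. A rough base-R sketch, assuming df has the ID, Date and temp columns from the question (the sample rows are rebuilt here so the snippet runs on its own, and tz = "UTC" is my assumption):

```r
# Sample rows from the question; storing Date as character is an assumption
df <- data.frame(
  ID   = c(1155, 1155, 1155),
  Date = c("2012-01-01 00:00:00", "2012-01-01 01:00:00", "2012-01-01 02:00:00"),
  temp = c(-0.8, 0.1, 0.5)
)

# Bin each timestamp into 3-hour windows (returns a factor)
bins <- cut(as.POSIXct(df$Date, tz = "UTC"), "3 hours")

# Aggregate only the temp column; ID and the time bin both go into by=,
# so ID is used for grouping instead of being averaged
Temp_3h <- aggregate(df["temp"], by = list(ID = df$ID, Date = bins), FUN = mean)

Temp_3h
```

Passing df["temp"] rather than the whole data frame is what keeps ID out of the averaged columns. One caveat: cut() starts its windows at the hour of the earliest timestamp in the data rather than at fixed clock boundaries, so the bins can differ from floor_date() when the series does not begin on a 3-hour mark.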