Home > Enterprise >  Obtain a variable mean for a time period
Obtain a variable mean for a time period

Time:10-13

I am blocked in my analysis

I have a dataset like this one:

Date Light
2019-02-15 01:00:00 0.134
2019-02-15 02:00:00 0.345
2019-02-15 03:00:00 0.567
2019-02-15 04:00:00 0
2019-02-15 05:00:00 0.100
... ...
2019-03-10 15:00:00 0.890
... ...

It's a big dataset with lots of dates. And I'm trying to get the light averages for different periods of time: daily average; 5-day average; 1-week average (last 4 days today ; the last 6 days today and the daily average)

the results i would like to have are like this dataset :

Date Light timeperiod_5days timeperiod_1week timeperiod_24h

I tried to use differents function : timeAverage; xts. but I can't get any results, I don't know if they are suitable for what I want.

Someone know how i can obtain my results ?

Thanks in Advance !

CodePudding user response:

I am not sure to understand what result you expect. Do you have a part or the totality of your code that you can share ?

I have something that can be a start maybe.

data = read.csv(file = "data",header = T,sep = ",")
data$Date = as.POSIXct(data$Date , format = "%Y-%m-%d %H:%M:%S" ) #to make date as date in R 
data$Day = format(data$Date,"%Y-%m-%d") #to extract only the day

# Aggregation
tab_A  = aggregate(x=data[,-c(1,2,3)], by=list(A=data$Day), FUN=mean , na.rm = T) # Whith that you can get the daily mean of your data base

If you find a way to describe the 'week' in R by the week number for example yo can aggregate by week for each year.

As for the last 4 days today you could go simple like :

mean(data$Light[(length(data$Light)-5):length(data$Light)])

CodePudding user response:

Not sure if "24h" means "today" or "last 24 hours", in the first case your final code would look something like this:

library(dplyr)
library(lubridate)

rollmean <- function(i,window){
  startdate <- as.Date(df$Date[i])-days(window-1)
  enddate <- as.Date(df$Date[i]) 1
  
  tmp <- df %>% filter(between(as.Date(Date), startdate, enddate))
  return(mean(tmp$Light))
}

for (i in 1:nrow(df)) {
  df[i, "timeperiod_24h"] <- rollmean(i, 1)
  df[i, "timeperiod_5d"] <- rollmean(i, 5)
  df[i, "timeperiod_7d"] <- rollmean(i, 7)
}

Which results in:

                 Date Light timeperiod_24h timeperiod_5d timeperiod_7d
1 2019-02-15 01:00:00 0.134         0.2292        0.2292        0.2292
2 2019-02-15 02:00:00 0.345         0.2292        0.2292        0.2292
3 2019-02-15 03:00:00 0.567         0.2292        0.2292        0.2292
4 2019-02-15 04:00:00 0.000         0.2292        0.2292        0.2292
5 2019-02-15 05:00:00 0.100         0.2292        0.2292        0.2292
6 2019-02-17 05:00:00 0.300         0.3000        0.2410        0.2410
7 2019-03-10 15:00:00 0.890         0.8900        0.8900        0.8900

If you instead want "last 24 hours" you drop the as.Dates in the function and change the rounding a bit.

  • Related