Home > Net >  R: Select by date only works for first date
R: Select by date only works for first date

Time:10-28

I am trying to select values by dates (day) from an imported excel sheet.

  1. Convert to R date

    Date <-as.Date(Df$Calendar.day, origin = "1899-12-30")

This yields 24 entries per day with the correct dates. I aggregate the values for each day

Df_agg <- aggregate(.~Date, Df, sum)

This yields the correct dates, and the sum of values for each day, as desired.

Date 
2022-01-01
2022-01-02 
.
.
.
2022-12-31

However, when I try to filter for a certain date, this works only for the first date (2022-01-01). Other specific dates do not work.

Df_agg %>%
  select(Date) %>% 
  filter(Date== as.Date("2022-05-08"))

[1] Date
<0 rows> (or 0-length row.names)

However, larger/smaller than works:

  Df_agg %>%
      select(Date) %>% 
      filter(Date> as.Date("2022-05-08"))

     Date
1   2022-05-08
2   2022-05-09
3   2022-05-10
4   2022-05-11
5   2022-05-12
6   2022-05-13
7   2022-05-14
8   2022-05-15
9   2022-05-16

What can be the problem?

Sample Df_agg:

Date value
2022-01-01 135.2
2022-01-02 130.6
2022-01-03 121.0
dput(Df_aggr) (Example)

structure(list(Date = structure(c(18993, 18994.0416666666, 18995.0416666665, 
18996.0416666665, 18997.0416666664, 18998.0416666664, 18999.0416666663, 
19000.0416666663, 19001.0416666662, 19002.0416666661, 19003.0416666661, 

 class = "Date"), value = c(135.192291666667, 
130.261541666667, 121.005041666667, 121.005041666667, 121.005041666667, 
121.005041666667, 121.771375, 131.027875, 130.261541666667, 121.005041666667, 

121.771375, 131.027875, 130.261541666667, 121.005041666667, 121.005041666667, 
121.005041666667, 121.005041666667, 121.771375, 126.863458333333
)), row.names = c(NA, -365L), class = "data.frame")

CodePudding user response:

I found a solution, using the "Hour.of.Day" column instead. (E.g., 2022-05-08 00:00:00)

Df[["Hour.of.day"]] <- 
  as.POSIXct(Df[["Hour.of.day"]] * (60*60*24)
             , origin="2022-01-01"
             , tz="Europe/Stockholm")
 Df<- Df %>% 
  rename("Hour" = "Hour.of.day")

Date_Input <- '2022-05-08'

Df %>%filter(as.Date(Hour) == as.Date(Date_Input))

This results in values for all 24 hours of 22-05-08. Since it is not possible to use the hours to aggregate, I create an additional column ("dummy") with value 1 for all rows:

dfdate <- aggregate(.~dummy, Df, sum)

Perhaps not the most elegant solution, but it works and has the added advantage of being able to select intervals of several days with different hours.

  • Related