Home > Back-end >  Count and aggregate by date [duplicate]
Count and aggregate by date [duplicate]

Time:09-26

The dataset:

Date
2021-09-25T17:07:24.222Z    
2021-09-25T16:17:20.376Z
2021-09-24T09:30:53.013Z
2021-09-24T09:06:24.565Z

I would like to count the number of rows per day. For example, 2021-09-25 will be 2.

To solve said challenge I looked at the following post:

The answer of Rorshach is the solution. However, I do not understand how I can format my rows in the Date column to 2021/09/24 instead of 2021-09-24T09:06:24.565Z.

Could someone explain to me how to format the entries in the Date column?

CodePudding user response:

After converting the date you may use table to count occurrence of each Date.

table(as.Date(df$Date))

#2021-09-24 2021-09-25 
#         2          2 

CodePudding user response:

Parse the string into a datetime object and then extract the date (without the hours and minutes) to be able to count:

library(dplyr)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union

tibble::tribble(
  ~Date,
  "2021-09-25T17:07:24.222Z",
  "2021-09-25T16:17:20.376Z",
  "2021-09-24T09:30:53.013Z",
  "2021-09-24T09:06:24.565Z"
) %>%
  mutate(
    day = Date %>% parse_datetime() %>% as.Date()
  ) %>%
  count(day)
#> # A tibble: 2 × 2
#>   day            n
#>   <date>     <int>
#> 1 2021-09-24     2
#> 2 2021-09-25     2

CodePudding user response:

@RonakShah's answer is good, but to have the dataframe in better format, use the count function from the plyr library:

library(plyr)
count(as.Date(df$Date))

Output:

           x freq
1 2021-09-24    2
2 2021-09-25    2
  • Related