I want to manipulate time-series data, but my R skills are very limited. A sample of my data is included below for replication.
My aims:
First, I want to count the number of observations per day in the "Survey Creation Date" column. For example, ("2/6/2018 14:33", "2/6/2018 16:20", "2/6/2018 18:54", "2/6/2018 20:08", "2/6/2018 22:29") are 5 observations on 2/6/2018. The next day has 4, and so on, so the count should be repeated for every day.
Additionally, I want to count the days that were observed, from 2/6/2018 up to 2/22/2018 (mdy). Alternatively, I could express each date, such as 2/6/2018, as the number of days that have passed since 1/1/2018.
How do I do this? I tried converting the column with as.Date and with as.POSIXct, but I am making a mistake somewhere and always receive an error.
structure(list(`Survey Creation Date` = c("2/6/2018 14:33", "2/6/2018 16:20",
"2/6/2018 18:54", "2/6/2018 20:08", "2/6/2018 22:29", "2/7/2018 8:43",
"2/7/2018 10:52", "2/7/2018 12:21", "2/7/2018 14:56", "2/7/2018 16:20"
), `Survey Completion Date` = c("2/6/2018 14:56", "2/6/2018 16:22",
"2/6/2018 18:58", "2/6/2018 20:22", "2/6/2018 22:46", "2/7/2018 8:44",
"2/7/2018 11:23", "2/7/2018 12:26", "2/7/2018 14:58", "2/7/2018 16:21"
), `Since your last survey; how many alcoholic drinks have you had?` = c(0,
3, 0, 0, 0, 0, 0, 0, 0, 0), `I feel comfortable in my current location` = c(88,
81, 88, 89, 95, 94, 62, 82, 63, 80), `I feel stressed` = c(10,
12, 69, 34, 16, 6, 27, 35, 56, 28), `I feel down/depressed` = c(14,
18, 15, 18, 5, 2, 8, 4, 0, 11)), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))
CodePudding user response:
Using the tidyverse, we could do:
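library(tidyverse)

# df is the data frame from the question
df %>%
  # as.Date() ignores the trailing time once the format has been matched
  mutate(Date = as.Date(`Survey Creation Date`, format = "%m/%d/%Y")) %>%
  group_by(Date) %>%
  count() %>%
  # subtracting two dates gives a difftime in days
  mutate(Days_since_Jan_1 = Date - as.Date("2018-01-01"))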
#> # A tibble: 2 x 3
#> # Groups:   Date [2]
#>   Date           n Days_since_Jan_1
#>   <date>     <int> <drtn>
#> 1 2018-02-06     5 36 days
#> 2 2018-02-07     5 37 days
Created on 2022-05-08 by the reprex package (v2.0.1)
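The question also asks for the number of distinct days observed and mentions errors with as.POSIXct. As a rough base R sketch (not part of the answer above), the same counts could be obtained as follows. The vector name `created` and the other variable names are made up for illustration, and tz = "UTC" is an assumption chosen so the result does not depend on the local time zone.

```r
# Hypothetical stand-in for the "Survey Creation Date" column from the question.
created <- c("2/6/2018 14:33", "2/6/2018 16:20", "2/6/2018 18:54",
             "2/6/2018 20:08", "2/6/2018 22:29", "2/7/2018 8:43",
             "2/7/2018 10:52", "2/7/2018 12:21", "2/7/2018 14:56",
             "2/7/2018 16:20")

# Parse the full timestamp. The format string has to match the text exactly:
# month/day/4-digit year, a space, then hour:minute.
# tz = "UTC" is an assumption; use the zone the surveys were recorded in.
stamp <- as.POSIXct(created, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Drop the time of day to get calendar dates.
day <- as.Date(stamp, tz = "UTC")

# Number of observations on each day.
obs_per_day <- table(day)

# Number of distinct days with at least one observation.
n_days <- length(unique(day))

# Days elapsed since 1 January 2018 for each observed day, as plain numbers.
days_since_jan1 <- as.numeric(unique(day) - as.Date("2018-01-01"))

print(obs_per_day)
print(n_days)
print(days_since_jan1)
```

If the format string does not match the text (for example "%d/%m/%Y" for month-first data), as.POSIXct and as.Date return NA rather than the expected date, which may be one source of the problems described in the question.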