I have a data file that contains date-time variables formatted like 9/28/2017 16:00. I want to turn these into numeric variables holding only the hour, 16 in this case (likewise 9/28/2017 17:00 becomes 17 and 9/30/2017 17:00 becomes 17). I only want the time, regardless of the date. How can I do that in R?
CodePudding user response:
One way of extracting the hour part of the data is to use gsub:
y <- as.POSIXct(c("9/28/2017 16:00", "9/28/2017 17:00"), format = "%m/%d/%Y %H:%M")
y
#[1] "2017-09-28 16:00:00 +07" "2017-09-28 17:00:00 +07"
# Here +07 is my time zone (the UTC offset).
as.numeric(gsub("(.*?)\\s|\\:00*", "",y))
#[1] 16 17
It also works when the "date" is stored as a character string.
z <- "9/28/2017 16:00"
str(z)
#chr "9/28/2017 16:00"
as.numeric(gsub("(.*?)\\s|\\:00*", "",z))
#[1] 16
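Note that the gsub pattern above only strips a ":00"-style suffix, so it relies on every time falling exactly on the hour. As a minimal sketch of an alternative that does not depend on the minutes being zero, the string can be parsed first and the hour read from the parsed value. The input vector and the choice of tz = "UTC" here are assumptions for illustration, not part of the original data.

```r
# Hypothetical example input (not taken from the original data file)
x <- c("9/28/2017 16:00", "9/28/2017 17:30", "9/30/2017 9:05")

# Parse with an explicit time zone so the result does not depend on the local setting
parsed <- as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Option 1: format the hour as text, then convert it to numeric
hours_fmt <- as.numeric(format(parsed, "%H"))

# Option 2: read the hour field of the POSIXlt representation
hours_lt <- as.POSIXlt(parsed)$hour
```

Both options use base R only; the second returns integers and avoids a round trip through character strings.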
CodePudding user response:
I find it easiest to use the lubridate package. Here is a tidyverse solution; the basic idea is to first convert the date column to a date-time type (I do this with lubridate::mdy_hm, since the values have no seconds), and then use the lubridate::hour function to extract the hour. You could also use the hms::as_hms function in lieu of lubridate::hour if you need more information about the time.
library(tidyverse)
library(lubridate)
## create an example tibble (data frame) with two example dates
example_df <- tribble(~date, "9/28/2017 16:00", "9/28/2017 17:00")
## convert the date column to a date-time first, then extract the time in two different ways:
## the first uses lubridate::hour, the second uses hms::as_hms
example_df %>%
  mutate(date = mdy_hm(date)) %>%
  mutate(hour = hour(date)) %>%
  mutate(hour_minute = hms::as_hms(date))
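If the times are not always on the hour and a continuous value is wanted, the same lubridate accessors can be combined into a decimal hour. This is only a sketch: the input vector is a made-up example, and the names whole_hour and decimal_hour are hypothetical.

```r
library(lubridate)

# Hypothetical example input (not taken from the original data file)
x <- c("9/28/2017 16:00", "9/28/2017 17:30")

# mdy_hm() parses month/day/year hour:minute strings (UTC by default)
dt <- mdy_hm(x)

# Whole hour of the day, ignoring the date
whole_hour <- hour(dt)

# Hour plus the minutes expressed as a fraction of an hour
decimal_hour <- hour(dt) + minute(dt) / 60
```

With this input, whole_hour is 16 and 17, while decimal_hour is 16 and 17.5.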