Home > other >  Converting "character" to "POSIX" in R
Converting "character" to "POSIX" in R

Time:04-23

I'm new to R and trying to replicate my current excel reports in R. I have a multi-column csv file with a column containing a timestamp in the format of "4/18/2022, 2:15 PM" (copied directly from the datafile). Ultimately I need to extract the week number and day number for each row in the csv.

I've gotten this far in converting the char to POSIX.

datetest = ("4/18/2022, 12:53:22 AM")
datetest1 <- mdy_hms(datetest)
print(datetest1)
sapply(datetest1, class)

which produces

> datetest = ("4/18/2022, 12:53:22 AM")
> datetest1 <- mdy_hms(datetest)
> print(datetest1)
[1] "2022-04-18 00:53:22 UTC"
> sapply(datetest1, class)
     [,1]     
[1,] "POSIXct"
[2,] "POSIXt" 

But my problem is when I try to convert the entire column

#import CSV
RawData <- read.csv("data.csv",header = TRUE, ",")
#convert TS to POSIX
#column header is Date.Time.Opened
escRawData[['Date.Time.Opened']] <- mdy_hms(escRawData[['Date.Time.Opened']])

which gives me this

> escRawData <-read.csv("report1650637729324.csv",header = TRUE, ",")
> #convert TS to POSIX
> escRawData[['Date.Time.Opened']] <- mdy_hms(escRawData[['Date.Time.Opened']])
Warning message:
All formats failed to parse. No formats found. 

which has cleared all the data from the dataframe. I used sapply to check the datatype in the CSV and I know that the data is "character."

I've read/watched a number of tutorials describing lubridate and the POSIXlt commands all giving more or less the same error of "no formats found" so obviously, I am missing something basic. Any help would be appreciated.

CodePudding user response:

With your current code, one faulty date entry (format-wise or other) breaks the whole column's conversion. You could inspect the raw data beforehand, or transform them rowwise and check for failed conversions (NAs) like:

library(dplyr)
library(lubridate)

converted_data <-
    escRawData %>%
      rowwise %>% ## important
      mutate(date_converted = mdy_hms(Date.Time.Opened))

## which rows went wrong?
converted_data %>%
    filter(is.na(date_converted))

CodePudding user response:

Ok Everyone, I found out what I was doing wrong. @KU99's comment helped me figure it out. My actual data file did not include seconds in the time stamp. When I changed to using

escRawData[['Date.Time.Opened']] <- mdy_hm(escRawData[['Date.Time.Opened']])

it worked perfectly.

  • Related