Home > Software design >  Merge date and time, find the time difference in R
Merge date and time, find the time difference in R

Time:12-19

I have a data set and there are some date and hour attributes. Here is the sample, then I will explain what I want to do;

date1 hour1 date2 hour2 date3 hour3
2014-03-16 00:00:00 16:20:00 2014-03-16 00:00:00 20:20:03 2014-03-16 00:00:00 22:12:34
2014-04-22 00:00:00 10:20:00 2014-04-22 00:00:00 15:20:03 2014-04-22 00:00:00 20:12:34
2015-03-12 00:00:00 16:20:00 2015-03-12 00:00:00 20:20:03 2015-03-12 00:00:00 22:12:34

We know event1 happens before event2 (event1 -> event2 -> event3)

But as you see, in the date attributes, time section is not correct yet we have hour attributes for each. What I want to do; I want to correct them by using hour attributes then find the difference between these two dates and create new attributes that gives the time difference as hours.

Sample for above table;

event2_time
4
5
4

I tried to merge hour to date and create a new attribute like this but it doesn't work.(my goal is actually correct the date value and get rid of the hour attribute)

trainTable <- trainTable %>%
   mutate("newParam" = as.POSIXct(paste(alert_date, alert_hour), format="%Y-%m-%d %H:%M:%S")

I could use some help, thanks in advance.

CodePudding user response:

Try this solution using mapply. It uses strsplit to split the date from the hours.

dat_new <- data.frame( setNames( mapply( function(x,y){
  tmp <- sapply( strsplit( dat[,x], " "), function(z) z[1] );
  list( as.POSIXct( paste(tmp,dat[,y] ) ) ) },
  grep("date", colnames(dat)), grep("hour", colnames(dat)) ), 
  c("a","b","c") ) )

dat_new$b - dat_new$a
Time differences in hours
[1] 4.000833 5.000833 4.000833

Data

dat <- structure(list(date1 = c("2014-03-16 00:00:00", "2014-04-22 00:00:00", 
"2015-03-12 00:00:00"), hour1 = c("16:20:00", "10:20:00", "16:20:00"
), date2 = c("2014-03-16 00:00:00", "2014-04-22 00:00:00", "2015-03-12 00:00:00"
), hour2 = c("20:20:03", "15:20:03", "20:20:03"), date3 = c("2014-03-16 00:00:00", 
"2014-04-22 00:00:00", "2015-03-12 00:00:00"), hour3 = c("22:12:34", 
"20:12:34", "22:12:34")), class = "data.frame", row.names = c(NA, 
-3L))
  • Related