How to add 1 day to date values in a column based on conditions?

Time:03-01

I'm a beginner in R.

Currently, I'm cleaning a dataset of flight data and noticed an issue: in certain rows the arrival date/time is earlier than the departure date/time. I want to add 1 day to every ArrDate that is earlier than its DepDate. Below is a sample of my DataFrame with the DepDate and ArrDate columns.

DepDate               ArrDate
<S3: POSIXct>         <S3: POSIXct>
2006-01-11 22:56:00   2006-01-11 06:55:01
2006-01-11 23:47:00   2006-01-11 06:57:01

I have attempted using this code but it does not seem to apply to the actual DataFrame.

df$ArrDate[df$ArrDate < df$DepDate] <- df$ArrDate[df$ArrDate < df$DepDate] + 1

Can anyone assist with this? Thank you!

CodePudding user response:

library(lubridate)
library(data.table)

# Parse a column as POSIXct (UTC) and shift it forward by one day
function_addOneDay = function(x) {
  
  x <- as.POSIXct(x,format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")
  x + days(1)
  
}


df <- fread(
"DepDate|ArrDate
2006-01-11 22:56:00|2006-01-11 06:55:01
2006-01-11 23:47:00|2006-01-11 06:57:01"
)

df <- df[,lapply(.SD, function_addOneDay)]

df
#               DepDate             ArrDate
# 1 2006-01-12 22:56:00 2006-01-12 06:55:01
# 2 2006-01-12 23:47:00 2006-01-12 06:57:01
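
Note that the code above shifts every value in both columns, so it does not yet apply the condition from the question. A minimal sketch of a conditional version, assuming a plain data.frame with both columns already stored as POSIXct in UTC (the third row and the name needs_shift are made up for illustration), could look like this:

```r
library(lubridate)

# Sample rows from the question, plus one hypothetical row that is already consistent
df <- data.frame(
  DepDate = as.POSIXct(c("2006-01-11 22:56:00",
                         "2006-01-11 23:47:00",
                         "2006-01-11 02:56:00"), tz = "UTC"),
  ArrDate = as.POSIXct(c("2006-01-11 06:55:01",
                         "2006-01-11 06:57:01",
                         "2006-01-11 06:55:01"), tz = "UTC")
)

# TRUE for rows where the arrival is earlier than the departure
needs_shift <- df$ArrDate < df$DepDate

# Add one calendar day to those rows only; other rows stay untouched
df$ArrDate[needs_shift] <- df$ArrDate[needs_shift] + days(1)

df
```

Only the first two rows should move to 2006-01-12; the third row keeps its original arrival time.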

CodePudding user response:

You said "based on conditions", so I'll add a row where the conditions do not hold. Here's a solution, using as.POSIXlt as suggested in the comments:

library(data.table)
DT
#                DepDate             ArrDate
#                 <POSc>              <POSc>
# 1: 2006-01-11 22:56:00 2006-01-11 06:55:01
# 2: 2006-01-11 23:47:00 2006-01-11 06:57:01
# 3: 2006-01-11 02:56:00 2006-01-11 06:55:01

DT[DepDate > ArrDate,
   ArrDate := as.POSIXct(with(list(r = as.POSIXlt(ArrDate)), { r$mday = r$mday + 1; r; })) ]
DT
#                DepDate             ArrDate
#                 <POSc>              <POSc>
# 1: 2006-01-11 22:56:00 2006-01-12 06:55:01
# 2: 2006-01-11 23:47:00 2006-01-12 06:57:01
# 3: 2006-01-11 02:56:00 2006-01-11 06:55:01

Data

DT <- setDT(structure(list(DepDate = structure(c(1137038160, 1137041220, 1136966160), class = c("POSIXct", "POSIXt"), tzone = ""), ArrDate = structure(c(1136980501, 1136980621, 1136980501), class = c("POSIXct", "POSIXt"), tzone = "")), row.names = c(NA, -3L), class = c("data.table", "data.frame"))) 
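
For comparison, the same condition can probably be handled in base R without any packages. Arithmetic on POSIXct values is done in seconds, so adding a bare 1 moves a timestamp by one second, not one day; that may be why the attempt in the question did not appear to change anything. The sketch below assumes a plain data.frame in UTC (the original data uses the session's local time zone) and uses a hypothetical helper vector named late:

```r
# Same three rows as DT above, rebuilt as a plain data.frame in UTC (an assumption)
df <- data.frame(
  DepDate = as.POSIXct(c("2006-01-11 22:56:00",
                         "2006-01-11 23:47:00",
                         "2006-01-11 02:56:00"), tz = "UTC"),
  ArrDate = as.POSIXct(c("2006-01-11 06:55:01",
                         "2006-01-11 06:57:01",
                         "2006-01-11 06:55:01"), tz = "UTC")
)

# Rows where the departure is later than the arrival
late <- df$DepDate > df$ArrDate

# A difftime of one day is converted to 86400 seconds before it is added
df$ArrDate[late] <- df$ArrDate[late] + as.difftime(1, units = "days")

df
```

This adds exactly 24 hours. In a time zone with daylight saving transitions, that can differ by an hour from the mday-based approach above, which keeps the wall-clock time.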