How to compute for duration in spark-scala


I have two columns that were converted from epoch (millisecond) format:

val df2 = df1.withColumn("event_end_ts", from_unixtime($"end_ts"/1000, "yyyy/MM/dd hh:mm:ss:ss"))
.withColumn("event_start_ts", from_unixtime($"start_ts"/1000, "yyyy/MM/dd hh:mm:ss:ss"))

which gives me this:

+----------------------+----------------------+
|event_end_ts          |event_start_ts        |
+----------------------+----------------------+
|2022/05/24 03:49:01:01|2022/05/24 03:48:50:50|
|2022/05/24 03:49:00:00|2022/05/24 03:48:51:51|
|2022/05/24 03:50:03:03|2022/05/24 03:49:05:05|
+----------------------+----------------------+

Now I am trying to get the duration between the two columns. I've tried this, but it gives null results:

df2.withColumn("time_diff", (to_timestamp($"event_end_ts") - to_timestamp($"event_start_ts"))/3600)

CodePudding user response:

You need to cast the timestamps to LongType (epoch seconds) first. Note that to_timestamp must also be given the pattern your strings were formatted with (they are not in Spark's default yyyy-MM-dd HH:mm:ss format, which is why you get nulls), and LongType needs an import:

import org.apache.spark.sql.types.LongType

df2.withColumn(
  "time_diff",
  (to_timestamp($"event_end_ts", "yyyy/MM/dd hh:mm:ss:ss").cast(LongType)
    - to_timestamp($"event_start_ts", "yyyy/MM/dd hh:mm:ss:ss").cast(LongType)) / 3600
)
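Alternatively, since both columns originate from epoch milliseconds, you can skip the round-trip through formatted strings entirely and compute ($"end_ts" - $"start_ts") / 1000 / 3600 on df1 directly. A minimal plain-Scala sketch of that per-row arithmetic (durationHours is a hypothetical helper for illustration, not part of the Spark answer above):

```scala
// Sketch: the arithmetic Spark applies per row when you subtract the raw
// epoch-millisecond columns. Assumes start/end are epoch milliseconds,
// as the question's start_ts and end_ts columns are.
def durationHours(startMs: Long, endMs: Long): Double =
  (endMs - startMs).toDouble / 1000 / 3600

// 3 600 000 ms is exactly one hour.
println(durationHours(0L, 3600000L)) // prints 1.0
```

Working on the raw Long columns also avoids the 12-hour `hh` pattern and the doubled `:ss` in the format string, both of which can silently distort results.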