How to convert int value from csv to datetime in Spark SQL?


I have the following Spark SQL query:

spark.sql("""SELECT date, delay, origin, destination 
FROM us_delay_flights_tbl 
WHERE delay > 120 AND ORIGIN = 'SFO' AND DESTINATION = 'ORD' 
ORDER BY delay DESC""").show(10)

The view us_delay_flights_tbl was created from the following CSV:

date,delay,distance,origin,destination
01011245,6,602,ABE,ATL
01020600,-8,369,ABE,DTW
01021245,-2,602,ABE,ATL
01020605,-4,602,ABE,ATL
01031245,-4,602,ABE,ATL
01030605,0,602,ABE,ATL

How can I convert the date field from a string like 02190925 to the format 02-19 09:25 am inside my SQL query?

Currently, the result looks like:

date delay origin destination
02190925 1638 SFO ORD
01031755 396 SFO ORD

I need to get:

date delay origin destination
02-19 09:25 am 1638 SFO ORD
01-03 05:55 pm 396 SFO ORD

UPDATE

I tried the following expression:

date_format(to_date(date, 'MMddHHmm'), 'MM-dd hh:mm a')

But I got this output for the string 02190925:

02-19 12:00 AM

In other words, the time portion was lost.

CodePudding user response:

to_date transforms a string into a date, so the "time" part (hours/minutes/seconds) is discarded. You should use the to_timestamp function instead of to_date, as follows:

date_format(to_timestamp(date, 'MMddHHmm'), 'MM-dd hh:mm a')
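The same parse-then-reformat logic can be checked outside Spark with plain Python, using the strftime/strptime equivalents of the Spark patterns (Spark's MMddHHmm maps to %m%d%H%M, and MM-dd hh:mm a maps to %m-%d %I:%M %p). A year must be assumed because the data has none; the helper name below is my own invention:

```python
from datetime import datetime

def format_flight_date(s, year=2021):
    """Parse an 'MMddHHmm' string and render it as 'MM-dd hh:mm am/pm'.

    The year is not present in the CSV, so we prepend an assumed one
    purely to make the string parseable.
    """
    dt = datetime.strptime(f"{year}{s}", "%Y%m%d%H%M")
    # %I is the 12-hour clock, %p the AM/PM marker; lowercase to match
    # the lowercase "am"/"pm" the question asks for.
    return dt.strftime("%m-%d %I:%M %p").lower()

print(format_flight_date("02190925"))  # 02-19 09:25 am
print(format_flight_date("01031755"))  # 01-03 05:55 pm
```

Note that in Spark, to_timestamp keeps the hours and minutes (just as strptime does here), which is why the time survives through date_format, whereas to_date truncates to midnight.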