convert timestamp to time in Spark SQL

I have a column in the format yyyy-MM-dd HH:mm:ss. I want to convert it to time in the following format: HH:mm:ss

I tried the following function, but it didn't work:

date_format(date, 'HH:mm:ss')

How can I achieve this in Spark SQL?

My code:

date_format(schedule_date, 'HH:mm:ss') 

Input: 13-12-2021 4:05:00 AM

Expected output: 4:05:00 AM

CodePudding user response:

Your input column schedule_date seems to be of type string; you need to convert it to TimestampType before applying the date_format function:

import pyspark.sql.functions as F

df = spark.createDataFrame([('13-12-2021 4:05:00 AM',)], ['schedule_date'])

df.withColumn(
    'schedule_time',
    F.date_format(F.to_timestamp('schedule_date', 'dd-MM-yyyy H:mm:ss a'), 'H:mm:ss a')
).show(truncate=False)

#+---------------------+-------------+
#|schedule_date        |schedule_time|
#+---------------------+-------------+
#|13-12-2021 4:05:00 AM|4:05:00 AM   |
#+---------------------+-------------+

Spark SQL equivalent query:


select date_format(to_timestamp(schedule_date, 'dd-MM-yyyy H:mm:ss a'), 'H:mm:ss a') as schedule_time
from   df
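
If you want to run that query from PySpark, a minimal sketch (assuming the DataFrame df from the snippet above) is to register it as a temporary view and pass the query to spark.sql:

# Register the DataFrame as a temp view named "df" so the SQL query can reference it
df.createOrReplaceTempView('df')

# Run the same Spark SQL query and show the result
spark.sql("""
    select date_format(to_timestamp(schedule_date, 'dd-MM-yyyy H:mm:ss a'), 'H:mm:ss a') as schedule_time
    from   df
""").show(truncate=False)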

CodePudding user response:

You could use one of the options below.
# Option 1: parse with to_timestamp, then format with date_format

import pyspark.sql.functions as F

df = spark.createDataFrame([('13-12-2021 4:05:00 PM',)], ['schedule_date'])

df.withColumn(
    'schedule_time',
    F.date_format(F.to_timestamp('schedule_date', 'dd-MM-yyyy hh:mm:ss a'), 'hh:mm:ss a')
).show(truncate=False)

# Option 2: parse with unix_timestamp, then format with from_unixtime

import pyspark.sql.functions as F
df = spark.createDataFrame([('13-12-2021 4:05:00 AM',)], ['schedule_date'])
df.withColumn(
    'schedule_time',
    F.from_unixtime(F.unix_timestamp('schedule_date', 'dd-MM-yyyy hh:mm:ss a'), 'hh:mm:ss a')
).show(truncate=False)
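
For completeness, a Spark SQL version of the second option might look like this (a sketch, assuming the data is exposed as a view named df, as in the first answer):

select from_unixtime(unix_timestamp(schedule_date, 'dd-MM-yyyy hh:mm:ss a'), 'hh:mm:ss a') as schedule_time
from   df

Note that unix_timestamp goes through whole seconds since the epoch, which is fine here because the target format has no fractional seconds.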