I have 2 date format(MM/dd/yy HH:mm
and yyyy-mm-dd HH:mm:ss
) in start time that needs to convert into yyyy-mm-dd HH:mm
format. How do I handle both data format in single below select statement to convert into desired format
df1 = spark.sql("""select from_unixtime(unix_timestamp(strt_tm,'MM/dd/yy HH:mm'),'yyyy-mm-dd HH:mm) as starttime from table1""")
Input
strt_tm
12/11/21 01:15
2021-12-11 11:15:12
output:
strt_tm
2021-12-11 01:15
2021-12-11 11:15
CodePudding user response:
Use coalesce
to handle both formats along with to_timestamp
or to_date
functions:
spark.createDataFrame(
[("12/11/21 01:15",), ("2021-12-11 11:15:12",)], ["strt_tm"]
).createOrReplaceTempView("table1")
spark.sql("""
select coalesce(
to_timestamp(strt_tm, 'dd/MM/y HH:mm'),
to_timestamp(strt_tm, 'yyyy-MM-dd HH:mm:ss')
) as start_time,
coalesce(
to_date(strt_tm, 'dd/MM/y HH:mm'),
to_date(strt_tm, 'yyyy-MM-dd HH:mm:ss')
) as start_date
from table1
""").show()
# ------------------- ----------
#| start_time|start_date|
# ------------------- ----------
#|2021-11-12 01:15:00|2021-11-12|
#|2021-12-11 11:15:12|2021-12-11|
# ------------------- ----------