How can I convert a specific string date to date or datetime in Spark?

Time:09-20

I have this string pattern in my Spark dataframe: 'Sep 14, 2014, 1:34:36 PM'.

I want to convert this to date or datetime format, using Databricks and Spark.

I've already tried the cast and to_date functions, but neither works: I get null every time.

How can I do that?

Thanks in advance!

CodePudding user response:

Suppose we have a DataFrame created like this:

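import org.apache.spark.sql.functions.{col, to_timestamp}
import spark.implicits._  // needed for toDF

var ds = spark.sparkContext.parallelize(Seq(
  "Sep 14, 2014, 01:34:36 PM"
)).toDF("date")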

Then apply the following statement:

// a single "a" marks AM/PM; Spark 3.x rejects a repeated "aa"
ds = ds.withColumn("casted", to_timestamp(col("date"), "MMM dd, yyyy, hh:mm:ss a"))

You get this result:

+-------------------------+-------------------+
|date                     |casted             |
+-------------------------+-------------------+
|Sep 14, 2014, 01:34:36 PM|2014-09-14 13:34:36|
+-------------------------+-------------------+

This should give you what you need. You can use to_date, or other APIs that take a datetime pattern, in the same way. Good luck!

CodePudding user response:

Your date/time string does not match a two-digit hour pattern (hh): it has 1 instead of 01.

#
# 1 - Create sample dataframe
#

# required library
from pyspark.sql.functions import col, to_timestamp

# array of tuples - data
dat1 = [
  ("1", "Sep 14, 2014, 01:34:36 pm")
]

# array of names - columns
col1 = ["row_id", "date_string1"]

# make data frame
df1 = spark.createDataFrame(data=dat1, schema=col1)

# parse the string column into a timestamp column
df1 = df1.withColumn("time_stamp1", to_timestamp(col("date_string1"), "MMM dd, yyyy, hh:mm:ss a"))
  
# show schema
df1.printSchema()

# show data
display(df1)

This code produces the correct answer.

If the data has 1:34:36, the hh pattern does not match and the conversion fails. You can use a when clause to pick the correct conversion for each row.
