I'm trying to cast the following string to a timestamp in PySpark:
"30-Jun-2022 14:00:00"
I've tried the following approaches:
f.col("date_string").cast("timestamp"),
f.to_timestamp(f.col("date_string")).alias("date_string"),
.withColumn(
    "date_string",
    f.to_timestamp(f.col("date_string"))
)
But all of them return a column of nulls. What am I doing wrong?
MCVE:
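from pyspark.sql.types import StructType, StructField, StringType

# Trailing commas make each row a one-element tuple; ("x") alone is just a string.
data = [
    ("30-Jun-2022 14:00:00",),
    ("25-Jul-2022 11:00:00",),
    ("10-May-2022 12:00:00",),
    ("11-Jan-2022 09:00:00",)
]
schema = StructType([
    StructField("date_string", StringType(), True)
])
df = spark.createDataFrame(data=data, schema=schema)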
CodePudding user response:
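I do not have a testing environment for PySpark, but the cause is the format: a plain cast, or to_timestamp without a pattern, expects a layout like yyyy-MM-dd HH:mm:ss, so a string such as 30-Jun-2022 14:00:00 comes back as null. Passing the pattern explicitly fixes it. In Spark (my test column was called name), this: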
.withColumn("timestamped",to_timestamp(col("name"), "dd-MMM-yyyy HH:mm:ss"));
returns this (which I assume is what you want):
name,timestamped
30-Jun-2022 14:00:00,2022-06-30 14:00:00