I am working with Spark 1.6 and I want to convert a string value to a datetime value. However, none of the suggestions I have read here so far worked, which probably has to do with my old version and the unusual datetime representation in my string.
It looks like this:
df = spark.createDataFrame([('2011-11-17-12.00.46.841219',)], ['time'])
df.show(truncate=False)
How do I get it into a nice yyyy-MM-dd HH:mm:ss datetime format? What I tried so far did not work.
Best regards
CodePudding user response:
You can use from_unixtime(unix_timestamp(&lt;ts_column&gt;, &lt;ts_format&gt;)), but note that you will lose the fractional seconds in the conversion.
from pyspark.sql import functions as func

spark.createDataFrame([('2011-11-17-12.00.46.841219',)], ['ts_str']). \
    withColumn('ts',
               func.from_unixtime(func.unix_timestamp('ts_str', 'yyyy-MM-dd-HH.mm.ss'))
               ). \
    show(truncate=False)
# +--------------------------+-------------------+
# |ts_str                    |ts                 |
# +--------------------------+-------------------+
# |2011-11-17-12.00.46.841219|2011-11-17 12:00:46|
# +--------------------------+-------------------+
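If you do need to keep the microseconds, one workaround is to reattach them from the raw string after the conversion. This is just a sketch (not part of the answer above) that assumes the fraction always follows the last dot; note the result is a formatted string, not a timestamp column:

from pyspark.sql import functions as func

spark.createDataFrame([('2011-11-17-12.00.46.841219',)], ['ts_str']). \
    withColumn('ts',
               func.concat(
                   func.from_unixtime(func.unix_timestamp('ts_str', 'yyyy-MM-dd-HH.mm.ss')),
                   func.lit('.'),
                   func.substring_index('ts_str', '.', -1)  # text after the last dot
               )
               ). \
    show(truncate=False)

# +--------------------------+--------------------------+
# |ts_str                    |ts                        |
# +--------------------------+--------------------------+
# |2011-11-17-12.00.46.841219|2011-11-17 12:00:46.841219|
# +--------------------------+--------------------------+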
CodePudding user response:
You can use to_timestamp() to convert a string to a timestamp with a custom format. Note that to_timestamp() was only added in Spark 2.2, so on Spark 1.6 you would need the unix_timestamp() approach from the other answer.
from pyspark.sql.functions import col, to_timestamp

df.withColumn("time", to_timestamp(col("time"), "yyyy-MM-dd-HH.mm.ss.SSSSSS")).show(truncate=False)
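As a quick sanity check (a sketch assuming a Spark 2.2+ session and the df from the question), you can confirm the column is now a real timestamp by inspecting the schema:

df2 = df.withColumn("time", to_timestamp(col("time"), "yyyy-MM-dd-HH.mm.ss.SSSSSS"))
df2.printSchema()
# root
#  |-- time: timestamp (nullable = true)

Also be aware that fractional-second patterns like SSSSSS are interpreted differently across Spark versions (legacy SimpleDateFormat before Spark 3.0 vs. the DateTimeFormatter patterns used since), so verify the parsed values on your version.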