Home > Enterprise >  Converting a string to Timestamp with Pyspark
Converting a string to Timestamp with Pyspark

Time:03-10

I am currently attempting to convert a column "datetime" which has values that are dates/times in string form, and I want to convert the column such that all of the strings are converted to timestamps.

The date/time strings are of the form "10/11/2015 0:41", and I'd like to convert the string to a timestamp of form YYYY-MM-DD HH:MM:SS. At first I attempted to cast the column to timestamp in the following way:

df=df.withColumn("datetime", df["datetime"].cast("timestamp"))

Though when I did so, I received null for every value, which lead me to believe that the input dates needed to be formatted somehow. I have looked into numerous other possible remedies such as to_timestamp(), though this also gives the same null results for all of the values. How can a string of this format be converted into a timestamp?

Any insights or guidance are greatly appreciated.

CodePudding user response:

Try:

import datetime

def to_timestamp(date_string):
    return datetime.datetime.strptime(date_string, "%m/%d/%Y %H:%M")

df = df.withColumn("datetime", to_timestamp(df.datetime))

CodePudding user response:

You can use the to_timestamp function. See Datetime Patterns for valid date and time format patterns.

df = df.withColumn('datetime', F.to_timestamp('datetime', 'M/d/y H:m'))
df.show(truncate=False)

CodePudding user response:

You were doing it in the right way, except you missed to add the format ofstring type which is in this case MM/dd/yyyy HH:mm. Here M is used for months and m is used to detect minutes. Having said that, see the code below for reference -

df = spark.createDataFrame([('10/11/2015 0:41',), ('10/11/2013 10:30',), ('12/01/2016 15:56',)], ("String_Timestamp", ))
from pyspark.sql.functions import *

df.withColumn("Timestamp_Format", to_timestamp(col("String_Timestamp"), "MM/dd/yyyy HH:mm")).show(truncate=False)

 ---------------- ------------------- 
|String_Timestamp|   Timestamp_Format|
 ---------------- ------------------- 
| 10/11/2015 0:41|2015-10-11 00:41:00|
|10/11/2013 10:30|2013-10-11 10:30:00|
|12/01/2016 15:56|2016-12-01 15:56:00|
 ---------------- ------------------- 
  • Related