Issue while converting string to DateTime using pyspark

Time:12-21

I am converting a string column of a DataFrame to a datetime using PySpark. Here is my input:

 -------------- 
|        col1  |
 -------------- 
|18300031121994|
|18300031122018|
|12324031012020|
|19590031052020|
|19590030062020|
 -------------- 

Expected output,

col1
1994-12-31 18:30:00
2018-12-31 18:30:00
2020-01-31 12:32:40
2020-05-31 19:59:00
2020-06-30 19:59:00

here is my snippet,

df.select(col("col1"),to_date(col("col1"),"hhmmssMMddyyyy").alias("datetime")).show()

When I execute the code above, the output is the same as the input. Where am I going wrong?

CodePudding user response:

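You need to use the correct format, and you need to_timestamp rather than to_date, because to_date keeps only the date part. The values are laid out as hour (24-hour clock), minute, second, day, month, year, so the pattern is "HHmmssddMMyyyy". In your pattern, "hh" is the 12-hour clock, which cannot represent hours such as 18 or 19, and the month and day are swapped. Try this:

from pyspark.sql.functions import col, to_timestamp

df.select(col("col1"), to_timestamp(col("col1"), "HHmmssddMMyyyy").alias("datetime")).show()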
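The field layout of the sample values (two-digit 24-hour hour, minute, second, day, month, then a four-digit year) can be sanity-checked without Spark. The sketch below is only an illustration in plain Python: it uses datetime.strptime directives, which are Python's own notation and not Spark patterns, and the names samples, LAYOUT, and parse_sample are made up for this example.

```python
from datetime import datetime

# Sample values copied from col1 in the question.
samples = [
    "18300031121994",
    "18300031122018",
    "12324031012020",
    "19590031052020",
    "19590030062020",
]

# Python strptime directives for the assumed layout:
# hour (24h), minute, second, day, month, four-digit year.
LAYOUT = "%H%M%S%d%m%Y"


def parse_sample(value: str) -> datetime:
    """Parse one raw string using the assumed field layout."""
    return datetime.strptime(value, LAYOUT)


# Render each parsed value in the format shown under "Expected output".
parsed = [parse_sample(s).strftime("%Y-%m-%d %H:%M:%S") for s in samples]

for raw, pretty in zip(samples, parsed):
    print(raw, "->", pretty)
```

If the printed values match the expected output in the question, the same field order should carry over to a Spark pattern with 24-hour "HH" and day before month.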

CodePudding user response:

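Alternatively, add the parsed value as a new column with to_timestamp. The pattern has to describe the input string (here "HHmmssddMMyyyy"), not the output format you want:

from pyspark.sql.functions import to_timestamp

# col1 holds strings such as "18300031121994" (hour, minute, second, day, month, year)
df = df.withColumn("timestamp", to_timestamp(df.col1, "HHmmssddMMyyyy"))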