Pyspark convert string to timestamp

Time:03-07

I need to convert a string column in the format '12/1/2010 8:26' into a timestamp. I tried the following code:

F.to_timestamp(dataset.InvoiceDate,'MM/dd/yyyy HH:mm')

but I get an error:

Py4JJavaError: An error occurred while calling o640.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 123.0 failed 1 times, most recent failure: Lost task 0.0 in stage 123.0 (TID 119) (13c59da6fb19 executor driver): org.apache.spark.SparkUpgradeException: You may get a different result due to the upgrading of Spark 3.0: Fail to parse '12/1/2010 8:26' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.

How can I convert the string to a timestamp in this case?

CodePudding user response:

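The pattern 'MM/dd/yyyy HH:mm' fails because dd and HH each expect two digits, while the input has a one-digit day ('1') and hour ('8'). Try single-letter pattern fields, which accept values with or without a leading zero: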

F.to_timestamp(dataset.InvoiceDate,'M/d/y H:m')
