ParseException when changing data type in Spark Dataframe


In my Databricks notebook, I am getting a ParseException in the last line of the code below, where I convert a string column to the Date data type. The hiring_date column in the CSV file is correctly formatted as a date.

Question: What may I be doing wrong here, and how can I fix the error?

Remark: I am using Python and NOT Scala. I do not know Scala.

from pyspark.sql.functions import *

df = spark.read.csv(".../Test/MyFile.csv", header="true", inferSchema="true")
df2 = df.withColumn("hiring_date", df["hiring_date"].cast('DateType'))  # this line raises the ParseException

CodePudding user response:

Since the error occurs in the last line of your code, then with reference to the PySpark cast documentation, it should be modified as follows:

from pyspark.sql.types import DateType

df2 = df.withColumn("hiring_date", df.hiring_date.cast(DateType()))
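
To confirm the cast took effect, you can inspect the resulting schema. A minimal check:

df2.printSchema()
# hiring_date should now be listed as: hiring_date: date (nullable = true)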

It seems you passed an invalid value to the cast function: 'DateType' is not a valid type string, so Spark's DDL type parser raises a ParseException.

The following code would work as well, since cast also accepts the type name as a case-insensitive string:

df2 = df.withColumn("hiring_date", df["hiring_date"].cast('Date'))
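
If the cast still yields nulls or errors because the CSV stores dates in a non-ISO format, to_date with an explicit pattern is another option. This is a minimal sketch, assuming a hypothetical dd/MM/yyyy format; adjust the pattern to match your file:

from pyspark.sql.functions import to_date

# The "dd/MM/yyyy" pattern below is only an example; replace it with the
# actual format used in the hiring_date column of your CSV file.
df2 = df.withColumn("hiring_date", to_date(df["hiring_date"], "dd/MM/yyyy"))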