In my Databricks notebook, I am getting a ParseException on the last line of the code below when converting a string column to the Date data type. The hiring_date column in the CSV file does contain values in a date format.
Question: What might I be doing wrong here, and how can I fix the error?
Remark: I am using Python and NOT Scala. I do not know Scala.
from pyspark.sql.functions import *
df = spark.read.csv(".../Test/MyFile.csv", header="true", inferSchema="true")
df2 = df.withColumn("hiring_date",df["hiring_date"].cast('DateType'))
CodePudding user response:
The error comes from the last line of your code. According to the PySpark documentation for cast, that line should be changed as follows:
from pyspark.sql.types import DateType
df2 = df.withColumn("hiring_date", df.hiring_date.cast(DateType()))
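The problem is the argument passed to cast: the string 'DateType' is not a type name that Spark can parse, which is why a ParseException is raised. cast accepts either a DataType instance such as DateType() or a type name string such as 'date'.
The string form works as well: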
df2 = df.withColumn("hiring_date", df["hiring_date"].cast('Date'))