I am trying to filter a DataFrame column against a date pattern so that only rows with a valid date format are passed on for further processing. The sample code is below:
val datePattern = "\\d{2}-\\d{2}-\\d{4} \\d{2}:\\d{2}:\\d{2}"
val df1 = df // df is spark.read.csv dataframe
.filter($"timewithDate".toString.matches(datePattern))
//othercode
)
But I am getting the error below: Cannot resolve overloaded method 'filter'
Can anyone please explain to me what I am doing wrong here and how to correctly resolve the error?
Answer:
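The filter method expects a Column (or a SQL expression string) as its argument, but you are passing a Boolean instead. $"timewithDate".toString converts the Column object itself to a plain Scala String on the driver, and calling .matches on that String returns a single Boolean, not a per-row condition. You can change a column's data type with the .cast method, but you cannot convert a Column to a String with toString and then apply String methods to its values.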
To fix your issue, you can use:
val df1 = df.filter(col("timewithDate").rlike(datePattern))
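For a fuller picture, here is a minimal self-contained sketch of the fix above, assuming Spark SQL is on the classpath. The sample rows, the object and method names, the app name, and the local master are all made up for illustration. The pattern is additionally anchored with ^ and $ so that values with extra leading or trailing text are rejected, which is an assumption about what "valid" should mean here.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object FilterByDatePattern {
  // Same pattern as in the question, anchored so the whole value must match.
  val datePattern = "^\\d{2}-\\d{2}-\\d{4} \\d{2}:\\d{2}:\\d{2}$"

  // Hypothetical helper: keeps only rows whose timewithDate matches the pattern.
  def keepValidDates(df: DataFrame): DataFrame =
    df.filter(col("timewithDate").rlike(datePattern))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("filter-by-date-pattern") // illustrative name
      .master("local[*]")                // local run, for the sketch only
      .getOrCreate()
    import spark.implicits._

    // Made-up rows standing in for the CSV data from the question.
    val df = Seq(
      "01-02-2023 10:15:30",
      "2023-02-01 10:15:30",
      "not a date"
    ).toDF("timewithDate")

    keepValidDates(df).show(truncate = false)

    spark.stop()
  }
}
```

Note that the regex only checks the shape of the value, so something like 99-99-2023 99:99:99 would still pass. If the values must be real dates, parsing them (for example with to_timestamp and dropping nulls) may be a better fit.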
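Column's rlike plays the same role as String's matches, with one difference: rlike succeeds when the pattern is found anywhere in the value, whereas matches requires the whole string to match. Anchor the pattern with ^ and $ if only whole-value matches should pass.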
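One caveat worth illustrating: as far as I know, rlike behaves like a regex search (a hit anywhere in the value counts), while String.matches requires the whole string to match. The plain-Scala sketch below needs no Spark and uses Regex.findFirstIn as a stand-in for that search behaviour; the object and method names are made up for illustration.

```scala
object MatchVsFind {
  val datePattern = "\\d{2}-\\d{2}-\\d{4} \\d{2}:\\d{2}:\\d{2}"

  // Whole-string match, which is what String.matches does.
  def fullMatch(s: String): Boolean = s.matches(datePattern)

  // Search-style match: true if the pattern occurs anywhere in the string.
  def partialMatch(s: String): Boolean =
    datePattern.r.findFirstIn(s).isDefined

  // Search-style match with an anchored pattern acts like a whole-string match.
  def anchoredMatch(s: String): Boolean =
    ("^" + datePattern + "$").r.findFirstIn(s).isDefined

  def main(args: Array[String]): Unit = {
    val padded = "x 01-02-2023 10:15:30 y"
    println(fullMatch(padded))     // false
    println(partialMatch(padded))  // true
    println(anchoredMatch(padded)) // false
  }
}
```

So if values with surrounding junk should be filtered out, passing an anchored pattern to rlike is the safer choice.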
You can find more about rlike here.