I have the following overloaded method, whose input can be either an Option[String] or an Option[Seq[String]]:
def parse_emails(email: => Option[String]): Seq[String] = {
  email match {
    case Some(e: String) if e.isEmpty => null
    case Some(e: String) => Seq(e)
    case _ => null
  }
}

def parse_emails(email: Option[Seq[String]]): Seq[String] = {
  email match {
    case Some(e: Seq[String]) if e.isEmpty => null
    case Some(e: Seq[String]) => e
    case _ => null
  }
}
I want to use these methods from Spark, so I tried to wrap them in a udf:
def parse_emails_udf = udf(parse_emails _)
But I am getting the following error:
error: ambiguous reference to overloaded definition,
both method parse_emails of type (email: Option[Seq[String]])Seq[String]
and method parse_emails of type (email: => Option[String])Seq[String]
match expected type ?
def parse_emails_udf = udf(parse_emails _)
Is it possible to define a single udf that could wrap both alternatives?
Or could I create two udfs with the same name, each pointing to one of the overloaded options? I tried the approach below, but it throws another error:
def parse_emails_udf = udf(parse_emails _ : Option[Seq[String]])
error: type mismatch;
found : (email: Option[Seq[String]])Seq[String] <and> (email: => Option[String])Seq[String]
required: Option[Seq[String]]
def parse_emails_udf = udf(parse_emails _ : Option[Seq[String]])
CodePudding user response:
Option[String] and Option[Seq[String]] have the same erasure, Option, so even if Spark supported udf overloading it wouldn't work.
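This is the same reason plain Scala forced the by-name (=>) trick in the question in the first place: after erasure, both signatures collapse into one. A minimal standalone sketch of the clash (not from the original post):

// These two definitions cannot coexist in one scope: after erasure both
// become parse_emails(email: Option): Seq, and the compiler rejects them
// with a "double definition ... have same type after erasure" error.
def parse_emails(email: Option[String]): Seq[String] = ???
def parse_emails(email: Option[Seq[String]]): Seq[String] = ???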
What you can do is create one function that accepts anything, then match on the argument and handle the different cases:
def parseEmails(arg: Option[AnyRef]): Seq[String] = arg match {
  case Some(x) =>
    x match {
      case str: String =>
        // same behavior as the original Option[String] overload
        if (str.isEmpty) null else Seq(str)
      case s: Seq[_] =>
        // Seq[String] can't be checked at runtime (erasure), so match on Seq[_]
        if (s.isEmpty) null else s.map(_.toString)
      case _ =>
        throw new IllegalArgumentException(s"unexpected argument: $x")
    }
  case None =>
    null
}
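Since Spark udfs are plain values, they can't share a name the way overloaded methods do; the practical workaround is to wrap the single function in two udfs with distinct names, one per column type. A sketch, assuming a SparkSession is in scope (the udf and column names here are hypothetical):

import org.apache.spark.sql.functions.{col, udf}

// One udf per input column type; both delegate to parseEmails.
// Option(...) lifts a possibly-null column value into Some/None.
val parseEmailUdf    = udf((s: String) => parseEmails(Option(s)))
val parseEmailSeqUdf = udf((s: Seq[String]) => parseEmails(Option(s)))

// Usage on a DataFrame df (column names are assumptions):
// df.withColumn("emails", parseEmailUdf(col("email")))
// df.withColumn("emails", parseEmailSeqUdf(col("email_list")))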