My classpath seems to be missing the Serializable and Cloneable classes, and I'm not sure how to fix this.
I have an sbt application whose build.sbt looks like this:
name := "realtime-spark-streaming"
version := "0.1"
resolvers += "confluent" at "https://packages.confluent.io/maven/"
resolvers += "Public Maven Repository" at "https://repository.com/content/repositories/pangaea_releases"
val sparkVersion = "3.2.0"
// https://mvnrepository.com/artifact/org.apache.spark/spark-core
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.2.0"
// https://mvnrepository.com/artifact/org.apache.spark/spark-streaming
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "3.2.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.2.0"
libraryDependencies += "com.walmart.grcaml" % "us-aml-commons" % "latest.release"
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion
//libraryDependencies += "org.apache.spark" % "spark-streaming-kafka-0-10_2.11" % "3.2.0" % "2.1.3"
//libraryDependencies += "org.slf4j" % "slf4j-simple" % "1.7.12"
// https://mvnrepository.com/artifact/org.apache.kafka/kafka
libraryDependencies += "org.apache.kafka" %% "kafka" % "6.1.0-ccs"
resolvers += Resolver.mavenLocal
scalaVersion := "2.13.6"
When I run an sbt build I get:
Symbol 'type scala.package.Serializable' is missing from the classpath.
This symbol is required by 'class org.apache.spark.sql.SparkSession'.
Make sure that type Serializable is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'SparkSession.class' was compiled against an incompatible version of scala.package.
import org.apache.spark.sql.{DataFrame, SparkSession}
Symbol 'type scala.package.Serializable' is missing from the classpath.
This symbol is required by 'class org.apache.spark.sql.Dataset'.
Make sure that type Serializable is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'Dataset.class' was compiled against an incompatible version of scala.package.
def extractData(spark: SparkSession, configDetails: ReadProperties, pcSql: String, query: String): DataFrame = {
My dependency tree only shows jars, but this looks like a class/package conflict or a missing dependency.
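For reference, this is how I have been inspecting the resolved dependencies. I'm assuming sbt 1.4+ here, where the dependencyTree task is available out of the box (older sbt versions need the sbt-dependency-graph plugin for it):

# run from the project root
sbt "Compile / dependencyTree"       # print the resolved dependency tree
sbt evicted                          # list artifacts whose versions were evicted by conflicts
sbt "show Compile / fullClasspath"   # dump the full classpath; a mix of _2.12 and _2.13 suffixes here would point at a Scala version clash

The compiler's hint can also be followed by adding scalacOptions += "-Ylog-classpath" to build.sbt, although that output is very verbose.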
Answer:
You're using an incompatible Scala version (2.13.6). From the Spark documentation:
Spark runs on Java 8/11, Scala 2.12, Python 3.6+ and R 3.5+. Python 3.6
support is deprecated as of Spark 3.2.0. Java 8 prior to version 8u201 support
is deprecated as of Spark 3.2.0. For the Scala API, Spark 3.2.0 uses Scala 2.12.
You will need to use a compatible Scala version (2.12.x).
If you switch to a Scala version from the 2.12.x family (and do a clean rebuild, as the error message suggests), the missing-symbol errors should go away.
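As a rough sketch, and assuming the internal us-aml-commons and Confluent artifacts are also published for Scala 2.12 (worth checking in your repositories), the adjusted build.sbt could look like this:

name := "realtime-spark-streaming"
version := "0.1"

// Spark 3.2.0's Scala API targets Scala 2.12, so align the build with a 2.12.x release
scalaVersion := "2.12.15"

val sparkVersion = "3.2.0"

resolvers += "confluent" at "https://packages.confluent.io/maven/"
resolvers += "Public Maven Repository" at "https://repository.com/content/repositories/pangaea_releases"
resolvers += Resolver.mavenLocal

libraryDependencies ++= Seq(
  "org.apache.spark"   %% "spark-core"                 % sparkVersion,
  "org.apache.spark"   %% "spark-streaming"            % sparkVersion,
  "org.apache.spark"   %% "spark-sql"                  % sparkVersion,
  "org.apache.spark"   %% "spark-streaming-kafka-0-10" % sparkVersion,
  "com.walmart.grcaml"  % "us-aml-commons"             % "latest.release",
  "org.apache.kafka"   %% "kafka"                      % "6.1.0-ccs"
)

After changing scalaVersion, run sbt clean compile so nothing compiled against 2.13 is left in target/; stale class files can keep reproducing the same missing-symbol errors.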