I have been struggling with this NoSuchMethodError
in a Spark project for a while now without getting anywhere. Currently, the project runs locally using Spark NLP 3.3.0 and spark-core/spark-mllib 3.1.2, all with Scala 2.12.4. Hadoop 3.2.0 is pulled in as a transitive dependency via spark-core.
What I have tried so far:
- check that the method is indeed present by stepping through the code (a programmatic version of this check is sketched right after this list)
- verify a uniform Scala version across all dependencies
- verify that the Spark and Hadoop versions are consistent throughout (using the Maven dependency tree and the Enforcer plugin)
- manually remove other Hadoop versions from the local .m2 repository
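To make the first point concrete, here is a minimal reflection sketch of that check (not the debugger stepping I actually did, and the class name is made up for illustration):

```java
import org.apache.hadoop.conf.Configuration;

// Hypothetical helper, just to illustrate the check: Spark's SSLOptions calls
// Configuration.getPassword(String), which old hadoop-core 1.x jars do not have.
public class CheckGetPassword {
    public static void main(String[] args) {
        try {
            System.out.println("Found: " + Configuration.class.getMethod("getPassword", String.class));
        } catch (NoSuchMethodException e) {
            System.out.println("getPassword(String) is missing from the Configuration class on the classpath");
        }
    }
}
```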
The code runs from an executable JAR that pulls additional jars, provided at runtime, onto the classpath. Java version is 1.8.0_282, Maven is 3.6.3, and the OS is macOS Big Sur 11.6 (M1). This is the stack trace:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.getPassword(Ljava/lang/String;)[C
at org.apache.spark.SSLOptions$.$anonfun$parse$8(SSLOptions.scala:188)
at scala.Option.orElse(Option.scala:289)
at org.apache.spark.SSLOptions$.parse(SSLOptions.scala:188)
at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:98)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:252)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:189)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:277)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:458)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2672)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:945)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:939)
[...]
at com.mymainpackage.Main.main(Main.java:157)
Answer:
I have finally been able to figure this one out.
The root cause was that an older version of hadoop-core was being pulled in (1.2.1 instead of 2.6.5), which in fact does not have the Configuration.getPassword() method. I found this out by setting up a test project in which the SparkContext initialized correctly, and then comparing the source jars of the Configuration class in the two projects (using Configuration.class.getProtectionDomain().getCodeSource().getLocation().getPath()).
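In other words, a one-liner along these lines (wrapped in a throwaway main method here, with a made-up class name) prints the jar that Configuration is actually loaded from, so comparing the output in the two projects exposes the mismatch:

```java
import org.apache.hadoop.conf.Configuration;

// Throwaway diagnostic: prints the path of the jar that provides the Configuration
// class at runtime, e.g. .../org/apache/hadoop/hadoop-core/1.2.1/hadoop-core-1.2.1.jar
public class WhereIsConfiguration {
    public static void main(String[] args) {
        System.out.println(Configuration.class
                .getProtectionDomain()
                .getCodeSource()
                .getLocation()
                .getPath());
    }
}
```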
After forcing version 2.6.5 using Maven's dependency management and manually deleting the older 1.2.1 jar from the local Maven repository, it works fine.
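The pin in the pom.xml was roughly along these lines (a sketch, not a drop-in fix; adjust the coordinates to whatever artifact actually provides org.apache.hadoop.conf.Configuration in your dependency tree):

```xml
<dependencyManagement>
  <dependencies>
    <!-- Force a single Hadoop version for the artifact that ships Configuration;
         coordinates shown as in my case, adjust to your own build. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>2.6.5</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```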
The only thing I still don't understand is why hadoop-core was not showing up in the Maven dependency tree. Otherwise, I would probably have found this sooner.