Unrecognized Hadoop major version number


I am trying to initialize an Apache Spark instance on Windows 10 to run a local test. My problem is that during initialization of the Spark session I get an error. This code has worked for me many times before, so I suspect something has changed in the dependencies or the configuration. I am running JDK 1.8.0_192, Hadoop should be 3.0.0, and Spark is 2.4.0. I am also using Maven as the build tool, in case that is relevant.

Here is the way I am setting up the session:

  import java.util.UUID
  import java.nio.file.{Paths => nioPaths} // alias so nioPaths.get resolves below

  import org.apache.spark.SparkConf
  import org.apache.spark.sql.SparkSession

  def withSparkSession(testMethod: SparkSession => Any): Unit = {

    // Unique root directory per run so repeated test runs don't clash
    val uuid = UUID.randomUUID().toString

    val pathRoot = s"C:/data/temp/spark-testcase/$uuid" // TODO: make this independent from Windows
    val derbyRoot = s"C:/data/temp/spark-testcase/derby_system_root"

    // TODO: clean me up -- the Derby-based metastore should be cleared as well
    System.setProperty("derby.system.home", derbyRoot)

    val conf = new SparkConf()
      .set("testcase.root.dir", pathRoot)
      .set("spark.sql.warehouse.dir", s"$pathRoot/test-hive-dwh")
      .set("spark.sql.catalogImplementation", "hive")
      .set("hive.exec.scratchdir", s"$pathRoot/hive-scratchdir")
      .set("hive.exec.dynamic.partition.mode", "nonstrict")
      .setMaster("local[*]")
      .setAppName("Spark Hive Test case")

    val spark = SparkSession.builder()
      .config(conf)
      .enableHiveSupport()
      .getOrCreate()

    try {
      testMethod(spark)
    }
    finally {
      spark.sparkContext.stop()
      println(s"Deleting test case root directory: $pathRoot")
      deleteRecursively(nioPaths.get(pathRoot)) // helper defined elsewhere in our test utils
    }
  }
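
For context, a test invokes this fixture roughly as follows. This is a hypothetical spec (the suite name, the SparkTestUtils trait, and the assertions are made up), but it matches the FunSpec style visible in the stack trace below:

    import org.scalatest.FunSpec

    // Assumes withSparkSession is provided by a shared test-utilities trait.
    class SparkHiveSpec extends FunSpec with SparkTestUtils {
      describe("the Hive-enabled session fixture") {
        it("creates and reads back a Hive table") {
          withSparkSession { spark =>
            spark.sql("CREATE TABLE t (id INT) STORED AS PARQUET")
            spark.sql("INSERT INTO t VALUES (1)")
            assert(spark.table("t").count() == 1)
          }
        }
      }
    }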

And this is the error message I receive:

An exception or error caused a run to abort. 
java.lang.ExceptionInInitializerError
    at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
    at org.apache.spark.sql.SparkSession$.hiveClassesArePresent(SparkSession.scala:1117)
    at org.apache.spark.sql.SparkSession$Builder.enableHiveSupport(SparkSession.scala:866)
.
.
.
    at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
    at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
    at org.scalatest.Transformer.apply(Transformer.scala:22)
    at org.scalatest.Transformer.apply(Transformer.scala:20)
    at org.scalatest.FunSpecLike$$anon$1.apply(FunSpecLike.scala:454)
    at org.scalatest.TestSuite$class.withFixture(TestSuite.scala:196)
    at org.scalamock.scalatest.AbstractMockFactory$$anonfun$withFixture$1.apply(AbstractMockFactory.scala:35)
    at org.scalamock.scalatest.AbstractMockFactory$$anonfun$withFixture$1.apply(AbstractMockFactory.scala:34)
    at org.scalamock.MockFactoryBase$class.withExpectations(MockFactoryBase.scala:41)
    at org.scalamock.scalatest.AbstractMockFactory$class.withFixture(AbstractMockFactory.scala:34)
    at org.scalatest.FunSpecLike$class.invokeWithFixture$1(FunSpecLike.scala:451)
    at org.scalatest.FunSpecLike$$anonfun$runTest$1.apply(FunSpecLike.scala:464)
    at org.scalatest.FunSpecLike$$anonfun$runTest$1.apply(FunSpecLike.scala:464)
    at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
    at org.scalatest.FunSpecLike$class.runTest(FunSpecLike.scala:464)
    at org.scalatest.FunSpec.runTest(FunSpec.scala:1630)
    at org.scalatest.FunSpecLike$$anonfun$runTests$1.apply(FunSpecLike.scala:497)
    at org.scalatest.FunSpecLike$$anonfun$runTests$1.apply(FunSpecLike.scala:497)
    at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
    at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
    at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:373)
    at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:410)
    at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
    at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
    at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
    at org.scalatest.FunSpecLike$class.runTests(FunSpecLike.scala:497)
    at org.scalatest.FunSpec.runTests(FunSpec.scala:1630)
    at org.scalatest.Suite$class.run(Suite.scala:1147)
    at org.scalatest.FunSpec.org$scalatest$FunSpecLike$$super$run(FunSpec.scala:1630)
    at org.scalatest.FunSpecLike$$anonfun$run$1.apply(FunSpecLike.scala:501)
    at org.scalatest.FunSpecLike$$anonfun$run$1.apply(FunSpecLike.scala:501)
    at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
    at org.scalatest.FunSpecLike$class.run(FunSpecLike.scala:501)
    at org.scalatest.FunSpec.run(FunSpec.scala:1630)
    at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
    at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1346)
    at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1340)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1340)
    at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1011)
    at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1010)
    at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1506)
    at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
    at org.scalatest.tools.Runner$.run(Runner.scala:850)
    at org.scalatest.tools.Runner.run(Runner.scala)
    at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2or3(ScalaTestRunner.java:43)
    at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:26)
Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.0.0-cdh6.3.4
    at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
    at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
    at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
    at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:368)
    ... 64 more


Process finished with exit code 2

So far I have tried switching JDK versions (jdk1.8.0_181 and JDK 11 build 28, x64). I also tried deleting the HADOOP_HOME environment variable from the system, but that didn't help. (Currently it is set to C:\Data\devtools\hadoop-win\3.0.0.)

CodePudding user response:

If you're on Windows, you shouldn't be pulling in CDH dependencies at all: Cloudera doesn't support Windows, last I checked.
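
A quick way to confirm which Hadoop build actually wins on your test classpath is to print it from the Hadoop jar itself (a minimal sketch; run it from any test or main method):

    import org.apache.hadoop.util.VersionInfo

    object HadoopVersionCheck extends App {
      // Prints the Hadoop version resolved on the classpath; a "-cdh" suffix,
      // as in the stack trace above, means a Cloudera artifact was pulled in.
      println(s"Hadoop on classpath: ${VersionInfo.getVersion}")
    }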

But if you have Hadoop 3, you should be using Spark 3. And keep HADOOP_HOME set, as that is definitely necessary on Windows.

Also, Java 11 runtime support only arrived with the Hadoop 3.3 line, so Java 8 is what you should stick with.

CodePudding user response:

I have solved the problem. During development we also added HBase to the build, which pulled in a different, Cloudera-built Hadoop version, so the versions got mixed up. Removing the HBase dependency from the pom.xml solved the problem.
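
If someone needs to keep HBase in the build, an alternative I didn't end up needing is to exclude HBase's transitive Hadoop artifacts so that only one Hadoop version remains on the classpath. Running mvn dependency:tree -Dincludes=org.apache.hadoop shows where the conflicting version comes from. A sketch of the pom.xml change (the hbase-client coordinates are the usual ones, the version is illustrative, and the wildcard exclusion needs Maven 3.2.1+):

    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-client</artifactId>
      <version>2.1.0</version>
      <exclusions>
        <!-- Keep the Cloudera-built Hadoop jars that HBase drags in off the
             classpath, so the plain Apache Hadoop already in the build wins. -->
        <exclusion>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>*</artifactId>
        </exclusion>
      </exclusions>
    </dependency>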
