JavaSparkListener not found when running hive with spark engine

Time: 10-17

Hive version: 2.0.0
Spark version: 2.3.0
YARN as the resource manager.

This combination is not compatible out of the box, so I had to set the configs below to make Spark SQL work against the Hive 2.0.0 metastore:

spark.sql.hive.metastore.version 2.0.0
spark.sql.hive.metastore.jars /usr/local/apache-hive-2.0.0-bin/lib/*

I am able to run Hive queries on the Spark cluster successfully using spark-sql. However, when I run a query from the Hive CLI, I hit the error below (as seen in the Hive logs):

2021-10-17T03:06:53,727 INFO  [1ff8e619-80bb-46ea-9fd0-824d57ea3799 1ff8e619-80bb-46ea-9fd0-824d57ea3799 main]: client.SparkClientImpl (SparkClientImpl.java:startDriver(428)) - Running client driver with argv: /usr/local/spark/bin
/spark-submit --properties-file /tmp/spark-submit.255205804744246105.properties --class org.apache.hive.spark.client.RemoteDriver /usr/local/apache-hive-2.0.0-bin/lib/hive-exec-2.0.0.jar --remote-host <masked_hostname> --remote-port 34537 --conf hive.spark.client.connect.timeout=1000 --conf hive.spark.client.server.connect.timeout=90000 --conf hive.spark.client.channel.log.level=null --conf hive.spark.client.rpc.max.size=52428800 --conf hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256
2021-10-17T03:06:54,488 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Warning: Ignoring non-spark config property: hive.spark.client.server.connect.timeout=90000
2021-10-17T03:06:54,489 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Warning: Ignoring non-spark config property: hive.spark.client.rpc.threads=8
2021-10-17T03:06:54,489 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Warning: Ignoring non-spark config property: hive.spark.client.connect.timeout=1000
2021-10-17T03:06:54,489 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Warning: Ignoring non-spark config property: hive.spark.client.secret.bits=256
2021-10-17T03:06:54,489 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Warning: Ignoring non-spark config property: hive.spark.client.rpc.max.size=52428800
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - java.lang.NoClassDefFoundError: org/apache/spark/JavaSparkListener
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.defineClass1(Native Method)
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.security.AccessController.doPrivileged(Native Method)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.Class.forName0(Native Method)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.Class.forName(Class.java:348)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.util.Utils$.classForName(Utils.scala:235)
2021-10-17T03:06:55,002 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:836)
2021-10-17T03:06:55,002 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
2021-10-17T03:06:55,002 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
2021-10-17T03:06:55,002 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
2021-10-17T03:06:55,002 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2021-10-17T03:06:55,003 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Caused by: java.lang.ClassNotFoundException: org.apache.spark.JavaSparkListener
2021-10-17T03:06:55,003 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
2021-10-17T03:06:55,003 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
2021-10-17T03:06:55,003 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)

I have also added the Spark libraries to the Hive classpath, following the "Spark as execution engine with Hive" guide.
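Concretely, the guide boils down to pointing Hive at Spark and putting the Spark jars on Hive's classpath. The relevant hive-site.xml entries on my setup look like this (the paths are from my environment, adjust to yours):

```xml
<!-- hive-site.xml: run Hive queries on Spark via YARN -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>spark.master</name>
  <value>yarn</value>
</property>
<property>
  <name>spark.home</name>
  <value>/usr/local/spark</value>
</property>
```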

Any suggestions on how to fix this error?

CodePudding user response:

I would recommend generating a dependency graph with Gradle or Maven (e.g. `mvn dependency:tree` or `gradle dependencies`) and excluding any conflicting dependencies you find. It looks like either a dependency was not added properly, or two versions of the same jar are overlapping at execution time.
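For reference, this is roughly what excluding a conflicting transitive dependency looks like in Maven. The artifact names below are placeholders, not a known fix for this setup; substitute whatever `mvn dependency:tree` shows is overlapping:

```xml
<!-- pom.xml sketch: keep the explicitly declared Spark version and drop
     the stale copy a transitive dependency drags in
     (groupId/artifactId below are hypothetical examples) -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>2.0.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```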

CodePudding user response:

2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - java.lang.NoClassDefFoundError: org/apache/spark/JavaSparkListener

This is a version-compatibility error rather than a code bug: org.apache.spark.JavaSparkListener existed in Spark 1.x but was removed in Spark 2.0.0, while Hive 2.0.0's hive-exec still references it. Hive only caught up with the Spark 2.x API in HIVE-14029 (https://issues.apache.org/jira/browse/HIVE-14029), so either upgrade Hive to a release that includes that fix (2.2.0 or later) or run against a Spark 1.6.x build. Also make sure you are on Java 1.8 only, and check your dependencies as well.
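One quick way to confirm the diagnosis is to ask the JVM directly whether the class Hive's RemoteDriver needs is visible on a given classpath. A minimal probe, assuming the Spark jars path from the logs above (run it with `java -cp "/usr/local/spark/jars/*:." Probe`):

```java
// Probe.java: report whether org.apache.spark.JavaSparkListener is
// loadable from the current classpath. Spark 1.x jars contain this
// class; Spark 2.x jars do not.
public class Probe {
    static String check() {
        try {
            Class.forName("org.apache.spark.JavaSparkListener");
            return "found";
        } catch (ClassNotFoundException e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        System.out.println("JavaSparkListener: " + check());
    }
}
```

Run against a Spark 2.3.0 jars directory it reports the class as missing, which matches the NoClassDefFoundError in the logs; against Spark 1.6.x jars it reports it as found.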
