Home > OS >  Zeppelin Spark Cassandra: Spark dont work
Zeppelin Spark Cassandra: Spark dont work

Time:12-12

Watched one nice youtube video about Zeppelin Spark Cassandra. Trying to repeat. OS Win10.

  1. Runned Zeppelin like a docker image ;

  2. Setuped options for Cassandra Interpreters, it works

  3. Now trying to setup Spark, and i cant. Installed spark-3.0.1-bin-hadoop2.7 (folder named spark-3.0.1-bin-hadoop2.7, it is ok), spark-shell from cmd works. What i have to do with spark-cassandra-connector and what options i have to setup for spark Interpreters? Thanks.

org.apache.zeppelin.interpreter.InterpreterException: java.io.IOException: Fail to detect scala version, the reason is:Cannot run program "C:/bin/spark-3.3.1-bin-hadoop3/bin/spark-submit": error=2, No such file or directory at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:129) at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:271) at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:438) at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:69) at org.apache.zeppelin.scheduler.Job.run(Job.java:172) at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:132) at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:182) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.IOException: Fail to detect scala version, the reason is:Cannot run program "C:/bin/spark-3.3.1-bin-hadoop3/bin/spark-submit": error=2, No such file or directory at org.apache.zeppelin.interpreter.launcher.SparkInterpreterLauncher.buildEnvFromProperties(SparkInterpreterLauncher.java:127) at org.apache.zeppelin.interpreter.launcher.StandardInterpreterLauncher.launchDirectly(StandardInterpreterLauncher.java:77) at org.apache.zeppelin.interpreter.launcher.InterpreterLauncher.launch(InterpreterLauncher.java:110) at org.apache.zeppelin.interpreter.InterpreterSetting.createInterpreterProcess(InterpreterSetting.java:856) at org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpreterProcess(ManagedInterpreterGroup.java:66) at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpreterProcess(RemoteInterpreter.java:104) at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:154) at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:126) ... 13 more

CodePudding user response:

Ok guys, here we go:

  1. Install Spark on Win10, there are many tutorials in internet. My version 3.0.1
  2. Download docker image with Zeppelin
  3. In image settings setuped path folder with Spark and port 8080, lounch it http://localhost:8080/
  4. Spark interpreter settings: set SPARK_HOME like in prev point 3, spark.jars.packages = com.datastax.spark:spark-cassandra-connector_2.12:3.0.1. Add settings for Cassandra: spark.cassandra.connection.host, spark.cassandra.auth.username, spark.cassandra.auth.password.
  5. Welcome
  • Related