How to use scala.tools.nsc.interpreter.IMain in Spark

Time:09-26

Hi everyone.

I want to drive a Spark processing flow from a project configuration file. Ideally, Spark would call the RDD functions in the order they are defined in the configuration file: the driver would chain these functions together into a single string variable, then have an interpreter execute that string, similar to how Perl's eval invokes a code snippet. But at run time the program throws an error. Could someone please take a look? Thank you very much!

For example, the user defines the following steps (word count):

1) textFile("file:///<file_full_path>")
2) flatMap(line => line.split(" "))
3) map(word => (word, 1))
4) reduceByKey(_ + _)
5) foreach(println)
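The assembly of the configured steps into one executable expression can be sketched as follows (a minimal, Spark-free illustration; the `steps` sequence stands in for whatever the configuration parser actually produces):

```scala
// Hypothetical sketch: join configured RDD steps into one expression string.
object BuildFlow {
  def main(args: Array[String]): Unit = {
    // Steps as they might be read, in order, from the configuration file
    val steps = Seq(
      """textFile("file:///input.txt")""",
      """flatMap(line => line.split(" "))""",
      """map(word => (word, 1))""",
      """reduceByKey(_ + _)""",
      """foreach(println)"""
    )
    // Chain the calls onto the bound SparkContext reference "sc"
    val processFlow = "sc." + steps.mkString(".")
    println(processFlow)
  }
}
```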

These steps are defined in the configuration file; the Spark driver then joins them into the following string, stored in a variable processFlow:

val processFlow =
  """
  sc.textFile("file:///input.txt").flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _).foreach(println)
  """

Spark should then execute the code held in this variable.

Here is my source code:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import scala.collection.mutable.{Map, ArraySeq}
import scala.tools.nsc.GenericRunnerSettings
import scala.tools.nsc.interpreter.IMain

class TestMain {
  def exec(): Unit = {
    val out = System.out
    val flusher = new java.io.PrintWriter(out)
    val interpreter = {
      val settings = new GenericRunnerSettings(println _)
      settings.usejavacp.value = true
      new IMain(settings, flusher)
    }
    val conf = new SparkConf().setAppName("TestMain")
    val sc = new SparkContext(conf)
    val methodChain =
      """
      val textFile = sc.textFile("file:///input.txt")
      textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _).foreach(println)
      """
    interpreter.bind("sc", sc)
    val resultFlag = interpreter.interpret(methodChain)
  }
}

object TestMain {
  def main(args: Array[String]): Unit = {
    val testMain = new TestMain()
    testMain.exec()
    System.exit(0)
  }
}

The program fails at run time. Below is the error log:

sc: org.apache.spark.SparkContext = org.apache.spark.SparkContext@7d87addd
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ClassNotFoundException: $anonfun$1
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:270)
	at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
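For context on the ClassNotFoundException: the `$anonfun$1` class is the lambda that IMain compiled from the interpreted string, and by default it exists only in the interpreter's in-memory output directory on the driver, so executors cannot load it when deserializing the task. One possible direction, sketched here under the assumptions that the job runs in local mode (or on a shared filesystem) and that the settings object accepts a filesystem output path, is to make the interpreter write its generated classes to a real directory and put that directory on the executor classpath:

```scala
import org.apache.spark.SparkConf
import scala.tools.nsc.GenericRunnerSettings

// Hypothetical sketch, not a verified fix: persist interpreter-generated
// classes to disk so executors can resolve $anonfun classes.
val settings = new GenericRunnerSettings(println _)
settings.usejavacp.value = true
val replClassDir = java.nio.file.Files.createTempDirectory("repl-classes").toString
settings.outputDirs.setSingleOutput(replClassDir) // compiled snippets land here

val conf = new SparkConf()
  .setAppName("TestMain")
  .set("spark.executor.extraClassPath", replClassDir) // executors load from the same dir
```

This mirrors what spark-shell itself does: the Spark REPL ships its compiled snippet classes to executors rather than leaving them in driver memory.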