Help: error reading MongoDB data with Spark on YARN over a remote connection

Time:09-16

I am developing in IDEA and connecting remotely from IDEA to a Spark on YARN cluster, using MongoDB as the input data source. The program is as follows:
SparkSession spark = SparkSession.builder()
    .master("yarn")
    .appName("MongoSparkConnectorIntro")
    .config("spark.mongodb.input.uri", "mongodb://******@192.168.***.***:27017/database.collection?authSource=admin")
    .config("spark.mongodb.input.partitioner", "MongoSplitVectorPartitioner")
    .config("spark.mongodb.input.partitionerOptions.partitionSizeMB", 128)
    .getOrCreate();

JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());
JavaRDD<Document> mongodata = MongoSpark.load(jsc);
List<Document> result = mongodata.collect();
for (Document r : result) {
    System.out.println(r.toString());
}
jsc.close();
But the program always fails with this error:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, *****.localdomain, executor 2): java.lang.ClassNotFoundException: com.mongodb.spark.rdd.partitioner.MongoPartition
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1868)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:370)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1887)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1875)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1874)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1874)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2108)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2057)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2046)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:945)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:944)
    at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:361)
    at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
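
A note on the trace above: the ClassNotFoundException for com.mongodb.spark.rdd.partitioner.MongoPartition is raised on an executor while it deserializes the task, which suggests the connector classes are visible to the driver but not to the executor JVMs. One way to narrow this down is to check whether the class can be loaded in a given JVM. The following is only a minimal sketch using the standard library; the names ClasspathCheck and isLoadable are hypothetical and do not come from Spark or the MongoDB connector.

```java
// Hypothetical helper; the class and method names are illustrative only.
class ClasspathCheck {

    // Returns true if the named class can be loaded through the current
    // thread's context class loader (static initializers are not run).
    // Returns false if the class is missing or cannot be loaded.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false,
                    Thread.currentThread().getContextClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the stack trace above.
        String name = "com.mongodb.spark.rdd.partitioner.MongoPartition";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

Run on the driver, this only checks the driver's classpath. The failing lookup in the trace happens on an executor, so the same call would need to run inside a task to be conclusive. If the class is missing there, Spark's spark.jars property (or --jars with spark-submit) is the standard way to ship extra jars to the executors; the jar location depends on the individual setup.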