Job execution cannot be distributed

Time: 10-01

I built a Spark cluster in standalone mode with five nodes in total. To run the KMeans example, I executed the following command:

    bin/run-example org.apache.spark.examples.SparkKMeans spark://master:7077 in.txt 2 0.0001

The input file in.txt is stored on HDFS as four blocks, distributed across the nodes.

When I open the web UI, I see that all the RDD partitions are on the same node, and all the executors are on that same node too, so the job is not actually executed in a distributed way. Could anyone explain what is going on? Shouldn't an RDD partition be created on each node that holds a block, with an executor started on each of those nodes to run the computation?
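
For reference, here is a minimal Scala sketch (not part of the original post) that prints which host actually processes each partition of the input, which can confirm whether the work really lands on a single node. The HDFS path hdfs:///in.txt, the app name, and the object name PartitionHosts are assumptions; submit it against the same spark://master:7077 master.

    import java.net.InetAddress
    import org.apache.spark.{SparkConf, SparkContext}

    // Diagnostic sketch: report which host reads each partition of the input.
    object PartitionHosts {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("PartitionHosts"))
        // Assumed location of the input; replace with the real path to in.txt on HDFS.
        val data = sc.textFile("hdfs:///in.txt")
        // Tag each partition with the hostname of the executor that processes it.
        val hosts = data.mapPartitionsWithIndex { (idx, it) =>
          Iterator((idx, InetAddress.getLocalHost.getHostName, it.size))
        }.collect()
        hosts.foreach { case (idx, host, n) =>
          println(s"partition $idx -> $host ($n lines)")
        }
        sc.stop()
      }
    }

If every printed partition maps to the same hostname, the scheduler is indeed placing all executors on one node rather than spreading them across the cluster.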