System.setProperty("user.name", "webuser");
JavaSparkContext ct = new JavaSparkContext("spark://192.168.90.74:7077", "test-1",
        "/home/webuser/spark/spark-1.5.2-bin-hadoop2.4",
        "C://newWorkSpace/java.spark.test/target/java.spark.test-0.0.1-SNAPSHOT.jar");
List<Integer> list = new ArrayList<>();
list.add(1);
list.add(6);
list.add(9);
JavaRDD<Integer> rdd = ct.parallelize(list);
System.out.println(rdd.collect());
rdd.saveAsTextFile("/home/webuser/temp");
ct.close();
1. The jar is added at runtime via the JavaSparkContext constructor. Does the jar have to be uploaded to the master in advance, so that the path I pass is the jar's path on the master?
2. When I set the jar path to its path on the master, the program runs until it fails with an error that drive C cannot be found. After I fixed that, the program completes and collect() returns the correct result, but saveAsTextFile does not behave as expected: it creates the /home/webuser/temp folder under the C drive of my development machine, not on the server running Spark. What is the principle here? My understanding is that RDD actions run on the worker nodes, while the machine where I launch the program is the driver. Why does it create the file on the driver rather than on the workers?
CodePudding user response:
I can't tell whether these paths are Linux paths or Windows paths.CodePudding user response:
Tech support
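The Linux-vs-Windows confusion comes down to URI schemes. A bare path like /home/webuser/temp has no scheme, so whichever JVM performs the write resolves it against its own local/default filesystem; a URI with an explicit scheme (e.g. hdfs://) names one shared filesystem, so the output lands in the same place no matter which machine runs the task. A minimal sketch using plain java.net.URI (the HDFS host and port below are hypothetical placeholders, not values from the post):

```java
import java.net.URI;

public class PathSchemeDemo {
    public static void main(String[] args) {
        // A bare path carries no scheme: each machine that executes the
        // write resolves it against its own default filesystem.
        URI bare = URI.create("/home/webuser/temp");
        System.out.println(bare.getScheme()); // prints "null"

        // An explicit scheme names one filesystem unambiguously, so every
        // task writes to the same place regardless of where it runs.
        URI shared = URI.create("hdfs://192.168.90.74:9000/home/webuser/temp");
        System.out.println(shared.getScheme()); // prints "hdfs"
    }
}
```

So for question 2, saving to a shared-filesystem URI such as hdfs://... should keep the output on the cluster instead of on whichever machine's local disk happens to receive the write.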