PySpark: ModuleNotFoundError: No module named 'pyspark_test' when the script is called from another file


The script runs normally when started on its own, but when this file is called from another file the job fails with the error below. Hoping someone can explain what is going on...

Error message:
2019-07-02 12:23:29 WARN TaskSetManager:66 - Lost task 4.0 in stage 0.0 (TID 5, 192.168.1.194, executor 0): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/worker.py", line 217, in main
    func, profiler, deserializer, serializer = read_command(pickleSer, infile)
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/worker.py", line 59, in read_command
    command = serializer._read_with_length(file)
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 170, in _read_with_length
    return self.loads(obj)
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 559, in loads
    return pickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'pyspark_test'

    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:298)
    at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:438)
    at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:421)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:252)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
    at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
    at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28)
    at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
    at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28)
    at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
    at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
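
For reference, a minimal sketch that reproduces this failure mode; the module name pyspark_test comes from the traceback, while the file layout and the double() helper are assumptions for illustration:

# pyspark_test.py -- user-defined helper module sitting next to the driver script
def double(x):
    return 2 * x

# main.py -- runs fine with a local master, but on a cluster the executors have
# no copy of pyspark_test.py, so unpickling the mapped function fails on the
# workers with "ModuleNotFoundError: No module named 'pyspark_test'".
from pyspark import SparkContext
import pyspark_test

sc = SparkContext(appName="repro")
print(sc.parallelize(range(4)).map(pyspark_test.double).collect())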


CodePudding user response:

Because Spark executes the job in a distributed fashion, your self-built package has to be shipped to the cluster. Pack it into an archive with zip -r ../self_building.zip ., then register that archive with Spark so that every node receives the file:
from pyspark import SparkContext

sc = SparkContext(master="yarn-cluster", appName="myApp")
sc.addPyFile(file_path)  # file_path points at the zip built above
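
A usage sketch under the same assumptions (a local package directory pyspark_test/ with an __init__.py; the double() helper is hypothetical): the archive can also be built from Python with the standard library instead of the shell, then shipped and imported as usual:

import shutil

from pyspark import SparkContext

# Equivalent of `zip -r ../self_building.zip pyspark_test`; returns the archive path.
archive = shutil.make_archive("../self_building", "zip", root_dir=".", base_dir="pyspark_test")

sc = SparkContext(master="yarn-cluster", appName="myApp")
sc.addPyFile(archive)  # each executor downloads the zip and adds it to its sys.path

import pyspark_test  # import after addPyFile so the workers can resolve it too

print(sc.parallelize(range(4)).map(pyspark_test.double).collect())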