PySpark import error in PyCharm, modules not installed, "no module named" error

Time:05-29

The Problem

When I try to import pyspark in a Python script in PyCharm, I get the error below (cannot import x from y). I checked the directory, and the module that should be imported is not present.

That's all there is in pyspark\cloudpickle\ (screenshot not preserved)

Why is it not installed? What could the possible causes be?
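One way to see what is actually on disk, without triggering the failing import, is to locate the package and list the directory from Python. This is only a minimal sketch: the helper name `describe_install` is made up for illustration, and it assumes it is run with the same interpreter that PyCharm uses for the project. It reports the install location, the version recorded in the package metadata (if any), and the files in a subdirectory such as `cloudpickle`.

```python
import importlib.util
from importlib import metadata
from pathlib import Path


def describe_install(package, subdir=None):
    """Report where `package` lives on disk without importing it.

    Hypothetical helper for illustration; not part of pyspark or findspark.
    """
    # find_spec on a top-level name locates the package but does not execute it.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or not a package directory

    root = Path(list(spec.submodule_search_locations)[0])

    # Version as recorded by pip/conda metadata; may be missing for hand-copied files.
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None

    target = root / subdir if subdir else root
    files = sorted(p.name for p in target.iterdir()) if target.is_dir() else []
    return {"root": str(root), "version": version, "files": files}


if __name__ == "__main__":
    # For the case in this question: inspect pyspark\cloudpickle\
    print(describe_install("pyspark", "cloudpickle"))
```

If the reported version and the files on disk do not look like they belong to the same release, the environment may contain leftovers from more than one installation.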

What I tried

Compatibility issues? I found this, which looks similar, but my error says "cannot import name".

I also found this about cloudpickle specifically. I tried with cloudpickle=1.1.1, but it didn't work for me.

I also made a new env, re-installed pyspark and rebooted, but it didn't help.

import findspark
findspark.init()

Works without error.

Obviously I'm new to Spark/PySpark and might miss the obvious...

Error

import pyspark

Traceback (most recent call last):
  File "C:\Users\me\anaconda3\envs\myenv\lib\site-packages\IPython\core\interactiveshell.py", line 3397, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-9-d008122bb79d>", line 3, in <cell line: 3>
    from pyspark.sql import Row
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2021.3.1\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\me\anaconda3\envs\myenv\lib\site-packages\pyspark\__init__.py", line 51, in <module>
    from pyspark.context import SparkContext
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2021.3.1\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\me\anaconda3\envs\myenv\lib\site-packages\pyspark\context.py", line 33, in <module>
    from pyspark.broadcast import Broadcast, BroadcastPickleRegistry
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2021.3.1\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\me\anaconda3\envs\myenv\lib\site-packages\pyspark\broadcast.py", line 25, in <module>
    from pyspark.cloudpickle import print_exec

ImportError: cannot import name 'print_exec' from 'pyspark.cloudpickle' (C:\Users\me\anaconda3\envs\myenv\lib\site-packages\pyspark\cloudpickle\__init__.py)

Specs

I am working in PyCharm IDE (PyCharm Community Edition 2021.3.1)

Python 3.10.4 | packaged by conda-forge | (main, Mar 30 2022, 08:38:02) [MSC v.1916 64 bit (AMD64)]

>conda list | grep pyspark

pyspark 3.2.1

>conda info

conda version : 4.12.0

conda-build version : 3.20.5

python version : 3.8.5.final.0

CodePudding user response:

Check whether the Python lib directory is on your path.
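A rough way to do that check from inside the PyCharm console is to print which interpreter is running and what it searches. The sketch below is an assumption about what is worth looking at, not a definitive checklist; `interpreter_report` is a hypothetical helper name, while `SPARK_HOME`, `PYTHONPATH` and `PYSPARK_PYTHON` are the usual environment variables involved.

```python
import os
import sys


def interpreter_report():
    """Collect the interpreter and search paths used by the current session.

    Hypothetical helper for illustration only.
    """
    return {
        # The python.exe actually running; should point into the intended conda env.
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Directories where installed packages such as pyspark are looked up.
        "site_packages": [p for p in sys.path if p.endswith("site-packages")],
        # Environment variables that can redirect imports to another Spark copy.
        "spark_env": {
            name: os.environ.get(name)
            for name in ("SPARK_HOME", "PYTHONPATH", "PYSPARK_PYTHON")
        },
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

If `executable` does not point into the expected environment (here, `envs\myenv`), the project interpreter configured in PyCharm is probably not the one where pyspark was installed.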

CodePudding user response:

Since the files were not there, I just downloaded pyspark manually from the website and replaced the previous pyspark installation with the newly downloaded one.

This got rid of all the import errors.
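To confirm a fix like this outside the IDE, the import can be tried in a fresh interpreter process, which avoids PyCharm's import hook and any state left in the console. This is a minimal sketch under that assumption; `try_import` is a hypothetical helper, and it simply runs the same Python executable with a one-line import.

```python
import subprocess
import sys


def try_import(module):
    """Import `module` in a fresh interpreter; return (succeeded, output).

    Hypothetical helper for illustration only.
    """
    # Print the module's version if it exposes one, otherwise "unknown".
    code = f"import {module}; print(getattr({module}, '__version__', 'unknown'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    ok = result.returncode == 0
    # On success return the printed version, on failure the traceback text.
    output = (result.stdout if ok else result.stderr).strip()
    return ok, output


if __name__ == "__main__":
    ok, output = try_import("pyspark")
    print("import succeeded" if ok else "import failed")
    print(output)
```

If the import succeeds here but still fails in PyCharm, the IDE is likely using a different interpreter or a stale console session.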
