Importing Gensim/Word2Vec not stable in Databricks

Time:12-30

I am simply trying to import Word2Vec from gensim.models, but for the past few days I keep getting the following error:

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 80 from PyObject

I tried updating my NumPy package to 1.24.1 (the latest) and my Gensim package to 4.3.0, but I am still getting the same error.

For reference, I am using Python 3.8.10 in Databricks.

Any idea please? Thank you

CodePudding user response:

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 80 from PyObject

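This error occurs when a compiled extension module was built against a different NumPy version than the one installed at runtime, typically after a change to NumPy's C API between releases. Upgrading your NumPy version should fix the issue; a new NumPy release came out only a few days ago.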
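Before changing anything, it can help to see which versions are actually installed on the cluster. The following is a minimal sketch, not part of the original answer: the helper name `installed_versions` is hypothetical, and it reads package metadata rather than importing the packages, so it still works when the import itself raises the error above.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_versions(packages=("numpy", "scipy", "gensim")):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package has no installed distribution in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If the NumPy version printed here differs from the one the other packages were built against, that mismatch is a likely cause of the error.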

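Try uninstalling and reinstalling NumPy from a notebook cell, or upgrade it in place.

%sh
# Uninstall, then reinstall NumPy (-y skips the confirmation prompt, which a notebook cell cannot answer)
pip uninstall -y numpy
pip install numpy

# Alternatively, upgrade the installed version
pip install --upgrade numpy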

Also try with different gensim version.
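One way to try a specific Gensim version is to force a clean reinstall so that pip does not reuse a cached build. This is only a sketch under assumptions: the helper names `pip_reinstall_command` and `reinstall` are hypothetical, and `--force-reinstall` also reinstalls the package's dependencies, so the NumPy version may change as a side effect.

```python
import subprocess
import sys


def pip_reinstall_command(package, version=None):
    """Build a pip command that reinstalls `package`, optionally pinned to `version`."""
    spec = f"{package}=={version}" if version else package
    return [
        sys.executable, "-m", "pip", "install",
        "--force-reinstall",  # reinstall even if the package is already present
        "--no-cache-dir",     # do not reuse previously cached downloads or builds
        spec,
    ]


def reinstall(package, version=None):
    """Run the command with the same interpreter the notebook is using."""
    subprocess.check_call(pip_reinstall_command(package, version))


# Example (not executed here): reinstall("gensim", "4.3.0")
```

Running pip through `sys.executable` targets the interpreter that is running the code, which avoids installing into a different Python environment by accident.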

I was able to install it successfully this way.

CodePudding user response:

Thank you @gojomo and @pratik-lad for your feedback.

The best solution I found for the "binary incompatibility" issue was to edit my Databricks cluster and upgrade the Databricks runtime version from 10.1 to 12.0.

I am now able to install Gensim with these versions:

Python 3.9.5
NumPy 1.21.5
SciPy 1.7.3
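After changing the runtime, a quick check like the one below can confirm that the import now works. It is a minimal sketch: the helper name `try_import` is hypothetical, and it catches `ValueError` as well as `ImportError` because the binary incompatibility in this thread surfaces as a `ValueError`.

```python
import importlib


def try_import(module_name, attribute=None):
    """Try to import a module (and optionally one attribute); return (ok, message)."""
    try:
        module = importlib.import_module(module_name)
        if attribute is not None:
            getattr(module, attribute)
    except (ImportError, ValueError, AttributeError) as exc:
        # Report the exception type and text instead of raising.
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


if __name__ == "__main__":
    print(try_import("gensim.models", "Word2Vec"))
```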