__init__() got an unexpected keyword argument 'cachedir' when importing top2vec

Time:09-24

I keep getting this error when importing top2vec.

TypeError                                 Traceback (most recent call last)
Cell In [1], line 1
----> 1 from top2vec import Top2Vec

File ~\AppData\Roaming\Python\Python39\site-packages\top2vec\__init__.py:1
----> 1 from top2vec.Top2Vec import Top2Vec
      3 __version__ = '1.0.27'

File ~\AppData\Roaming\Python\Python39\site-packages\top2vec\Top2Vec.py:12
     10 from gensim.models.phrases import Phrases
     11 import umap
---> 12 import hdbscan
     13 from wordcloud import WordCloud
     14 import matplotlib.pyplot as plt

File ~\AppData\Roaming\Python\Python39\site-packages\hdbscan\__init__.py:1
----> 1 from .hdbscan_ import HDBSCAN, hdbscan
      2 from .robust_single_linkage_ import RobustSingleLinkage, robust_single_linkage
      3 from .validity import validity_index

File ~\AppData\Roaming\Python\Python39\site-packages\hdbscan\hdbscan_.py:509
    494         row_indices = np.where(np.isfinite(matrix).sum(axis=1) == matrix.shape[1])[0]
    495     return row_indices
    498 def hdbscan(
    499     X,
    500     min_cluster_size=5,
    501     min_samples=None,
    502     alpha=1.0,
    503     cluster_selection_epsilon=0.0,
    504     max_cluster_size=0,
    505     metric="minkowski",
    506     p=2,
    507     leaf_size=40,
    508     algorithm="best",
--> 509     memory=Memory(cachedir=None, verbose=0),
    510     approx_min_span_tree=True,
    511     gen_min_span_tree=False,
    512     core_dist_n_jobs=4,
    513     cluster_selection_method="eom",
    514     allow_single_cluster=False,
    515     match_reference_implementation=False,
    516     **kwargs
    517 ):
    518     """Perform HDBSCAN clustering from a vector array or distance matrix.
    519 
    520    Parameters
   (...)
    672        Density-based Cluster Selection. arxiv preprint 1911.02282.
    673    """
    674     if min_samples is None:

TypeError: __init__() got an unexpected keyword argument 'cachedir'

Python version: 3.9.7 (64-bit)

Have installed MSBuild

No errors when pip installing this package

Does anyone know a solution to this problem or experienced a similar problem?

CodePudding user response:

It looks like you are using the latest versions of the hdbscan and joblib packages available on PyPI.

The cachedir argument was removed from joblib.Memory some 8 months ago, after having been deprecated (its replacement is the location argument). The latest joblib version on PyPI is 1.2.0, released Sep 16, 2022, so it incorporates this change.

The hdbscan source code on GitHub was last updated about 7 days ago. Unfortunately, the latest hdbscan release on PyPI is still 0.8.28 from Feb 8, 2022, and it still uses memory=Memory(cachedir=None, verbose=0), which is the call that raises the TypeError in your traceback.

One possible solution is to pin joblib to a version released before cachedir was removed, e.g. 1.1.0 from Oct 7, 2021.
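To confirm that this is what you are hitting before downgrading, you can check the installed joblib version first. The snippet below is only a rough sketch: the helper names (release_tuple, joblib_lacks_cachedir, installed_version) are made up for illustration, and the 1.2 cutoff is taken from the release mentioned above. It assumes your hdbscan release still passes cachedir, as 0.8.28 does.

```python
from importlib import metadata
from typing import Optional


def release_tuple(version: str) -> tuple:
    """Turn '1.2.0' or '1.2.0rc1' into a comparable tuple such as (1, 2, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first component without a leading number (e.g. 'dev0').
            break
        parts.append(int(digits))
    return tuple(parts)


def joblib_lacks_cachedir(joblib_version: str) -> bool:
    """True for joblib 1.2 and later, where Memory no longer accepts cachedir.

    Assumption: the installed hdbscan still calls Memory(cachedir=None, ...).
    """
    return release_tuple(joblib_version) >= (1, 2)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("joblib")
    if version is None:
        print("joblib is not installed")
    elif joblib_lacks_cachedir(version):
        print(f"joblib {version} no longer accepts cachedir; consider pinning 1.1.0")
    else:
        print(f"joblib {version} still accepts cachedir")
```

If it reports 1.2 or later, running pip install joblib==1.1.0 in the same environment and restarting the kernel should let the import go through, at least until an hdbscan release without the cachedir call is published on PyPI.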
