I get an error when running DeBERTa with the R package text. The call is:
textEmbed("hello", model = "microsoft/deberta-v3-base")
error:
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: This tokenizer cannot be instantiated. Please make sure you have `sentencepiece` installed in order to use this tokenizer.
CodePudding user response:
To get this to work, you need to install sentencepiece in your conda environment. When I first did that, RStudio kept freezing. After updating RStudio and R, I created a dedicated conda environment with scipy 1.6 and sentencepiece, and it now works without any problems:
text::textrpp_install(rpp_version = c("torch==1.8", "transformers==4.12.5",
                                      "numpy", "nltk",
                                      "scipy==1.6", "sentencepiece"),
                      envname = "textrpp_condaenv_sentencepiece")
text::textrpp_initialize(condaenv = "textrpp_condaenv_sentencepiece",
                         refresh_settings = TRUE)
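If you want to confirm that the new environment is the one actually in use, a quick check along these lines may help. This is only a sketch: it assumes the reticulate package is installed (the text package uses it to talk to Python) and that the Python session is the one set up by textrpp_initialize() above. The variable name has_sentencepiece is just illustrative.

```r
library(reticulate)

# Ask the active Python session whether it can import sentencepiece.
has_sentencepiece <- py_module_available("sentencepiece")

if (has_sentencepiece) {
  # The tokenizer dependency is present, so retry the original call.
  embeddings <- text::textEmbed("hello", model = "microsoft/deberta-v3-base")
} else {
  # Most likely a different Python environment is active than the one
  # sentencepiece was installed into.
  message("sentencepiece is not visible to the active Python environment; ",
          "run reticulate::py_config() to see which one is in use.")
}
```

If the check fails, restarting R before calling textrpp_initialize() is worth trying, since reticulate binds to one Python installation per R session.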