import os
import torch
import numpy as np
import random
import spacy
from bpemb import BPEmb
nlp = spacy.load("en_core_web_sm")
tokenizer = nlp.Defaults.create_tokenizer(nlp)
This is my code, and whenever I try to run it, an error shows up saying:
AttributeError: type object 'EnglishDefaults' has no attribute 'create_tokenizer'
CodePudding user response:
Have you considered using the built-in Tokenizer class,
which, according to the documentation, can be used to create a new tokenizer?
import spacy
from spacy.tokenizer import Tokenizer

nlp = spacy.load("en_core_web_sm")
# Build a tokenizer directly from the pipeline's shared vocabulary
tokenizer = Tokenizer(nlp.vocab)
print(tokenizer)
result:
$ python3 main.py
<spacy.tokenizer.Tokenizer object at 0x13e7a52d0>
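To check that the tokenizer actually works, you could call it on a sample sentence and print the tokens. This is just a minimal sketch; keep in mind that a Tokenizer built only from nlp.vocab has no prefix, suffix, infix, or special-case rules, so it essentially splits on whitespace:

# Minimal usage sketch: tokenize a sample sentence with the tokenizer above
doc = tokenizer("This is a sample sentence.")
print([token.text for token in doc])
# ['This', 'is', 'a', 'sample', 'sentence.']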