I have annotated articles in a list (len=488), and I want to apply the .lower() method on the lemmas. I get the following error message: AttributeError: 'NoneType' object has no attribute 'lower'. Here's the code:
import csv
import pickle
import stanza
file = open("Guardian_Syria_text.csv", mode="r", encoding='utf-8-sig')
data = list(csv.reader(file, delimiter=","))
file.close()
pickle.dump(data, open('List.p', 'wb'))
stanza.download('en')
nlp = stanza.Pipeline(lang='en',
                      processors='tokenize,lemma,POS',
                      use_gpu=True)
data_list = pickle.load(open('List.p', 'rb'))
new_list = []
for article in data_list:
    a = nlp(str(article))
    new_list.append(a)
pickle.dump(new_list, open('Annotated.p', 'wb'))
annot_data = pickle.load(open('Annotated.p', 'rb'))
pos_tags = {'NOUN', 'VERB', 'ADJ', 'ADV', 'X'}
lemmas = []
for article in annot_data:
    art_tokens = [w.text for s in article.sentences for w in s.words]
    art_lemmas = [w.lemma.lower() for s in article.sentences for w in s.words
                  if w.upos in pos_tags]
    lemmas.append(art_lemmas)
I searched the variable annot_data for None (print(annot_data is None)), but it returned False.
I tried cleaning the variable like so: clean = [x for x in annot_data if x != None], but the length of the variable clean is the same as the old one (488), and the code gives me the same error message using the new clean variable instead of the old annot_data one.
Where's the supposed NoneType and how can I avoid it?
CodePudding user response:
The error refers to w.lemma.lower(), so the problem is that w.lemma is None, not that article is None.
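This is also why the checks in the question found nothing: annot_data is a list and each element is a document object, so neither is None. The None sits deeper, on the lemma attribute of individual words. A minimal sketch for locating those words is below. The helper names find_missing_lemmas and make_doc are hypothetical, and the stand-in objects only mimic the sentences, words, text, lemma and upos attributes used in the question, so the snippet runs without Stanza installed.

```python
from types import SimpleNamespace


def find_missing_lemmas(docs):
    """Return (article_index, word_text, upos) for every word whose lemma is None."""
    return [
        (i, w.text, w.upos)
        for i, doc in enumerate(docs)
        for s in doc.sentences
        for w in s.words
        if w.lemma is None
    ]


def make_doc(words):
    """Build a stand-in document (hypothetical helper, not part of Stanza)."""
    sentence = SimpleNamespace(
        words=[SimpleNamespace(text=t, lemma=l, upos=u) for t, l, u in words]
    )
    return SimpleNamespace(sentences=[sentence])


# Made-up demo data; the second document contains a word with no lemma.
demo_docs = [
    make_doc([("Syria", "Syria", "PROPN"), ("talks", "talk", "NOUN")]),
    make_doc([("##", None, "X"), ("Resumed", "resume", "VERB")]),
]

missing = find_missing_lemmas(demo_docs)
print(missing)
```

Calling find_missing_lemmas(annot_data) on the real data should list the offending words, which may help decide whether to drop them or handle them differently.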
You can check for this in the list comprehension.
art_lemmas = [w.lemma.lower() for s in article.sentences for w in s.words
              if w.lemma is not None and w.upos in pos_tags]
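To see the filter working in isolation, here is a small self-contained sketch. It wraps the same comprehension in a function (lowercase_lemmas is a hypothetical name) and runs it on stand-in objects with made-up words rather than real Stanza output, so treat it as an illustration of the check, not of Stanza's behavior.

```python
from types import SimpleNamespace

POS_TAGS = {'NOUN', 'VERB', 'ADJ', 'ADV', 'X'}


def lowercase_lemmas(doc, pos_tags=POS_TAGS):
    """Lowercased lemmas of words with a wanted POS tag, skipping None lemmas."""
    return [w.lemma.lower() for s in doc.sentences for w in s.words
            if w.lemma is not None and w.upos in pos_tags]


# Stand-in document: only the attributes used above are modelled.
doc = SimpleNamespace(sentences=[SimpleNamespace(words=[
    SimpleNamespace(text="Talks", lemma="Talk", upos="NOUN"),
    SimpleNamespace(text="##", lemma=None, upos="X"),      # would raise without the check
    SimpleNamespace(text="the", lemma="the", upos="DET"),  # filtered out by POS tag
    SimpleNamespace(text="Resumed", lemma="Resume", upos="VERB"),
])])

result = lowercase_lemmas(doc)
print(result)  # ['talk', 'resume']
```

The None test comes first in the condition, but the order does not affect correctness here, because the if clause is evaluated before w.lemma.lower() is called for each word.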