Sentences without a pos - Python

Time:04-04

I have tokenized the text and want to print an error for the sentences without a verb POS tag, but it prints an error for every single sentence. How should I change it?

sents = nltk.sent_tokenize(text)

for sent in sents:
    tokens = nltk.word_tokenize(sent)
    tagged = nltk.pos_tag(tokens)
    
    for pos in tagged:        
        if 'VB' not in sents :
             print('error')
         
         

CodePudding user response:

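import nltk  # needs NLTK's tokenizer and tagger data (installed via nltk.download)

text = "this sentence has verb. this one not"
sents = nltk.sent_tokenize(text)

for sent in sents:
    has_verb = False  # reset the flag for each sentence
    tokens = nltk.word_tokenize(sent)
    pos_tags = nltk.pos_tag(tokens)  # list of (token, tag) tuples
    for pos_tag in pos_tags:
        # pos_tag[1] is the tag; verb tags all contain 'VB' (VB, VBD, VBZ, ...)
        if 'VB' in pos_tag[1]:
            has_verb = True
            break
    # report only after every tag in the sentence has been checked
    if not has_verb:
        print(f'error: "{sent}" does not have verb')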

CodePudding user response:

As @baileythegreen pointed out, even though your last condition comes after tagging the sentence, if 'VB' not in sents: checks the whole list of sentence strings rather than the tags. No sentence is exactly the string 'VB', so the test is true on every iteration, even when the sentence you are iterating over has a 'VB' tag in it. You should probably use a flag: set has_VB = False before the inner loop, set has_VB = True when 'VB' in pos[1], and after the loop print the error only if not has_VB.
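
The flag idea can also be written without an explicit flag. Here is a minimal sketch, assuming the tagged sentence has the shape nltk.pos_tag returns (a list of (token, tag) tuples). The helper name has_verb and the two hand-written tagged sentences are made up for illustration, and startswith('VB') is used here instead of the substring test from the answer above:

```python
def has_verb(tagged):
    """Return True if any (token, tag) pair has a tag starting with 'VB'."""
    return any(tag.startswith("VB") for _token, tag in tagged)


# Hand-written (token, tag) pairs, illustrative only; real input would come
# from nltk.pos_tag(nltk.word_tokenize(sent)).
with_verb = [("this", "DT"), ("sentence", "NN"), ("has", "VBZ"), ("verb", "NN")]
without_verb = [("this", "DT"), ("one", "CD"), ("not", "RB")]

for tagged in (with_verb, without_verb):
    if not has_verb(tagged):
        # Rebuild a readable sentence from the tokens for the message.
        print("error:", " ".join(token for token, _tag in tagged))
```

any() stops at the first verb tag, just like the break in the loop version, and the check runs once per sentence rather than once per token.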
