How do you detect a keyword in a sentence regardless of tense or form in Python?

Time:12-24

I am trying to use spaCy in Python to detect the word "grief" no matter the form, whether it is "I am grieving", "going through grief", "I grieved over __", written in all caps, etc. I am pretty new to Python, so I don't know lemmatization that well. Are there some simple if statements that could solve this using spaCy?

grief = str(input(("What is currently on your mind? ")))
doc = nlp(grief)
if [t.grief for t in doc if t.lemma_ == "grie"]:
    grief1(sad_value)

CodePudding user response:

There are two lemmas you need to check for: "grief" and "grieve". Here is a solution that makes use of the spaCy lemmatiser:

import spacy

nlp = spacy.load('en_core_web_sm', exclude=["ner"])
grief = input("What is currently on your mind? ")
# Input: "I am grieving"
doc = nlp(grief)
for t in doc:
    # Check both the noun lemma ("grief") and the verb lemma ("grieve")
    if t.lemma_ in ("grief", "grieve"):
        print("Found {}".format(t.lemma_))
# Output: "Found grieve"

Examples for testing

import spacy

nlp = spacy.load('en_core_web_sm', exclude=["ner"])
texts = ["I am grieving", "Going through grief", "I will grieve", "I grieved", "He grieves"]
docs = list(nlp.pipe(texts))
for doc in docs:
    print(doc.text)
    for t in doc:
        if t.lemma_ == "grief" or t.lemma_ == "grieve":
            print("\t-> Found {}".format(t.lemma_))

# Output
# I am grieving
#         -> Found grieve
# Going through grief
#         -> Found grief
# I will grieve
#         -> Found grieve
# I grieved
#         -> Found grieve
# He grieves
#         -> Found grieve
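
Since the question asks for a simple if statement, the lemma check above can be wrapped in a small helper. This is only a sketch: the function name `mentions_any_lemma` and its `targets` parameter are hypothetical, not part of spaCy. It takes plain lemma strings so that it runs without a model; with spaCy you would pass `(t.lemma_ for t in nlp(text))`.

```python
def mentions_any_lemma(lemmas, targets=("grief", "grieve")):
    """Return True if any lemma, compared case-insensitively, is a target."""
    wanted = {t.lower() for t in targets}
    return any(lemma.lower() in wanted for lemma in lemmas)


# Lemmas are written out by hand here; with spaCy they would come from
# (t.lemma_ for t in nlp(text)).
print(mentions_any_lemma(["I", "be", "grieve"]))  # True
print(mentions_any_lemma(["I", "be", "fine"]))    # False
```

The result can then be used directly, for example `if mentions_any_lemma(t.lemma_ for t in doc): ...`.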

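Alternatively, you can use stemming via NLTK's SnowballStemmer. Here the two stems to check for are "grief" and "griev". Note that word_tokenize needs NLTK's tokenizer data to be downloaded first: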

from nltk.stem.snowball import SnowballStemmer
from nltk.tokenize import word_tokenize

stemmer = SnowballStemmer(language='english')
grief = input("What is currently on your mind? ")
for token in word_tokenize(grief):
    # Noun forms stem to 'grief'; verb forms such as "grieving" stem to 'griev'
    stem = stemmer.stem(token)
    if stem == 'grief' or stem == 'griev':
        print("Found {}".format(stem))
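
If installing spaCy or NLTK is not an option, a cruder fallback is a case-insensitive prefix match with the standard library's `re` module. This is a swapped-in technique, not stemming or lemmatization: it simply accepts any word that starts with "grief" or "griev", so it also matches words such as "grievance". The names `GRIEF_PATTERN` and `mentions_grief` are made up for this sketch.

```python
import re

# Matches any word starting with "grief" or "griev", in any letter case.
GRIEF_PATTERN = re.compile(r"\b(?:grief|griev)\w*", re.IGNORECASE)


def mentions_grief(text):
    """Return True if the text contains a word beginning with grief/griev."""
    return GRIEF_PATTERN.search(text) is not None


for sentence in ["I am GRIEVING", "going through grief.", "I am fine"]:
    print(sentence, "->", mentions_grief(sentence))
```

Because the match is case-insensitive, all-caps input is handled without any extra step.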