I have a seemingly simple question for the Python wizards here (I am a total newbie, so I have no idea how simple or complex it really is)!
I have a list of verbs in a dataframe that looks like this:
id verb
15 believe
64 start
90 believe
I want to lemmatize it. The problem is that most lemmatization examples work on full sentences, where the surrounding words are used to decide each word's part of speech. My data has no such context, but it should not need any: every entry is a verb, and I only want verb lemmas.
Would you have any ideas about how to go about lemmatizing this verb list? Many thanks in advance for considering my question!
Answer:
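If you are asking how to apply a function over a pandas DataFrame column, you can use map on the column and pass pos="v" to NLTK's WordNetLemmatizer so that every word is treated as a verb: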
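import nltk
import pandas as pd
from nltk.stem import WordNetLemmatizer

# Download the WordNet data once if it is not already installed
nltk.download("wordnet")

data = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "verb": ["believe", "start", "believed", "starting"],
})

# https://www.nltk.org/_modules/nltk/stem/wordnet.html
wnl = WordNetLemmatizer()
# pos="v" tells the lemmatizer to treat each word as a verb, so no sentence context is needed
data["verb"] = data["verb"].map(lambda word: wnl.lemmatize(word, pos="v"))
print(data)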
Output:
   id     verb
0   1  believe
1   2    start
2   3  believe
3   4    start
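Since the verb column in the question repeats values (for example "believe" appears twice), one optional refinement is to lemmatize each distinct verb only once and reuse the result. The following is a minimal sketch of that idea, not part of the original answer. The names build_lemma_map and toy_verb_lemmatizer are hypothetical, and the toy lemmatizer is only a stand-in so the sketch runs without NLTK or its WordNet data.

```python
from typing import Callable, Dict, Iterable


def build_lemma_map(words: Iterable[str],
                    lemmatize: Callable[[str], str]) -> Dict[str, str]:
    """Lemmatize each distinct word once and return a word -> lemma lookup."""
    lemma_map: Dict[str, str] = {}
    for word in words:
        if word not in lemma_map:
            lemma_map[word] = lemmatize(word)
    return lemma_map


def toy_verb_lemmatizer(word: str) -> str:
    # Hypothetical stand-in for a real verb lemmatizer (an assumption for
    # illustration); it only knows the two inflected forms used below.
    known = {"believed": "believe", "starting": "start"}
    return known.get(word, word)


verbs = ["believe", "start", "believed", "starting", "believe"]
lemma_map = build_lemma_map(verbs, toy_verb_lemmatizer)

# Look every verb up in the precomputed table instead of re-lemmatizing it
lemmas = [lemma_map[verb] for verb in verbs]
print(lemmas)
```

With the real setup from the answer, you would presumably pass lambda word: wnl.lemmatize(word, pos="v") as the lemmatize argument and apply the table with data["verb"].map(lemma_map). For a short list the difference is negligible; it may only matter for large columns with many repeated verbs.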