Lemmatizing a verb list in a data frame in Python

I want to ask the Python wizards a seemingly simple question (I am a total newbie, so I have no idea how simple or complex it really is)!

I have a list of verbs in a data frame that looks like this:

id verb
15 believe
64 start
90 believe

I want to lemmatize it. The problem is that most lemmatization examples work on sentence strings, where the context is used to decide each word's part of speech. My data provides no such context, but it does not need to: every word is a verb, so I only need the 'verb' lemmas.

Would you have any ideas about how to go about lemmatizing this verb list? Many thanks in advance for considering my question!

CodePudding user response:

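If you are asking how to apply a function over a pandas DataFrame column, you can use map. Since every word is a verb, passing pos="v" to the lemmatizer means no sentence context is needed: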

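import pandas as pd
from nltk.stem import WordNetLemmatizer

# The WordNet corpus must be installed once, e.g. via nltk.download("wordnet").

data = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "verb": ["believe", "start", "believed", "starting"],
})
# https://www.nltk.org/_modules/nltk/stem/wordnet.html
wnl = WordNetLemmatizer()
# pos="v" treats every word as a verb; the default is pos="n" (noun).
data["verb"] = data["verb"].map(lambda word: wnl.lemmatize(word, pos="v"))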

print(data)

Output

   id     verb
0   1  believe
1   2    start
2   3  believe
3   4    start
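
If the verb column is long and contains many repeats (as in the question, where "believe" appears twice), it may be worth lemmatizing each distinct word only once and reusing the result. The following is a minimal sketch of that idea using only the standard library. The names build_lemma_map and toy_verb_lemma are hypothetical and not part of NLTK or pandas, and toy_verb_lemma is only a stand-in so the sketch runs without the WordNet corpus.

```python
from typing import Callable, Dict, Iterable


def build_lemma_map(
    words: Iterable[str], lemmatize: Callable[[str], str]
) -> Dict[str, str]:
    """Call `lemmatize` once per distinct word and return {word: lemma}."""
    return {word: lemmatize(word) for word in set(words)}


def toy_verb_lemma(word: str) -> str:
    """Hypothetical stand-in for a real verb lemmatizer.

    It only knows the demo words below; unknown words are returned unchanged.
    """
    known = {"believed": "believe", "starting": "start"}
    return known.get(word, word)


verbs = ["believe", "start", "believed", "starting", "believe"]

# Four distinct words, so the lemmatizer is called four times, not five.
lemma_map = build_lemma_map(verbs, toy_verb_lemma)

# Look each original word up in the precomputed mapping.
lemmas = [lemma_map[v] for v in verbs]
print(lemmas)  # ['believe', 'start', 'believe', 'start', 'believe']
```

In real use, the stand-in would presumably be replaced by lambda word: wnl.lemmatize(word, pos="v") from the answer above, and with pandas the resulting dictionary can be applied through data["verb"].map(lemma_map), since Series.map also accepts a dict.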