Is there a way to add each result to a row of the dataframe?

I'm working on a method to annotate text, and I'm currently building a function that adds each token and its part-of-speech tag as a row of a dataframe.

Text     pos
apple    PROPN
be       AUX
look     VERB

import spacy
import pandas as pd

df = pd.DataFrame(columns = ['Text', 'pos'])

def annotate(text):
    nlp = spacy.load("en_core_web_sm")
    doc = nlp(text)

    for token in doc:
        print(token.text, token.pos_) 
        df = df.append({'Text' : 'token.text', 'pos' : 'token.pos_'},  ignore_index = True)

annotate('Apple is looking at buying U.K. startup for $1 billion')

CodePudding user response:

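Collect the data first, then create the dataframe. That generally runs faster than appending rows to an existing dataframe, because every append call copies the whole frame, and DataFrame.append was deprecated in pandas 1.4 and removed in 2.0. Your version also has two bugs: assigning to df inside the function makes it a local name, so df.append raises UnboundLocalError, and the quoted 'token.text' and 'token.pos_' store those literal strings instead of the token values. Building the rows in a list avoids all of this: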

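def annotate(text):
    nlp = spacy.load("en_core_web_sm")
    doc = nlp(text)

    # Collect one [text, tag] pair per token
    rows = []
    for token in doc:
        print(token.text, token.pos_)
        # pos_ is the string tag (e.g. 'PROPN'); pos without the underscore is an integer ID
        rows.append([token.text, token.pos_])
    # Build the dataframe once, from all collected rows
    df = pd.DataFrame(rows, columns=['Text', 'pos'])
    return df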

Then call it using:

df = annotate('Apple is looking at buying U.K. startup for $1 billion')
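
To try the collect-then-build pattern without installing a spaCy model, here is a minimal sketch. The tagged_tokens list is hypothetical stand-in data, not real spaCy output; with spaCy the pairs would come from (token.text, token.pos_) for each token in doc. The last step assumes you really do need to add rows to a dataframe that already exists, and uses pd.concat for that, since DataFrame.append is no longer available in pandas 2.x.

```python
import pandas as pd

# Hypothetical stand-in for spaCy output: (token text, POS tag) pairs.
tagged_tokens = [("Apple", "PROPN"), ("is", "AUX"), ("looking", "VERB")]

# Collect one dict per row, then build the dataframe in a single step.
rows = [{"Text": text, "pos": pos} for text, pos in tagged_tokens]
df = pd.DataFrame(rows, columns=["Text", "pos"])

# If rows must be added to an existing dataframe later, concatenate
# a small dataframe of new rows instead of calling df.append.
extra = pd.DataFrame([{"Text": "at", "pos": "ADP"}])  # hypothetical extra row
df = pd.concat([df, extra], ignore_index=True)

print(df)
```

Concatenating still copies the data, so when there are many rows it is usually better to gather them all in the list first and call pd.concat (or pd.DataFrame) once.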