I need to iterate over a DataFrame. In each iteration, row.Text is converted into a vector representation and stored as a numpy.ndarray (newData). Now I want to add a column (Vektoren) to the original DataFrame and assign the newData array to each row:
for idx, row in data.iterrows():
    doc = nlp(row.Text)
    newData = doc.vector
    data.loc[idx, 'Vektoren'] = newData
Unfortunately, I can't get it to work. What would be a better way than using iterrows?
I got it to work with a list:
vectorList = []
for idx, row in data.iterrows():
    doc = nlp(row.Text)
    newData = doc.vector
    vectorList.append(newData)
data['Vektoren'] = pd.Series(vectorList)
I am still wondering whether there is a more elegant solution.
CodePudding user response:
You can make your solution more concise with map:
data['Vektoren'] = data['Text'].map(lambda s: nlp(s).vector)
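Here is a minimal sketch of the map approach that runs without spaCy installed. The FakeDoc class and the nlp function below are hypothetical stand-ins that I made up for illustration (a real pipeline would come from spacy.load), and the three-number "vector" is a toy, not a real embedding. The DataFrame deliberately uses a non-default index to show that map keeps the rows aligned.

```python
import numpy as np
import pandas as pd


class FakeDoc:
    """Hypothetical stand-in for a spaCy Doc; it only exposes .vector."""

    def __init__(self, text):
        # Toy 3-dimensional "embedding": character, word, and vowel counts
        self.vector = np.array(
            [len(text), len(text.split()), sum(c in "aeiou" for c in text.lower())],
            dtype=np.float32,
        )


def nlp(text):
    # Hypothetical replacement for a loaded spaCy pipeline
    return FakeDoc(text)


# Non-default index on purpose, to check that rows stay aligned
data = pd.DataFrame({"Text": ["hello world", "pandas", "a b c"]}, index=[10, 20, 30])

# One ndarray per row; the resulting Series reuses data's index
data["Vektoren"] = data["Text"].map(lambda s: nlp(s).vector)

# Optional: stack the per-row arrays into one 2-D matrix (rows x dimensions)
matrix = np.stack(data["Vektoren"].to_numpy())
```

Because map returns a Series with the same index as data['Text'], the assignment lines up row by row. In contrast, pd.Series(vectorList) gets a fresh 0..n-1 index, so with a non-default index like the one above the assignment would align on labels and leave missing values; assigning the plain list (data['Vektoren'] = vectorList) should avoid that.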