We have the following dummy dataframe holding a number of scraped messages:
import numpy as np
import pandas as pd

temp = pd.DataFrame(np.array([['I am feeling very well'], ['It is hard to believe this happened'],
                              ['What is love?'], ['Amazing day today']]),
                    columns=['message'])
Output:
message
0 I am feeling very well
1 It is hard to believe this happened
2 What is love?
3 Amazing day today
I iterate through the messages one by one to extract the sentiment of each:
for i in temp.message:
    x = model.predict(i, 'roberta')
where x is a dictionary of the form:
x = {
"Love" : 0.0931,
"Hate" : 0.9169,
}
How can I add all of the values in the dictionary to the dataframe while iterating through the messages?
for i in temp.message:
    x = model.predict(i, 'roberta')
    y = pd.DataFrame.from_dict(x, orient='index')
    y = y.T
    # what would the next step be?
Maybe creating the columns with null values and then doing a left join on the message column on every iteration would be a plausible solution? What would be the most efficient approach?
Expected output:
message Love Hate
0 I am feeling very well 0.0931 0.9169
1 It is hard to believe this happened 0.444 0.556
...
CodePudding user response:
Don't try to assign while looping; collect the results in a list and assign/join at the end:
df = temp.join(pd.json_normalize([model.predict(i, 'roberta')
                                  for i in temp.message]))
# OR
df = temp.join(pd.DataFrame([model.predict(i, 'roberta')
                             for i in temp.message]))
Example:
message Love Hate
0 I am feeling very well 0.0931 0.9169
1 It is hard to believe this happened 0.0931 0.9169
2 What is love? 0.0931 0.9169
3 Amazing day today 0.0931 0.9169
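For anyone who wants to try the collect-then-join idea without the actual model, here is a minimal sketch. `fake_predict` is a hypothetical stand-in for `model.predict` (its scores are made up), and passing `index=temp.index` is an extra precaution of mine so the join still lines up if `temp` does not have the default 0..n-1 index.

```python
import pandas as pd


def fake_predict(text, model_name):
    # Hypothetical stub standing in for model.predict(text, 'roberta').
    # The scores are invented purely for illustration.
    hate = 0.9 if "hate" in text.lower() else 0.1
    return {"Love": round(1 - hate, 4), "Hate": hate}


temp = pd.DataFrame({"message": ["I hate the weather today", "Amazing day today"]})

# Build one dict per message, turn the list of dicts into a frame in one go,
# and reuse temp's index so that join() aligns row by row.
scores = pd.DataFrame(
    [fake_predict(m, "roberta") for m in temp.message],
    index=temp.index,
)
df = temp.join(scores)
print(df)
```

Building the scores frame once avoids growing or re-joining the dataframe on every iteration, which is the costly part of the loop-based attempts.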
CodePudding user response:
You can create the columns with np.nan values initially and update them when necessary:
temp = pd.DataFrame(np.array([['I am feeling very well',],['It is hard to believe this happened',],
['What is love?',], ['Amazing day today',]]),
columns = ['message',])
temp['Love'] = np.nan
temp['Hate'] = np.nan
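Then update the values in the loop:
for i, message in enumerate(temp.message):
    x = model.predict(message, 'roberta')
    # assign with a single .loc[row, column] call; chained access such as
    # temp.loc[i].Love = ... may write to a copy and leave temp unchanged
    temp.loc[i, 'Love'] = x['Love']
    temp.loc[i, 'Hate'] = x['Hate']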
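A hedged sketch of this loop-and-update approach, runnable without the real model: `fake_predict` is a hypothetical stub with made-up scores, and the non-default index `[10, 20]` is my own assumption, chosen to show why iterating over index labels with `Series.items()` is safer than relying on positions from `enumerate`.

```python
import numpy as np
import pandas as pd


def fake_predict(text, model_name):
    # Hypothetical stub standing in for model.predict(text, 'roberta').
    hate = 0.9 if "hate" in text.lower() else 0.1
    return {"Love": round(1 - hate, 4), "Hate": hate}


# A non-default index: positions 0 and 1 are NOT valid labels here.
temp = pd.DataFrame(
    {"message": ["I hate the weather today", "Amazing day today"]},
    index=[10, 20],
)
temp["Love"] = np.nan
temp["Hate"] = np.nan

# Series.items() yields (index label, value) pairs, so the label can be
# passed straight to .at for a single-cell assignment.
for label, message in temp["message"].items():
    x = fake_predict(message, "roberta")
    for key, value in x.items():
        temp.at[label, key] = value

print(temp)
```

This writes one cell at a time, so it tends to be slower than building all rows first and joining once; it is mainly useful when the values genuinely have to be filled in incrementally.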