DataFrame apply/append a function that returns a dict to each row

Time:04-13

I want to apply get_sentiment to each row of a DataFrame and append the keys of the returned dict to that row as new columns. Is there a good way of doing this?

def get_sentiment(txt: str) -> dict:
    response = client.detect_sentiment(Text=txt, LanguageCode='en')

    sentiment_data = dict()
    sentiment_data['Sentiment'] = response['Sentiment']
    sentiment_data['Sentiment_Score_Positive'] = response['SentimentScore']['Positive']
    sentiment_data['Sentiment_Score_Neutral'] = response['SentimentScore']['Neutral']
    sentiment_data['Sentiment_Score_Negative'] = response['SentimentScore']['Negative']
    return sentiment_data


def analyze_txt(df: DataFrame):
    df[] = df['Text'].apply(get_sentiment) #<- what I'm trying to do

Basically, I want the df to go from

id  Text
1   hello world
2   this is something here

to

id  Text                    Sentiment  Sentiment_Score_Positive  Sentiment_Score_Neutral  Sentiment_Score_Negative
1   hello world             Neutral    .5                        .5                       .5
2   this is something here  Neutral    .5                        .5                       .5

CodePudding user response:

Applying get_sentiment to the Text column returns a Series of dicts. One way to get the desired output is to convert that Series to a list of dicts, construct a DataFrame from it, and join the result to df:

new_df = df.join(pd.DataFrame(df['Text'].apply(get_sentiment).tolist()))

If df has a non-default index that needs to be retained, pass it when constructing the DataFrame to be joined. Otherwise the new frame gets a fresh 0..n-1 index and the join will not line up with df:

s = df['Text'].apply(get_sentiment)
new_df = df.join(pd.DataFrame(s.tolist(), index=s.index))
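To make the index-preserving variant concrete, here is a minimal runnable sketch. The get_sentiment below is a hypothetical stand-in that returns fixed values instead of calling client.detect_sentiment, and the index values 10 and 20 are made up for illustration; only the join pattern is taken from the answer above.

```python
import pandas as pd


def get_sentiment(txt: str) -> dict:
    # Hypothetical stand-in for the real client.detect_sentiment call,
    # so the example runs without any external service.
    return {
        'Sentiment': 'Neutral',
        'Sentiment_Score_Positive': 0.5,
        'Sentiment_Score_Neutral': 0.5,
        'Sentiment_Score_Negative': 0.5,
    }


# Non-default index on purpose, to show that it survives the join.
df = pd.DataFrame(
    {'Text': ['hello world', 'this is something here']},
    index=pd.Index([10, 20], name='id'),
)

s = df['Text'].apply(get_sentiment)  # Series of dicts, same index as df
# One column per dict key; reuse s.index so the rows line up with df.
new_df = df.join(pd.DataFrame(s.tolist(), index=s.index))

print(new_df)
```

Without index=s.index, the frame built from s.tolist() would be indexed 0 and 1, and the join against ids 10 and 20 would fill the new columns with NaN.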

A faster method may be to simply map get_sentiment over the Text column values (as in the first snippet, this assumes df has the default index):

new_df = df.join(pd.DataFrame(map(get_sentiment, df['Text'].tolist())))

CodePudding user response:

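pd.concat looks like a viable option, too. To turn a list of dictionaries (or a list of lists that represent rows) into a DataFrame, you can use pd.DataFrame.from_records. Note that pd.concat with axis=1 aligns rows on the index, so the line below assumes df has the default index.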

df2 = pd.concat([df, pd.DataFrame.from_records(df.Text.apply(get_sentiment))], axis=1)
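If df does not have the default index, one possible adjustment is to copy df's index onto the new frame before concatenating. The sketch below assumes a hypothetical stub for get_sentiment (a simple keyword rule, not the real service call) and made-up index labels 'a' and 'b'.

```python
import pandas as pd


def get_sentiment(txt: str) -> dict:
    # Hypothetical stub: a keyword rule standing in for the real API call.
    label = 'Positive' if 'hello' in txt.lower() else 'Neutral'
    return {
        'Sentiment': label,
        'Sentiment_Score_Positive': 0.5,
        'Sentiment_Score_Neutral': 0.5,
        'Sentiment_Score_Negative': 0.5,
    }


df = pd.DataFrame(
    {'Text': ['hello world', 'this is something here']},
    index=['a', 'b'],
)

# Build a plain list of dicts, then a frame with one column per key.
records = [get_sentiment(t) for t in df['Text']]
scores = pd.DataFrame.from_records(records)
# Give the new frame the same index as df so concat pairs the right rows.
scores.index = df.index

df2 = pd.concat([df, scores], axis=1)
print(df2)
```

Assigning the index after construction keeps the example independent of how from_records interprets its index argument.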