I have a DataFrame idf, shown below, and another DataFrame df.
idf
Output-
     feature_name  idf_weights
2488   kralendijk    11.221923
3059        night    11.221923
1383        ebebf    11.221923
df
Output-
                  message  Number of Words in each message
0  night kralendijk ebebf                                3
I want to add a new column to df that holds, for each message, the sum of the idf_weights values from idf over every word in that message.
The output should look like this:
df
Output-
                  message  Number of Words in each message  IDF Score
0  night kralendijk ebebf                                3  33.665769
I tried summing the weights with the code below, but it does not work.
Code-
df["Total_IDF Score"] = idf['idf_weights'].sum(axis=0)
Thank you.
Answer:
One way to do it is to convert idf to a dictionary first, like this:
words_weights = dict(idf[['feature_name', 'idf_weights']].values)
which gives:
{'kralendijk': 11.221923, 'night': 11.221923, 'ebebf': 11.221923}
Then split each message, look up each word's weight in the dict (falling back to 0 for words that are not in idf), and sum the results:
df['score'] = df['message'].apply(lambda x: sum([words_weights.get(word, 0) for word in x.split()]))
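For reference, here is a minimal end-to-end sketch of the same idea that you can paste and run. The two frames are rebuilt by hand from the sample rows in the question, and the helper name idf_score is just something I made up for illustration. As far as I can tell, the original attempt does not work because idf['idf_weights'].sum(axis=0) returns a single number (the total over all rows of idf), and assigning a scalar to a column copies that same number into every row of df.

```python
import pandas as pd

# Sample data rebuilt by hand from the rows shown in the question.
idf = pd.DataFrame(
    {
        "feature_name": ["kralendijk", "night", "ebebf"],
        "idf_weights": [11.221923, 11.221923, 11.221923],
    },
    index=[2488, 3059, 1383],
)
df = pd.DataFrame(
    {
        "message": ["night kralendijk ebebf"],
        "Number of Words in each message": [3],
    }
)

# Build a word -> weight lookup table from the two idf columns.
words_weights = dict(zip(idf["feature_name"], idf["idf_weights"]))


def idf_score(message):
    """Sum the weights of the words in one message (hypothetical helper).

    Words that do not appear in idf contribute 0.
    """
    return sum(words_weights.get(word, 0.0) for word in message.split())


# Apply the helper row by row to get one score per message.
df["IDF Score"] = df["message"].apply(idf_score)
print(df)
```

With the sample row this should give roughly 3 * 11.221923 for the only message. If you would rather notice words that are missing from idf instead of silently treating them as 0, you could replace the .get(word, 0.0) default with NaN and handle it afterwards.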