I am trying to assign values to a column in my pandas DataFrame, but the column ends up blank. Here's the code:
df['Bag_of_words'] = ''
columns = ['genre', 'director', 'actors', 'key_words']
for index, row in df.iterrows():
    words = ''
    for col in columns:
        words += ' '.join(row[col]) + ' '
    row['Bag_of_words'] = words
The output is an empty column. Can someone please help me understand what is happening here? I am not getting any errors.
CodePudding user response:
Instead of:
row['Bag_of_words'] = words
Use:
df.at[index,'Bag_of_words'] = words
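As a minimal sketch of that fix in context: the sample data below is made up for illustration (the question does not show the real frame), and it assumes every cell in the four columns holds a list of strings.

```python
import pandas as pd

# Hypothetical sample data; each cell is assumed to be a list of strings.
df = pd.DataFrame({
    "genre": [["crime", "drama"], ["action"]],
    "director": [["frank", "darabont"], ["christopher", "nolan"]],
    "actors": [["tim", "robbins"], ["christian", "bale"]],
    "key_words": [["prison", "hope"], ["joker", "gotham"]],
})
columns = ["genre", "director", "actors", "key_words"]
df["Bag_of_words"] = ""

for index, row in df.iterrows():
    words = ""
    for col in columns:
        # Join the list in this cell and keep a trailing space as a separator.
        words += " ".join(row[col]) + " "
    # Write to the DataFrame itself, not to the temporary row Series.
    df.at[index, "Bag_of_words"] = words

print(df["Bag_of_words"].tolist())
```

Writing through df.at targets the frame by label, so the value lands in df regardless of whether row is a copy.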
CodePudding user response:
From the iterrows documentation:
- You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect.
So you do row[...] = ..., but row turns out to be a copy, so the assignment never reaches the original rows of df.
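The copy behaviour is easy to check on a toy frame. This is only an illustrative sketch with made-up column names; it uses mixed dtypes, a case where the row Series has to be built as a new object.

```python
import pandas as pd

# Hypothetical two-column frame with mixed dtypes (int and str).
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

for index, row in df.iterrows():
    # This changes only the temporary Series produced for this iteration.
    row["a"] = 100

# The original frame still holds the old values.
print(df["a"].tolist())
```

No error is raised because assigning into the temporary Series is perfectly valid; the result is simply thrown away at the next iteration.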
iterrows is frowned upon anyway, so you can instead:
- join the list of words in each cell into a string, column by column
- aggregate those strings row-wise with " ".join
- add a trailing space to the result
df["Bag_of_words"] = (df[columns].apply(lambda col: col.str.join(" "))
.agg(" ".join, axis="columns")
.add(" "))
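Here is the same pipeline run end to end on a small made-up frame, as a sketch only. The data is hypothetical and assumes each cell holds a list of strings, as the loop in the question implies.

```python
import pandas as pd

# Hypothetical sample data; each cell is assumed to be a list of strings.
df = pd.DataFrame({
    "genre": [["crime", "drama"], ["action"]],
    "director": [["frank", "darabont"], ["christopher", "nolan"]],
    "actors": [["tim", "robbins"], ["christian", "bale"]],
    "key_words": [["prison", "hope"], ["joker", "gotham"]],
})
columns = ["genre", "director", "actors", "key_words"]

# Step 1: turn each list cell into a space-separated string, per column.
joined = df[columns].apply(lambda col: col.str.join(" "))

# Step 2: join the four strings of every row into one string.
per_row = joined.agg(" ".join, axis="columns")

# Step 3: append the trailing space that the original loop produced.
df["Bag_of_words"] = per_row.add(" ")

print(df["Bag_of_words"].tolist())
```

Splitting the chain into named steps makes it easier to inspect the intermediate results; the one-expression form above does the same work.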