I have a term-frequency matrix stored as a pandas DataFrame:
1000 Merkwürdig Mindestens Error ... Periode bildet 30 Button
0 0 0 0 0 ... 0 0 0 0
1 0 1 0 2 ... 0 0 0 0
2 0 0 0 0 ... 0 0 0 0
3 0 0 0 0 ... 0 0 0 0
4 0 0 1 0 ... 0 0 1 0
.. ... ... ... ... ... ... ... .. ...
121 0 0 0 0 ... 0 0 0 1
122 0 0 0 0 ... 0 0 0 0
123 0 0 0 0 ... 0 0 0 0
124 0 0 0 0 ... 0 0 0 0
For each row I want to count the word occurrences and store that total in a new column called 'count' at the end:
1000 Merkwürdig Mindestens Error ... Periode bildet 30 Button count
0 0 0 0 0 ... 0 0 0 0 0
1 0 1 0 2 ... 0 0 0 0 3
2 0 0 0 0 ... 0 0 0 0 0
Iterating over each row and column is probably not the best solution, so can this be vectorized?
CodePudding user response:
You can use the DataFrame.sum method with axis=1, which adds up the values in each row:
df['count'] = df.sum(axis=1)
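To see this on something you can run, here is a minimal sketch. The small frame is made up for illustration: it only borrows a few column names and values from the question and is not the real data.

```python
import pandas as pd

# Hypothetical miniature term-frequency matrix
# (rows = documents, columns = terms); values are illustrative only.
df = pd.DataFrame(
    {
        "1000": [0, 0, 0],
        "Merkwürdig": [0, 1, 0],
        "Mindestens": [0, 0, 0],
        "Error": [0, 2, 0],
    }
)

# Vectorized row-wise sum over all term columns, appended as the last column.
df["count"] = df.sum(axis=1)
print(df)
```

No explicit Python loop over rows or columns is needed; pandas does the summation internally.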
CodePudding user response:
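pandas DataFrames have a sum method, DataFrame.sum() (there is no top-level pd.sum()), which does what you need. Set axis=1 so that it adds up the values across the columns of each row, giving one total per row, instead of the default of one total per column. See below: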
df['count'] = df.sum(axis=1)
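A short sketch of the difference between the two axes, using a hypothetical toy frame (the name tf and its values are assumptions for illustration). It also selects the term columns explicitly, because running the assignment a second time would otherwise add the existing 'count' column into the new total.

```python
import pandas as pd

# Hypothetical toy matrix: 2 documents (rows), 3 terms (columns).
tf = pd.DataFrame({"Error": [2, 0], "Button": [0, 1], "Periode": [1, 1]})

per_term = tf.sum(axis=0)  # one total per column, indexed by term name
per_doc = tf.sum(axis=1)   # one total per row, indexed by row label

# Sum only the term columns so that a rerun does not count 'count' itself.
term_cols = [c for c in tf.columns if c != "count"]
tf["count"] = tf[term_cols].sum(axis=1)
```

Here per_term holds the corpus-wide frequency of each term, while per_doc (and the new 'count' column) holds the word count of each document.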