I'm trying to load a list of lists into pandas and then sum up one column.
My list of lists:
[['she', 'walked', 4],
['she', 'my', 3],
['she', 'dog', 2],
['she', 'to', 1],
['sniffed', 'I', 5],
['sniffed', 'walked', 4],
['sniffed', 'my', 3],
['sniffed', 'dog', 2],
['sniffed', 'to', 1]]
I create the dataframe:
import pandas as pd
df = pd.DataFrame(distanceList, columns = ['word1', 'word2', 'weight'])
The result looks odd (it has an extra index column for some reason):
word1 word2 weight
0 I walked 5
1 I my 4
2 I dog 3
3 I to 2
4 I the 1
... ... ... ...
1135 I walked 5
1136 I my 4
1137 I dog 3
1138 I to 2
1139 I the 1
1140 rows × 3 columns
But when I sum it, the words get combined into one long string. I used this:
df.groupby('weight').sum()
word1 word2
weight
1 Iwalkedmydogtotheparkandshesniffedgrassthenrol... thethethethethetotototototototototototototothe...
2 Iwalkedmydogtotheparkandshesniffedgrassthenrol... totototodogdogdogdogdogdogdogdogdogdogdogdogdo...
3 Iwalkedmydogtotheparkandshesniffedgrassthenrol... dogdogdogmymymymymymymymymymymymymymymymydogdo...
4 Iwalkedmydogtotheparkandshesniffedgrassthenrol... mymywalkedwalkedwalkedwalkedwalkedwalkedwalked...
5 Iwalkedmydogtotheparkandshesniffedgrassthenrol... walkedIIIIIIIIIIIIIIIIIIwalkedIIIIIIIIIIIIIIII...
What I want is if I have:
dog, cat, 1
dog, cat, 5
dog, rabbit, 1
then the result is:
dog, cat, 6
dog, rabbit, 1
CodePudding user response:
The code you want is as follows.
df.groupby('word1')['weight'].sum()
This calculates the sum of weight for each value of word1.
Your code calculates the sum of word1 and word2 for each value of weight, and summing strings concatenates them. That is why the strings are joined together (e.g., Iwalkedmydogtotheparkandshesniffedgrassthenrol).
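To see the concatenation on a small scale, here is a minimal sketch using the three-row dog/cat/rabbit example from the question rather than the full distanceList. The variable names rows and by_weight are made up for illustration.

```python
import pandas as pd

# The asker's small example: (word1, word2, weight)
rows = [['dog', 'cat', 1],
        ['dog', 'cat', 5],
        ['dog', 'rabbit', 1]]
df = pd.DataFrame(rows, columns=['word1', 'word2', 'weight'])

# Grouping by weight and summing a string column applies "+" to the
# strings in each group, which joins them end to end.
by_weight = df.groupby('weight')['word2'].sum()
print(by_weight)
```

Both rows with weight 1 land in the same group, so their word2 values are glued together instead of being added as numbers.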
Edit: I was confused by the example data. You should try the following code instead.
df.groupby(['word1', 'word2'], as_index = False)['weight'].sum()
The result is as follows.
word1 word2 weight
0 dog cat 6
1 dog rabbit 1
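For reference, a self-contained sketch that reproduces this result from the question's small example. The names rows and result are illustrative, and the real distanceList is assumed to have the same three-column shape.

```python
import pandas as pd

# The asker's small example: (word1, word2, weight)
rows = [['dog', 'cat', 1],
        ['dog', 'cat', 5],
        ['dog', 'rabbit', 1]]
df = pd.DataFrame(rows, columns=['word1', 'word2', 'weight'])

# Group on both word columns so each (word1, word2) pair is one group,
# then sum only the numeric weight column.
# as_index=False keeps word1 and word2 as ordinary columns.
result = df.groupby(['word1', 'word2'], as_index=False)['weight'].sum()
print(result)
```

Selecting ['weight'] before calling sum() keeps the string columns out of the aggregation, so nothing gets concatenated.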