python: sum values in a list if they share the first word

Time:03-16

I have a list as follows:

flat_list = ['hello,5', 'mellow,4', 'mellow,2', 'yellow,2', 'yellow,7', 'hello,7', 'mellow,7', 'hello,7']

I would like to sum the values that share the same word, so the desired output is:

l = [('hello',19), ('yellow', 9), ('mellow',13)]

So far, I have tried the following:

new_list = [v.split(',') for v in flat_list]

d = {}
for key, value in new_list:
   if key not in d.keys():
      d[key] = [key]
   d[key].append(value)

# getting rid of the first key in value lists
val = [val.pop(0) for k,val in d.items()]
# summing up the values
va = [sum([int(x) for x in va]) for ka,va in d.items()]

However, for some reason the final summing step does not work, and I do not get my desired output.

CodePudding user response:

Here is a variant for accomplishing your goal using defaultdict:

from collections import defaultdict

t = ['hello,5', 'mellow,4', 'mellow,2', 'yellow,2',
     'yellow,7', 'hello,7', 'mellow,7', 'hello,7']

count = defaultdict(int)

for name_number in t:
    name, number = name_number.split(",")
    count[name] += int(number)

You could also use Counter:

from collections import Counter

count = Counter()

for name_number in t:
    name, number = name_number.split(",")
    count[name] += int(number)

In both cases you can convert the output to a list of tuples using:

list(count.items())
# -> [('hello', 19), ('mellow', 13), ('yellow', 9)]

I ran your code and I do get the correct results (although not in your desired format).
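As a rough sketch of what "correct results, but not in the desired format" means here: the question's code ends with a plain list of sums, and pairing those sums with the dictionary keys gives the list of tuples. The names sums and result are introduced only for this illustration, and the pop step is written as a plain loop instead of the list comprehension from the question.

```python
flat_list = ['hello,5', 'mellow,4', 'mellow,2', 'yellow,2',
             'yellow,7', 'hello,7', 'mellow,7', 'hello,7']

# The question's approach: each key maps to [key, value, value, ...]
d = {}
for key, value in (v.split(',') for v in flat_list):
    if key not in d:
        d[key] = [key]
    d[key].append(value)

# Drop the leading key from every list, then sum the remaining numbers
for values in d.values():
    values.pop(0)
sums = [sum(int(x) for x in values) for values in d.values()]

# Pair each word with its total to get a list of tuples
result = list(zip(d, sums))
print(result)
```

The tuples come out in first-seen order of the words, because dictionaries keep insertion order in Python 3.7 and later.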

CodePudding user response:

You can do this very simply without importing additional modules like so:

t = ['hello,5', 'mellow,4', 'mellow,2', 'yellow,2', 'yellow,7', 'hello,7', 'mellow,7', 'hello,7']

d = {}
for s in t: #for each string
    w, n = s.split(',') #get the string and the number
    d[w] = d[w] + int(n) if w in d else int(n) #add the number (sum)

l = list(d.items()) #make the result a list of tuples
print(l)

Output:

[('hello', 19), ('mellow', 13), ('yellow', 9)]
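A slightly shorter variant of the same idea, offered only as a sketch: dict.get with a default of 0 replaces the conditional expression. The names totals, word, number and result are chosen for this example and are not from the answer above.

```python
t = ['hello,5', 'mellow,4', 'mellow,2', 'yellow,2',
     'yellow,7', 'hello,7', 'mellow,7', 'hello,7']

totals = {}
for s in t:
    word, number = s.split(',')
    # get() returns 0 the first time a word is seen, so no membership test is needed
    totals[word] = totals.get(word, 0) + int(number)

result = list(totals.items())
print(result)
```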

CodePudding user response:

One possible approach would be:

import pandas as pd
    
flat_list = ['hello,5', 'mellow,4', 'mellow,2', 'yellow,2', 'yellow,7', 'hello,7', 'mellow,7', 'hello,7']
new_list = [v.split(',') for v in flat_list]
    
df = pd.DataFrame(new_list)
df[1] = df[1].astype(int)
df2 = df.groupby(0).sum()
print(df2)

Output:

             1
    0         
    hello   19
    mellow  13
    yellow   9

CodePudding user response:

"for some reason the last sum up does not work"

To fix your original solution:

  • If the key is not present in the dictionary, add it using d[key] = [value] instead of d[key] = [key]. Then you don't need val = [val.pop(0) for k,val in d.items()].

  • If the key is already in the dictionary, append the value in an else clause.

    for key, value in new_list:
        if key not in d:
            d[key] = [value]
        else:
            d[key].append(value)
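Putting the two points together, a minimal end-to-end sketch could look like the following. The stored values are still strings at this point, so they are converted with int() while summing. The name result is introduced here for illustration.

```python
flat_list = ['hello,5', 'mellow,4', 'mellow,2', 'yellow,2',
             'yellow,7', 'hello,7', 'mellow,7', 'hello,7']
new_list = [v.split(',') for v in flat_list]

d = {}
for key, value in new_list:
    if key not in d:
        d[key] = [value]        # start the list with the first value
    else:
        d[key].append(value)    # add further values for the same word

# Convert the string values to int and sum them per word
result = [(key, sum(int(x) for x in values)) for key, values in d.items()]
print(result)
```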
    