Given a list of tuples, each paired with a list of int values, create a list with the element-wise sum for each unique tuple

I have a huge list of sublists, each sublist consisting of a tuple and a list of 4 integers.

I want to create a list with one entry per unique tuple, where the integer lists belonging to that tuple are summed element-wise (keeping the four positions in the list separate).

Short Example:

[[(30, 40), [4, 7, 7, 1]],[(30, 40), [2, 9, 3, 4]],[(30, 40), [6, 5, 10, 0]],[(20, 40), [4, 0, 4, 0]],[(20, 40), [3, 4, 14, 5]],[(20, 40), [3, 2, 12, 0]],[(10, 40), [223, 22, 12, 9]]]

Output wanted:

[[(30, 40), [12, 21, 20, 5]],[(20, 40), [10, 6, 30, 5]],[(10, 40), [223, 22, 12, 9]]]

I have tried using a dictionary:

l = [[(30, 40), [4, 7, 7, 1]],[(30, 40), [2, 9, 3, 4]],[(30, 40), [6, 5, 10, 0]],[(20, 40), [4, 0, 4, 0]],[(20, 40), [3, 4, 14, 5]],[(20, 40), [3, 2, 12, 0]],[(10, 40), [223, 22, 12, 9]]]

dict_tuples = {}
for item in l:
    if item[0] in dict_tuples:
        dict_tuples[item[0]] += item[1]
    else:
        dict_tuples[item[0]] = item[1]

But here I am just getting a long list of integer values for each tuple. I want the sum at each index of the four-integer lists.

CodePudding user response：

You can create a dictionary where the keys are the tuples and the values are lists of the corresponding sublists. In a second step, sum the values at each index:

lst = [
    [(30, 40), [4, 7, 7, 1]],
    [(30, 40), [2, 9, 3, 4]],
    [(30, 40), [6, 5, 10, 0]],
    [(20, 40), [4, 0, 4, 0]],
    [(20, 40), [3, 4, 14, 5]],
    [(20, 40), [3, 2, 12, 0]],
    [(10, 40), [223, 22, 12, 9]],
]

out = {}
for t, l in lst:
    out.setdefault(t, []).append(l)

out = [[k, [sum(t) for t in zip(*v)]] for k, v in out.items()]

print(out)

Prints:

[
    [(30, 40), [12, 21, 20, 5]],
    [(20, 40), [10, 6, 30, 5]],
    [(10, 40), [223, 22, 12, 9]],
]
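
A single-pass variant of the same dictionary idea might look like the sketch below. It keeps a running element-wise total per tuple instead of collecting the sublists first. The helper name `sum_by_key` is hypothetical (it is not part of the answer above), and the sketch assumes every integer list has the same length. Note that `+` or `+=` on two Python lists concatenates them, which would explain getting one long list of integers per tuple.

```python
def sum_by_key(pairs):
    """Sum integer lists element-wise per key, keeping first-seen key order."""
    totals = {}
    for key, values in pairs:
        if key in totals:
            # Element-wise addition; list + list would concatenate instead.
            totals[key] = [a + b for a, b in zip(totals[key], values)]
        else:
            # Copy so the input sublists are never mutated.
            totals[key] = list(values)
    return [[key, summed] for key, summed in totals.items()]


lst = [
    [(30, 40), [4, 7, 7, 1]],
    [(30, 40), [2, 9, 3, 4]],
    [(30, 40), [6, 5, 10, 0]],
    [(20, 40), [4, 0, 4, 0]],
    [(20, 40), [3, 4, 14, 5]],
    [(20, 40), [3, 2, 12, 0]],
    [(10, 40), [223, 22, 12, 9]],
]

result = sum_by_key(lst)
print(result)
# [[(30, 40), [12, 21, 20, 5]], [(20, 40), [10, 6, 30, 5]], [(10, 40), [223, 22, 12, 9]]]
```

Because plain dictionaries preserve insertion order (Python 3.7+), the result lists the tuples in the order they first appear in the input.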

CodePudding user response：

itertools.groupby makes this straightforward once the list is sorted by key, since groupby only groups consecutive items. This could be done in one go, but the steps are separated here to show each stage of the transformation:

from itertools import groupby
from operator import itemgetter

l = [[(30, 40), [4, 7, 7, 1]], [(30, 40), [2, 9, 3, 4]], [(30, 40), [6, 5, 10, 0]], [(20, 40), [4, 0, 4, 0]], [(20, 40), [3, 4, 14, 5]], [(20, 40), [3, 2, 12, 0]], [(10, 40), [223, 22, 12, 9]]]

s = sorted(l, key=itemgetter(0))
# [[(10, 40), [223, 22, 12, 9]], [(20, 40), [4, 0, 4, 0]], [(20, 40), [3, 4, 14, 5]], [(20, 40), [3, 2, 12, 0]], [(30, 40), [4, 7, 7, 1]], [(30, 40), [2, 9, 3, 4]], [(30, 40), [6, 5, 10, 0]]]

g = groupby(s, key=itemgetter(0))

l2 = [(k, [x[1] for x in v]) for k, v in g]
# [((10, 40), [[223, 22, 12, 9]]), ((20, 40), [[4, 0, 4, 0], [3, 4, 14, 5], [3, 2, 12, 0]]), ((30, 40), [[4, 7, 7, 1], [2, 9, 3, 4], [6, 5, 10, 0]])]

l3 = [(k, list(zip(*v))) for k, v in l2]
# [((10, 40), [(223,), (22,), (12,), (9,)]), ((20, 40), [(4, 3, 3), (0, 4, 2), (4, 14, 12), (0, 5, 0)]), ((30, 40), [(4, 2, 6), (7, 9, 5), (7, 3, 10), (1, 4, 0)])]

l4 = [(k, [sum(x) for x in v]) for k, v in l3]
# [((10, 40), [223, 22, 12, 9]), ((20, 40), [10, 6, 30, 5]), ((30, 40), [12, 21, 20, 5])]
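
For comparison, the same steps collapsed into one expression could look roughly like this. It is only a sketch of the "one go" version; the name `result` is chosen here for illustration.

```python
from itertools import groupby
from operator import itemgetter

l = [
    [(30, 40), [4, 7, 7, 1]],
    [(30, 40), [2, 9, 3, 4]],
    [(30, 40), [6, 5, 10, 0]],
    [(20, 40), [4, 0, 4, 0]],
    [(20, 40), [3, 4, 14, 5]],
    [(20, 40), [3, 2, 12, 0]],
    [(10, 40), [223, 22, 12, 9]],
]

# groupby only merges adjacent items, so the list is sorted by key first.
# zip(*...) transposes each group's integer lists into columns, which are then summed.
result = [
    (key, [sum(col) for col in zip(*(values for _, values in group))])
    for key, group in groupby(sorted(l, key=itemgetter(0)), key=itemgetter(0))
]

print(result)
# [((10, 40), [223, 22, 12, 9]), ((20, 40), [10, 6, 30, 5]), ((30, 40), [12, 21, 20, 5])]
```

Because of the sort, the result is ordered by tuple value rather than by first appearance in the input.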