Adding values in a list if they are the same, with a twist

I have a Python question about adding up values in one list when they share the same key in another list. For example, we have:

lst1 = [A, A, B, A, C, D]
lst2 = [1, 2, 3, 4, 5, 6]

What I would like to know is how I can add the numbers in lst2 if the strings in lst1 are the same. The end result would thus be:

new_lst1 = [A, B, A, C, D]
new_lst2 = [3, 3, 4, 5, 6]

where

new_lst2[0] = 1 + 2

So values only get added when they are next to each other.

To make it more complicated, it should also work for this example:

lst3 = [A, A, A, B, B, A, A]
lst4 = [1, 2, 3, 4, 5, 6, 7]

for which the result has to be:

new_lst3 = [A, B, A]
new_lst4 = [6, 9, 13]

where new_lst4[0] = 1 + 2 + 3, new_lst4[1] = 4 + 5, and new_lst4[2] = 6 + 7.

Thank you in advance!

For a bit of background: I wrote code that searches Dutch online underground models and returns the underground data for a specific input location.

The data is made up of layers:
Layer1, Name: "BXz1", top_layer, bottom_layer, transitivity
Layer2, Name: "BXz2", top_layer, bottom_layer, transitivity
Layer3, Name: "KRz1", top_layer, bottom_layer, transitivity

etc..

BXz1 and BXz2 belong to the same main layer but are different sublayers. In terms of transitivity, I would like to combine them if they are next to each other, so that I would get:
Layer1+2, Name: BX, top_layer1, bottom_layer2, combined transitivity
Layer3, Name: "KRz1", top_layer, bottom_layer, transitivity

CodePudding user response：

The itertools.groupby function in the standard library provides the base functionality you need. It's then a matter of giving it the right key and tallying the count in each group.

Here's my implementation:

from itertools import groupby

def tally_by_group(keys, counts):
    groups = groupby(zip(keys, counts), key=lambda x: x[0])
    tallies = [
        (key, sum(count for _, count in group))
        for key, group in groups
    ]
    return tuple(list(l) for l in zip(*tallies))

Code explanation:

my first zip() creates (key, count) tuples from the two lists,
the groupby groups them by the first element of each tuple, i.e., by the key,
then I construct a list of (key, sum(count)) into tallies,
and finally unpack that back into two lists for the results.
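
The steps above can be traced on the first example from the question. This is only an illustrative walk-through: the intermediate names (pairs, runs, new_keys, new_counts) are not part of tally_by_group and are introduced here to show what each stage holds.

```python
from itertools import groupby

keys = ["A", "A", "B", "A", "C", "D"]
counts = [1, 2, 3, 4, 5, 6]

# Step 1: pair each key with its count.
pairs = list(zip(keys, counts))

# Step 2: group consecutive pairs that share a key. groupby only merges
# adjacent items, which is why the second run of "A" stays separate.
runs = [
    (key, [count for _, count in group])
    for key, group in groupby(pairs, key=lambda pair: pair[0])
]

# Step 3: sum the counts inside each run.
tallies = [(key, sum(values)) for key, values in runs]

# Step 4: unzip the (key, total) tuples back into two lists.
new_keys, new_counts = (list(column) for column in zip(*tallies))

print(runs)        # [('A', [1, 2]), ('B', [3]), ('A', [4]), ('C', [5]), ('D', [6])]
print(new_keys)    # ['A', 'B', 'A', 'C', 'D']
print(new_counts)  # [3, 3, 4, 5, 6]
```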

Tests with your examples:

lst1 = ["A", "A", "B", "A", "C", "D"]
lst2 = [1, 2, 3, 4, 5, 6]

l1m, l2m = tally_by_group(lst1, lst2)
print(l1m)
print(l2m)

outputs:

['A', 'B', 'A', 'C', 'D']
[3, 3, 4, 5, 6]

And

lst3 = ["A", "A", "A", "B", "B", "A", "A"]
lst4 = [1, 2, 3, 4, 5, 6, 7]

l3m, l4m = tally_by_group(lst3, lst4)
print(l3m)
print(l4m)

outputs:

['A', 'B', 'A']
[6, 9, 13]
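
The same idea could be carried over to the layer data from your background note. The following is a rough sketch under several assumptions that you would need to check against your real data: each layer is a (name, top, bottom, transitivity) tuple, the main layer is the first two characters of the name, and "combining" transitivity means summing it. The helper names main_layer and merge_layers and all numbers are made up for illustration.

```python
from itertools import groupby

def main_layer(layer):
    # Assumption: the main layer code is the first two characters of the
    # name, e.g. "BXz1" -> "BX".
    return layer[0][:2]

def merge_layers(layers):
    merged = []
    for name, group in groupby(layers, key=main_layer):
        group = list(group)
        top = group[0][1]       # top of the first sublayer in the run
        bottom = group[-1][2]   # bottom of the last sublayer in the run
        total = sum(layer[3] for layer in group)  # assumed: values add up
        merged.append((name, top, bottom, total))
    return merged

# Hypothetical sample data, not taken from any real model.
layers = [
    ("BXz1", 0.0, -2.0, 10),
    ("BXz2", -2.0, -5.0, 20),
    ("KRz1", -5.0, -8.0, 5),
]
print(merge_layers(layers))
# [('BX', 0.0, -5.0, 30), ('KR', -5.0, -8.0, 5)]
```

Note that in this sketch a layer that is not merged with anything also gets the shortened main-layer name ("KR" rather than "KRz1").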

CodePudding user response：

If you're not allowed to use libraries, you could do it with a simple loop using zip() to pair up the keys and values.

lst3 = ["A", "A", "A", "B", "B", "A", "A"]
lst4 = [1, 2, 3, 4, 5, 6, 7]

new_lst3,new_lst4 = lst3[:1],[0] # initialize with first key
for k,n in zip(lst3,lst4):       # pair up keys and numbers
    if new_lst3[-1] != k:        # add new items if key changed
        new_lst3.append(k)
        new_lst4.append(0)
    new_lst4[-1] += n            # tally for current key
    
print(new_lst3) # ['A', 'B', 'A']
print(new_lst4) # [6, 9, 13]

If you're okay with libraries, groupby from itertools combined with an iterator on the keys will allow you to express it more concisely:

from itertools import groupby

tally = ((k,sum(n)) for i3 in [iter(lst3)] 
         for k,n in groupby(lst4,lambda _:next(i3)))
new_lst3,new_lst4 = map(list,zip(*tally))
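
Since the one-liner is fairly dense, here is the same approach spelled out. It is a sketch, not a different algorithm: the names key_iter and next_key are introduced only for readability. It relies on groupby calling the key function once per element, in order, so that each number in lst4 is matched with the key at the same position in lst3.

```python
from itertools import groupby

lst3 = ["A", "A", "A", "B", "B", "A", "A"]
lst4 = [1, 2, 3, 4, 5, 6, 7]

key_iter = iter(lst3)

def next_key(_value):
    # Ignore the number itself and return the key at the same position.
    return next(key_iter)

# Each group is summed immediately, before groupby advances to the next run.
tally = [(k, sum(numbers)) for k, numbers in groupby(lst4, next_key)]
new_lst3, new_lst4 = map(list, zip(*tally))

print(new_lst3)  # ['A', 'B', 'A']
print(new_lst4)  # [6, 9, 13]
```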