How to iterate over a list, modify it but avoid an error?

Time:10-29

I have two variable-length lists extracted from an Excel file. One holds the wagon numbers and the other the wagon weights, something like this:

wagon_list = [1234567, 2345678, 3456789, 4567890]
weight_list = [1.1, 2.2, 3.3, 4.4]

Sometimes wagon_list will contain a duplicate number. When that happens, I need to sum the weights for that wagon and remove the duplicate from both lists:

wagon_list = [1234567, 2345678, 2345678, 4567890]
weight_list = [1.1, 2.2, 3.3, 4.4]

should become:

wagon_list = [1234567, 2345678, 4567890]
weight_list = [1.1, 5.5, 4.4]

My first option was to pop items and sum them while iterating with a for loop. It didn't work because (after some research) you can't safely change a list you're iterating over. So I moved to the second option, using an auxiliary list. It doesn't work when it hits the last index. Even after some tweaking of my code, I can't find a solution.

I can see it would have further problems if the last three elements were to be added.

counter_3 = 0

for i in wagon_list:

    if i == wagon_list[-1]: #last entry, simply appends to the new list. This comes first because the next option returns error if running the last entry as i
        new_wagon_list.append(wagon_list[counter_3])
        new_weight_list.append(weight_list[counter_3])
        counter_3 += 2

    elif i != wagon_list[(counter_3 + 1)]: #if they are different, appends.
        new_wagon_list.append(wagon_list[counter_3])
        new_weight_list.append(weight_list[counter_3])
        counter_3 += 1

    elif i == wagon_list[(counter_3 + 1)]: #if equal to next item, appends the wagon and sums the weights
        new_wagon_list.append(wagon_list[counter_3])
        new_weight_list.append(weight_list[counter_3] + weight_list[counter_3 + 1])

This should return:

wagon_list = [1234567, 2345678, 4567890]
weight_list = [1.1, 5.5, 4.4]

But returns

wagon_list = [1234567, 2345678, 3456789, 3456789, 3456789]
weight_list = [1.1, 2.2, 7.7, 7.7, 3.3]

CodePudding user response:

Modifying a list that you're iterating over doesn't work out well. I'd zip the two lists together and use itertools.groupby:

>>> from itertools import groupby
>>> wagon_list = [1234567, 2345678, 2345678, 4567890]
>>> weight_list = [1.1, 2.2, 3.3, 4.4]
>>> wagon_list, weight_list = map(list, zip(*(
...     (wagon, sum(weight for _, weight in group))
...     for wagon, group in groupby(sorted(
...         zip(wagon_list, weight_list)
...     ), key=lambda t: t[0])
... )))
>>> wagon_list
[1234567, 2345678, 4567890]
>>> weight_list
[1.1, 5.5, 4.4]
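
To illustrate why modifying a list you're iterating over goes wrong, here is a minimal sketch (the list contents and variable names are made up for the demonstration, they are not from the question). Removing an element shifts the later items one position to the left, while the loop's internal index keeps advancing, so the element right after a removal is never visited. Iterating over a copy avoids that.

```python
# Removing during iteration: the second 2 is skipped, so it survives.
skipped = [1, 2, 2, 3]
for x in skipped:
    if x == 2:
        skipped.remove(x)  # shifts later items left; the loop index still advances

# Iterating over a copy: every original element is visited.
safe = [1, 2, 2, 3]
for x in list(safe):  # list(safe) is a snapshot taken before the loop starts
    if x == 2:
        safe.remove(x)

print(skipped)  # [1, 2, 3]
print(safe)     # [1, 3]
```
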

CodePudding user response:

Use a dictionary to combine the values:

In [1]: wagon_list = [1234567, 2345678, 2345678, 4567890]
   ...: weight_list = [1.1, 2.2, 3.3, 4.4]

In [2]: together = {}

In [3]: for k, v in zip(wagon_list, weight_list):
   ...:     together[k] = together.setdefault(k, 0) + v
   ...:     

In [4]: together
Out[4]: {1234567: 1.1, 2345678: 5.5, 4567890: 4.4}

In [6]: new_wagon_list = list(together.keys())
Out[6]: [1234567, 2345678, 4567890]

In [7]: new_weight_list = list(together.values())
Out[7]: [1.1, 5.5, 4.4]

CodePudding user response:

Here is a simple way, using defaultdict (hence the result is correct even if wagon_list is unordered). You could also use groupby but then you have to sort both lists so that duplicate wagons are consecutive.

This solution requires a single pass through the lists, and doesn't change the order of the lists. It just removes duplicate wagons and adds their weight.

from collections import defaultdict

def group_weights(wagon_list, weight_list):
    ww = defaultdict(float)
    for wagon, weight in zip(wagon_list, weight_list):
        ww[wagon] += weight

    return list(ww), list(ww.values())

Example

# set up MRE

wagon_list = [1234567, 2345678, 2345678, 4567890]
weight_list = [1.1, 2.2, 3.3, 4.4]

new_wagon_list, new_weight_list = group_weights(wagon_list, weight_list)

>>> new_wagon_list
[1234567, 2345678, 4567890]

>>> new_weight_list
[1.1, 5.5, 4.4]

Addendum

If you'd like to avoid defaultdict altogether, you can also simply do this (same result as above):

ww = {}
for k, v in zip(wagon_list, weight_list):
    ww[k] = ww.get(k, 0) + v
new_wagon_list, new_weight_list = map(list, zip(*(ww.items())))
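
As a rough illustration of the point above about groupby needing duplicates to be consecutive, the sketch below runs it on a deliberately unordered input (the sample data is made up for the demonstration). Without sorting, the repeated wagon ends up in two separate groups; after sorting, it collapses into one, but the original order is lost.

```python
from itertools import groupby

# Hypothetical unordered input: the duplicate wagon is not adjacent.
wagons = [2345678, 1234567, 2345678]
weights = [2.2, 1.1, 3.3]

def summed_groups(pairs):
    # groupby only merges *consecutive* items that share a key
    return [
        (wagon, sum(weight for _, weight in group))
        for wagon, group in groupby(pairs, key=lambda t: t[0])
    ]

unsorted_groups = summed_groups(zip(wagons, weights))
sorted_groups = summed_groups(sorted(zip(wagons, weights)))

print(len(unsorted_groups))  # 3: wagon 2345678 appears twice
print(len(sorted_groups))    # 2: duplicates merged, order changed by sorting
```
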

CodePudding user response:

No fluff, frills, dependency or mystery version. Either an index for the current wagon is going to be found, allowing us to pinpoint the weight index to modify or no index is found and we append both of the new values.

Your entire problem revolves around "Does this already exist?". With a list, we can answer that question with its index method. index raises a ValueError if the value is not found, so we wrap the call in try and treat except as an else.

def wagon_filter(wagons:list, weights:list) -> tuple:
    #pre-zip and clear so we can reuse the references
    data   = zip(wagons, weights)
    wagons, weights = [], []

    #reassign
    for W, w in data:
        try:      #(W)agon exists? modify its (w)eight index
            i = wagons.index(W)
            weights[i] += w
        except ValueError:   #else append new (W)agon and (w)eight
            wagons.append(W)
            weights.append(w)

    return wagons, weights

usage:

#data
wagons  = [1234567, 2345678, 2345678, 4567890]
weights = [1.1, 2.2, 3.3, 4.4]

#print filter results
print(*wagon_filter(wagons, weights), sep='\n')

#[1234567, 2345678, 4567890]
#[1.1, 5.5, 4.4]
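
One thing to keep in mind: list.index scans the list from the start on every call, so the function above does more work as the number of distinct wagons grows. If that ever matters, a possible variant (just a sketch; wagon_filter_indexed and position are names invented here, not part of the answer above) remembers each wagon's position in a dict while keeping the same first-seen order.

```python
def wagon_filter_indexed(wagons, weights):
    # Hypothetical variant: map each wagon to its slot in the output lists
    position = {}
    out_wagons, out_weights = [], []
    for wagon, weight in zip(wagons, weights):
        i = position.get(wagon)
        if i is None:
            # first time we see this wagon: remember where it lands
            position[wagon] = len(out_wagons)
            out_wagons.append(wagon)
            out_weights.append(weight)
        else:
            # seen before: add to the weight already stored at that slot
            out_weights[i] += weight
    return out_wagons, out_weights

result_wagons, result_weights = wagon_filter_indexed(
    [1234567, 2345678, 2345678, 4567890],
    [1.1, 2.2, 3.3, 4.4],
)
print(result_wagons, result_weights, sep='\n')
```
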