Home > Mobile >  How can I find the sum of only certain values in a list of lists AND only when a certain string is i
How can I find the sum of only certain values in a list of lists AND only when a certain string is i

Time:11-19

I think I didn't express myself very well in the title, but basically here's what I need to do. I have a veeeery big list of lists containing at index 1 the name of the comic, index 2 the unit price, index 3 the quantity sold and index 4 the total paid.

[['1', 'Tintin', '9.95', '3', '29.85'], ['2', 'Asterix', '12.5', '3', '37.5'], ['3', 'Asterix', '12.5', '3', '37.5'], ['4', 'Asterix', '12.5', '2', '25']

And I need to find the sum of the units sold and the total money paid. For example, here Asterix would be:

['Asterix', 12.5, 8, 100]

Any ideas?

CodePudding user response:

There are already good answers, just to post an alternative for larger cases, you can consider using pandas as well:

import pandas as pd

purchase_list = [['1', 'Tintin', '9.95', '3', '29.85'], ['2', 'Asterix', '12.5', '3', '37.5'], ['3', 'Asterix', '12.5', '3', '37.5'], ['4', 'Asterix', '12.5', '2', '25']]

purchase_list = [(int(pid), name, float(price), int(count), float(total)) for pid, name, price, count, total in purchase_list]

df = pd.DataFrame.from_records(my_list, columns = ['id', 'Name', 'price', 'count', 'total'])

So now the data is converted into a pandas.DataFrame.

   id     Name  price  count  total
0   1   Tintin   9.95    3.0  29.85
1   2  Asterix  12.50    3.0  37.50
2   3  Asterix  12.50    3.0  37.50
3   4  Asterix  12.50    2.0  25.00

In order to group the rows, we can define what functions we want to use for grouping and then use that criteria for merging:

d = {'price': 'first', 'Name': 'first', 'count': 'sum' ,'total': 'sum'}
df_grouped = df.groupby('Name').aggregate(d)

Output:

         price     Name  count   total
Name                                  
Asterix  12.50  Asterix    8.0  100.00
Tintin    9.95   Tintin    3.0   29.85

CodePudding user response:

data  = [['1', 'Tintin', '9.95', '3', '29.85'], ['2', 'Asterix', '12.5', '3', 
        '37.5'], ['3', 'Asterix', '12.5', '3', '37.5'], ['4', 'Asterix', '12.5', 
        '2', '25']]
store = {}

for i in data:
    if i[1] not in store:
        store[i[1]] = ['',0,0,0]
    store[i[1]][0] = i[1]
    store[i[1]][1] = i[2]
    store[i[1]][2]  = float(i[3])
    store[i[1]][3]  = float(i[3])
print(list(store.values()))

CodePudding user response:

Make a function that takes in a given name and some data and iterates over it performing the logic you described. Just make sure to coerce to proper number types prior to performing addition.

def stats(comic_data, name):
    unit_price = None
    num_sold = 0
    revenue = 0
    for comic in comic_data:
        _, cname, unit_p, num, amnt = comic
        if cname == name:
           if unit_price is None:
               unit_price = float(unit_p)
           num_sold  = int(num)
           revenue  = float(amnt)
    return [name, unit_price, num_sold, round(revenue, 2)]

stats(data, "Asterix")
>> ['Asterix', 12.5, 8, 100.0]

CodePudding user response:

Sounds like what you actually have are tuples, and what they actually are is purchase orders (or perhaps invoices?) Model those, then.

from dataclasses import dataclass
# for nice translations between the string price and the int price
from decimal import Decimal

@dataclass
class PurchaseOrder:
    name: str
    price: int  # in cents
    quantity: int
    total_price: int  # should be price * quantity

    @classmethod
    def from_tuple(cls, tup):
        _, name, price, quantity, total_price = tup
        price = int(Decimal(price) * 100)
        quantity = int(quantity)
        total_price = int(Decimal(total_price) * 100)
        
        return cls(name, price, quantity, total_price)

raw_purchase_orders = [['1', 'Tintin', '9.95', '3', '29.85'], ['2', 'Asterix', '12.5', '3', '37.5'], ['3', 'Asterix', '12.5', '3', '37.5'], ['4', 'Asterix', '12.5', '2', '25']]

# Populate purchase orders
purchase_orders = []
for raw_po in raw_purchase_orders:
    try:
        po = PurchaseOrder.from_tuple(raw_po)
    except ValueError:  # not the right format, not enough values to unpack etc
        pass
    except TypeError:  # wrong types in wrong places
        pass
    else:
        purchase_orders.append(po)

Now we have a proper model of what we're seeing, and we can do some work based on those fields. Let's use sorting and itertools.groupby to get this in a reasonable format.

from itertools import groupby
from operator import attrgetter

groups = groupby(sorted(purchase_orders, key=attrgetter('name')), attrgetter('name'))

aggregate_purchase_orders = []
for groupname, group in groups:
    dataset = list(group)  # otherwise we can only iterate once
    name = groupname
    price = sum(po.price for po in dataset)//len(dataset)  # average price
    quantity = sum(po.quantity for po in dataset)
    total_price = sum(po.total_price for po in dataset)
    po = PurchaseOrder(name, price, quantity, total_price)
    aggregate_purchase_orders.append(po)

print(aggregate_purchase_orders)
# see output like:
# [
#   PurchaseOrder(name='Asterix', price=1250, quantity=8, total_price=10000),
#   PurchaseOrder(name='Tintin', price=995, quantity=3, total_price=2985)
# ]
  • Related