How to code for the value of "month" which produced the highest profit in a year?

Time:11-20

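Out of all the months in the year, I need to find the month with the largest total balance, i.e. the month whose "amount" values add up to the most (with the data below that is May, whose "amount" values sum to 630; June sums to 200 and August to 30).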

lst = [
    {'account': 'x\\*', 'amount': 300, 'day': 3, 'month': 'June'},
    {'account': 'y\\*', 'amount': 550, 'day': 9, 'month': 'May'},
    {'account': 'z\\*', 'amount': -200, 'day': 21, 'month': 'June'},
    {'account': 'g', 'amount': 80, 'day': 10, 'month': 'May'},
    {'account': 'x\\*', 'amount': 30, 'day': 16, 'month': 'August'},
    {'account': 'x\\*', 'amount': 100, 'day': 5, 'month': 'June'},
]

The problem is that both the "amount" and the month name are stored as values in each dictionary, so the amounts cannot be looked up by month directly.

I managed to compute the total for individual months, but I need to use a for loop to find the month with the highest total "amount".

My attempt:

# Sum the 'amount' of every record whose 'month' matches
get_sum = lambda my_list, month: sum(
    d['amount'] for d in my_list if d['month'] == month
)
total_June = get_sum(lst, 'June')
total_August = get_sum(lst, 'August')

CodePudding user response:

A simple solution with pandas.

import pandas as pd

lst = [
    {'account': 'x\\*', 'amount': 300, 'day': 3, 'month': 'June'},
    {'account': 'y\\*', 'amount': 550, 'day': 9, 'month': 'May'},
    {'account': 'z\\*', 'amount': -200, 'day': 21, 'month': 'June'},
    {'account': 'g', 'amount': 80, 'day': 10, 'month': 'May'},
    {'account': 'x\\*', 'amount': 30, 'day': 16, 'month': 'August'},
    {'account': 'x\\*', 'amount': 100, 'day': 5, 'month': 'June'},
]

# convert list of dictionaries to dataframe
df = pd.DataFrame(lst)

# Sum the amounts per month; the result is a Series indexed by month name.
totals = df.groupby('month')['amount'].sum()

# idxmax returns the index label of the largest total, i.e. the month.
best_month = totals.idxmax()

# Print the month and its total in a plain list
print([best_month, int(totals.loc[best_month])])
['May', 630]

Note that pandas is a heavy dependency to add to a project. That said, it is commonly imported anyway for data science or data manipulation tasks. Pierre D's solutions here are definitely faster.

CodePudding user response:

One possibility (among many):

from itertools import groupby
from operator import itemgetter

mo_total = {
    k: sum([d.get('amount', 0) for d in v])
    for k, v in groupby(sorted(lst, key=itemgetter('month')), key=itemgetter('month'))
}
>>> mo_total
{'August': 30, 'June': 200, 'May': 630}

>>> max(mo_total.items(), key=lambda kv: kv[1])
('May', 630)

Without itemgetter:

bymonth = lambda d: d.get('month')
mo_total = {
    k: sum([d.get('amount', 0) for d in v])
    for k, v in groupby(sorted(lst, key=bymonth), key=bymonth)
}
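
A note on why both versions sort first: groupby only merges consecutive items that share a key, so an unsorted list would give the same month more than once. The sketch below shows the difference on a shortened copy of the question's data; the names records and by_month are made up for this illustration.

```python
from itertools import groupby

# Shortened copy of the question's data (only the keys that matter here).
records = [
    {'amount': 300, 'month': 'June'},
    {'amount': 550, 'month': 'May'},
    {'amount': -200, 'month': 'June'},
]

def by_month(d):
    return d['month']

# Without sorting, 'June' appears as two separate groups.
unsorted_keys = [k for k, _ in groupby(records, key=by_month)]

# After sorting by the same key, each month forms exactly one group.
sorted_keys = [k for k, _ in groupby(sorted(records, key=by_month), key=by_month)]

print(unsorted_keys)  # ['June', 'May', 'June']
print(sorted_keys)    # ['June', 'May']
```

In a dict comprehension the second 'June' group would silently overwrite the first, so the total would be wrong without the sort.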

Yet another way, using defaultdict:

from collections import defaultdict

tot = defaultdict(int)

for d in lst:
    tot[d['month']] += d.get('amount', 0)
>>> tot
defaultdict(int, {'June': 200, 'May': 630, 'August': 30})

>>> max(tot, key=lambda k: tot[k])
'May'
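
If a plain for loop is required, as the question mentions, the same idea can be written without any imports. This is only a minimal sketch: the list is copied from the question with the 'account' and 'day' keys left out, and the names totals, best_month and best_total are chosen here for illustration.

```python
# Data from the question, reduced to the two keys the calculation needs.
lst = [
    {'amount': 300, 'month': 'June'},
    {'amount': 550, 'month': 'May'},
    {'amount': -200, 'month': 'June'},
    {'amount': 80, 'month': 'May'},
    {'amount': 30, 'month': 'August'},
    {'amount': 100, 'month': 'June'},
]

# First loop: accumulate the total amount per month.
totals = {}
for d in lst:
    month = d['month']
    totals[month] = totals.get(month, 0) + d['amount']

# Second loop: keep the month with the largest total seen so far.
best_month = None
best_total = None
for month, total in totals.items():
    if best_total is None or total > best_total:
        best_month, best_total = month, total

print(best_month, best_total)  # May 630
```

If two months tie for the highest total, this keeps the one that appears first in the list.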