Efficiently calculate averages of keys in dictionary

Time:06-04

I have a dictionary that looks like this:

{
    'TICKET_URL/3250': {'cycle_time': 0, 'lead_time': 2496441},
    'TICKET_URL/3323': {'cycle_time': 346087, 'lead_time': 508469},
    'TICKET_URL/3328': {'cycle_time': 249802, 'lead_time': 521211},
    'TICKET_URL/3352': {'cycle_time': 504791, 'lead_time': 504791},
    'TICKET_URL/3364': {'cycle_time': 21293, 'lead_time': 21293},
    'TICKET_URL/3367': {'cycle_time': 102558, 'lead_time': 189389},
    'TICKET_URL/3375': {'cycle_time': 98735, 'lead_time': 98766}
}

How can I efficiently calculate the average cycle_time and lead_time (independently)? Right now I'm iterating over the dictionary twice, once for cycle_time and once for lead_time. Can I do this in a single pass?

Currently:

average_cycle = (
    sum([story["cycle_time"] for story in stories.values()]) / len(stories)
)

CodePudding user response:

If you don't mind pandas,

import pandas as pd

pd.DataFrame(stories.values()).mean().to_dict()

will produce:

{'cycle_time': 189038.0, 'lead_time': 620051.4285714285}

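As a bonus, it handles missing values gracefully: mean() skips NaN by default, so a ticket that lacks one of the keys is simply left out of that column's average.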

CodePudding user response:

count = len(stories)
cycle_total = 0
lead_total = 0
# Accumulate both totals in a single pass; a missing key counts as 0.
for story in stories.values():
    cycle_total += story.get("cycle_time", 0)
    lead_total += story.get("lead_time", 0)

cycle_avg = cycle_total / count
lead_avg = lead_total / count
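
The loop above can be generalised so the field names are not hard-coded. What follows is only a minimal sketch, not part of the original answer: average_fields is a hypothetical helper name, and it makes a different assumption from the loop above. Instead of counting a missing key as 0, it skips tickets that lack a field, so each average is taken over the tickets that actually have a value.

```python
from collections import defaultdict


def average_fields(stories, fields=("cycle_time", "lead_time")):
    """Average each field over the stories that contain it, in one pass."""
    totals = defaultdict(int)  # running sum per field
    counts = defaultdict(int)  # number of stories that had the field
    for story in stories.values():
        for field in fields:
            if field in story:
                totals[field] += story[field]
                counts[field] += 1
    # Fields that never appeared are left out rather than dividing by zero.
    return {f: totals[f] / counts[f] for f in fields if counts[f]}


# Sample data copied from the question.
stories = {
    "TICKET_URL/3250": {"cycle_time": 0, "lead_time": 2496441},
    "TICKET_URL/3323": {"cycle_time": 346087, "lead_time": 508469},
    "TICKET_URL/3328": {"cycle_time": 249802, "lead_time": 521211},
    "TICKET_URL/3352": {"cycle_time": 504791, "lead_time": 504791},
    "TICKET_URL/3364": {"cycle_time": 21293, "lead_time": 21293},
    "TICKET_URL/3367": {"cycle_time": 102558, "lead_time": 189389},
    "TICKET_URL/3375": {"cycle_time": 98735, "lead_time": 98766},
}

averages = average_fields(stories)
```

Whether a missing key should count as 0 or be skipped depends on what the data means; if 0 is the right default, the simpler loop with story.get(..., 0) is enough.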

CodePudding user response:

I don't really see an issue with your current implementation. I would suggest adding an avg() function to improve readability:

def avg(x):
    return sum(x) / len(x)

avg_cycle = avg([story['cycle_time'] for story in stories.values()])
avg_lead_time = avg([story['lead_time'] for story in stories.values()])
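
If Python 3.8 or later is available, the standard library's statistics.fmean could stand in for a hand-written avg(). This is a minimal sketch, assuming stories has the shape shown in the question (only its first two tickets are reproduced here). Note that it is still one pass per field, not a single pass.

```python
from statistics import fmean

# First two tickets from the question, used as sample data.
stories = {
    "TICKET_URL/3250": {"cycle_time": 0, "lead_time": 2496441},
    "TICKET_URL/3323": {"cycle_time": 346087, "lead_time": 508469},
}

# Generator expressions avoid building an intermediate list per field.
avg_cycle = fmean(story["cycle_time"] for story in stories.values())
avg_lead_time = fmean(story["lead_time"] for story in stories.values())
```

fmean always returns a float and raises StatisticsError on empty input, where the hand-written avg() would raise ZeroDivisionError.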