Home > Mobile >  Speed up summation using caching in Python
Speed up summation using caching in Python

Time:07-30

In each of the 10000 iterations, we select the player with the least total score to play a game, where he will earn a random score.

I think the slowest part of the code in each iteration appears to be the summing of all the players score.

sum(g.score for g in players[x])

Since only one player's score is updated in each iteration, it seems that caching may help speed up the current O(N) implementation into a O(1).

What is a good way to make use of caching? Thank you!

import random


class Game:
    def __init__(self, score):
        self.score = score


players = {f"player{i}": [] for i in range(8)}

for i in range(10000):
    player = min(players, key=lambda x: sum(g.score for g in players[x]))
    players[player].append(Game(random.randint(0, 10)))

for player, games in players.items():
    print(player, ":", sum(g.score for g in games))

CodePudding user response:

If you encapsulate the data and behavior in a Player class you can let that class keep track of the paperwork around totals so the caller of the class doesn't need to know about it. This lets you have code that's easy to ready and efficient at the expense of being a bit longer:

import random

class Game:
    def __init__(self, score):
        self.score = score

class Player:
    def __init__(self, name):
        self.name = name
        self.scores = []
        self.total = 0
        
    def add_game(self, game):
        self.scores.append(game)
        self.total  = game.score


players = [Player(f"player{i}") for i in range(8)]


for i in range(10000):
    player = min(players, key=lambda player: player.total)
    player.add_game(Game(random.randint(0, 10)))

for player in players:
    print(player.name, ":", player.total)

This prints with no noticeable delay:

player0 : 6280
player1 : 6281
player2 : 6283
player3 : 6280
player4 : 6279
player5 : 6283
player6 : 6277
player7 : 6283

CodePudding user response:

One solution

One approach is to use a dictionary to keep track of the total score:

players = {f"player{i}": [] for i in range(8)}
total_player_score = dict.fromkeys(players, 0)

for i in range(10000):
    player = min(total_player_score, key=total_player_score.get)
    game = Game(random.randint(0, 10))
    players[player].append(game)
    total_player_score[player]  = game.score

for player, games in players.items():
    print(player, ":", sum(g.score for g in games))

Output (of a single run)

player0 : 6252
player1 : 6250
player2 : 6247
player3 : 6247
player4 : 6244
player5 : 6248
player6 : 6250
player7 : 6245

An even faster solution (O(mlogn))

On top of that, I suggest you use a heap, to keep track of the minimum:

import heapq

players = {f"player{i}": [] for i in range(8)}
total_player_score = [(0, player) for player in players]
heapq.heapify(total_player_score)

for i in range(10000):
    score, player = heapq.heappop(total_player_score)
    game = Game(random.randint(0, 10))
    players[player].append(game)
    heapq.heappush(total_player_score, (score   game.score, player))

This second approach converts the code from O(m*n) to O(m*logn) where n is the number of players and m the number of iterations (range(1000) in the code)

  • Related