Home > Back-end >  How can I get the average score calculation to be less than 0.01 seconds?
How can I get the average score calculation to be less than 0.01 seconds?

Time:08-18

I created a function to average the comments in TVAndMovie. However, when I created 100000 comments, it took an average of 4 seconds to get the average of the scores! I have been trying to find a way to get the average of the scores. How can I reduce the time to less than 0.01 seconds?

class TVAndMovie(models.Model):
    tmdb_id = models.IntegerField(
        verbose_name="",
        blank=False,
        null=False,
    )
    
    judge_tv_or_movie = models.CharField(
        blank=False, null=False, default="movie", max_length=20
    )
    stars = models.FloatField(
        blank=False,
        null=False,
        default=0,
        validators=[MinValueValidator(0.0), MaxValueValidator(10.0)],
    )

    def get_comments(self) -> object:
        return Comment.objects.filter(
            tv_or_movie_id=self.id
        )

    def average_stars(self) -> float:
        comments = self.get_comments()
        n_comments = comments.count()
        if n_comments:
            self.stars = round(
                sum([comment.stars for comment in comments]) / n_comments, 3
            )
        else:
            self.stars = 0
        self.save()
        return self.stars

class Comment(models.Model):

    comment = models.TextField(max_length=1000)
    stars = models.FloatField(
        blank=False,
        null=False,
        default=0,
        validators=[MinValueValidator(0.0), MaxValueValidator(10.0)],
    )
    user = models.ForeignKey(CustomUser, on_delete=models.CASCADE)
    tv_or_movie = models.ForeignKey(TVAndMovie, on_delete=models.CASCADE)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    class Meta:
        unique_together = ("user", "tv_or_movie")
        indexes = [models.Index(fields=["user", "tv_or_movie"])]

CodePudding user response:

So I decided to debug your code and measure performance - since I use faster computer (for 100k records it executes the average_stars method in less than seconds) I created 500k Comments. Then this method executes in around 4.5s.

I measured the performance using debug view:

class IndexView(TemplateView):

    def get(self, *args, **kwargs):
        tv_or_movie = TVAndMovie.objects.first()
        start = time.time()
        stars = tv_or_movie.average_stars()
        end = time.time()
        return JsonResponse({
            'finish': end-start,
            'rate': stars,
        })

Then, I tried to write optimized code - instead of calling direct query I decided to use _set directive and then cast stars values to list using values_list method.

This way I got 0.2s execution time (performance). Code:

    def average_stars(self) -> float:
        get_stars = list(self.comment_set.values_list('stars', flat=True))
        if get_stars:
            self.stars = round(
                sum(get_stars) / len(get_stars), 3
            )
        self.save()
        return self.stars

Generally using .count() is not bad, but len() is faster, and you can execute it on list that you got while caching stars values to said list.

Also I think this is the fastest AND simplest way possible to calculate average rating, you will probably not get less than 0.01s performance, but I think it's enough for you.

If you have any questions reply to me in comments.

--- UPDATE ---

I forgot that Django also supports Sum in Django ORM. So basically you can simply calculate the average using this logic:

    def average_stars(self) -> float:
        count = self.comment_set.count()
        if count:
            get_sum = self.comment_set.aggregate(Sum('stars'))
            self.stars = round(
                get_sum['stars__sum'] / count, 3
            )
        self.save()
        return self.stars

Which is double time faster, executing in around 0.1s.

  • Related