I'm trying to optimize the fired queries of an API. I have four models namely User, Content, Rating, and UserRating with some relations to each other. I want the respective API returns all of the existing contents alongside their rating count as well as the score given by a specific user to that.
I used to do something like this: Content.objects.all()
as a queryset, but I realized that in the case of having a huge amount of data tons of queries will be fired. So I've done some efforts to optimize the fired queries using select_related()
and prefetch_related()
. However, I'm dealing with an extra python searching, that I hope to remove that, using a controlled prefetch_related()
— applying a filter just for a specific prefetch
in a nested prefetch
and select
.
Here are my models:
from django.db import models
from django.conf import settings
class Content(models.Model):
title = models.CharField(max_length=50)
class Rating(models.Model):
count = models.PositiveBigIntegerField(default=0)
content = models.OneToOneField(Content, on_delete=models.CASCADE)
class UserRating(models.Model):
user = models.ForeignKey(
settings.AUTH_USER_MODEL, blank=True, null=True, on_delete=models.CASCADE
)
score = models.PositiveSmallIntegerField()
rating = models.ForeignKey(
Rating, related_name="user_ratings", on_delete=models.CASCADE
)
class Meta:
unique_together = ["user", "rating"]
Here's what I've done so far:
contents = (
Content.objects.select_related("rating")
.prefetch_related("rating__user_ratings")
.prefetch_related("rating__user_ratings__user")
)
for c in contents: # serializer like
user_rating = c.rating.user_ratings.all()
for u in user_rating: # how to remove this dummy search?
if u.user_id == 1:
print(u.score)
Queries:
(1) SELECT "bitpin_content"."id", "bitpin_content"."title", "bitpin_rating"."id", "bitpin_rating"."count", "bitpin_rating"."content_id" FROM "bitpin_content" LEFT OUTER JOIN "bitpin_rating" ON ("bitpin_content"."id" = "bitpin_rating"."content_id"); args=(); alias=default
(2) SELECT "bitpin_userrating"."id", "bitpin_userrating"."user_id", "bitpin_userrating"."score", "bitpin_userrating"."rating_id" FROM "bitpin_userrating" WHERE "bitpin_userrating"."rating_id" IN (1, 2); args=(1, 2); alias=default
(3) SELECT "users_user"."id", "users_user"."password", "users_user"."last_login", "users_user"."is_superuser", "users_user"."first_name", "users_user"."last_name", "users_user"."email", "users_user"."is_staff", "users_user"."is_active", "users_user"."date_joined", "users_user"."user_name" FROM "users_user" WHERE "users_user"."id" IN (1, 4); args=(1, 4); alias=default
As you can see on the above fired queries I've just only three queries rather than too many queries which were happening in the past. However, I guess I can remove the python searching (second for
iteration) using a filter on my latest query — users_user"."id" IN (1,)
instead. According to this post and my efforts, I couldn't apply a .filter(rating__user_ratings__user_id=1)
on the third query. Actually, I couldn't match my problem using Prefetch(..., queryset=...)
instance given in this answer.
CodePudding user response:
I think you are looking for Prefetch object: https://docs.djangoproject.com/en/4.0/ref/models/querysets/#prefetch-objects
Try this:
from django.db.models import Prefetch
contents = Content.objects.select_related("rating").prefetch_related(
Prefetch(
"rating__user_ratings",
queryset=UserRating.objects.filter(user__id=1),
to_attr="user_rating_number_1",
)
)
for c in contents: # serializer like
print(c.rating.user_rating_number_1[0].score)