Let's say that I have model
like:
class User(AbstractUser):
...
seller = models.ForeignKey(Seller, related_name="user", on_delete=models.SET_NULL, null=True)
...
And I trying to get seller
email address using this code:
from app.models import User
from django_print_sql import print_sql
with print_sql(count_only=False):
users = User.objects.filter(is_active=True, seller_id__isnull=False).select_related().only('seller__email')
for u in users.iterator():
email = u.seller.email
send_email(email)
In this case I can see SQL queries like:
SELECT `user`.`id`,
`user`.`seller_id`
FROM `user`
WHERE (`user`.`is_active`
AND `user`.`seller_id` IS NOT NULL)
...
SELECT `seller`.`id`,
...
`seller`.`email`,
...
FROM `seller`
WHERE `seller`.`id` = 1
...
The problem is Django ORM accessing DB at every iteration (Select seller... where seller.id = ...
). So that will be too many queries (== DB connections) if we have many sellers.
In other way it is possible to replace only
with values
:
from app.models import User
from django_print_sql import print_sql
with print_sql(count_only=False):
users = User.objects.filter(is_active=True, seller_id__isnull=False).select_related().values('seller__email')
for u in users.iterator():
email = u['seller__email']
send_email(email)
And I can see SQL query like:
SELECT `seller`.`email`
FROM `user`
INNER JOIN `seller` ON (`user`.`seller_id` = `seller`.`id`)
WHERE (`user`.`is_active`
AND `user`.`seller_id` IS NOT NULL)
It is little bit better and we able to get emails by single DB query, but iterator
is useless here because we load all emails once inside dict
object and whole data exists inside memory.
The question is
Is it possible to iterate by data chunks (Let say by 500 emails/single query) without create manual sub-loops limiting User.objects.filter
query count? Or in other words, what is most effective iterator in this case?
CodePudding user response:
Looks like select_related
is working for this case:
from app.models import User
from django_print_sql import print_sql
with print_sql(count_only=False):
users = User.objects.filter(is_active=True, seller_id__isnull=False).select_related('seller').only('seller__email').exclude(seller__email__exact='')
for u in users.iterator():
send_email(u.seller.email)
SELECT `user`.`id`,
`user`.`owner_id`,
`seller`.`id`,
`seller`.`email`
FROM `user`
INNER JOIN `seller` ON (`user`.`seller_id` = `seller`.`id`)
WHERE (`user`.`is_active`
AND `user`.`seller_id` IS NOT NULL
AND NOT (`seller`.`email` =
AND `seller`.`email` IS NOT NULL))
CodePudding user response:
In Your case You give no parameters for select_related
, I suggest You to put seller
field to select_related
like this:
from app.models import User
from django_print_sql import print_sql
with print_sql(count_only=False):
users = User.objects.select_related('seller').filter(is_active=True, seller_id__isnull=False).only('seller__email')
for u in users.iterator():
email = u.seller.email
send_email(email)
putting select_related
before objects
manager. By doing this django retreives seller model records as a field for every iterated user and will not hit the database in every iteration.