Home > Software engineering >  Forcing evaluation of prefetched results in django queries
Forcing evaluation of prefetched results in django queries

Time:08-04

I'm trying to use select_related/prefetch_related to optimize some queries. However I have issues in "forcing" the queries to be evaluated all at once.

Say I'm doing the following:

fp_query = Fourprod.objects.filter(choisi=True).select_related("fk_fournis")    
pf = Prefetch("fourprod", queryset=fp_query)                # 
products = Products.objects.filter(id__in=fp_query).prefetch_related(pf)

With models:

class Fourprod(models.Model):
    fk_produit = models.ForeignKey(to=Produit, related_name="fourprod")
    fk_fournis = models.ForeignKey(to=Fournis,related_name="fourprod")
    choisi = models.BooleanField(...)


class Produit(models.Model):
    ... ordinary fields... 

class Fournis(models.Model):
    ... ordinary fields... 

So essentially, Fourprod has a fk to Fournis, Produit, and I want to prefetch those when I build the Produits queryset. I've checked in debug that the prefetch actually occurs and it does.

I have a bunch of fields from different models I need to use to compute results. I don't really control the table structure, so I have to work with this. I can't come up with a reasonable query to do it all with the queries (or using raw), so I want to compute stuff python-side. It's a few 1000 objects, so reasonable to do in-memory. So I cast to a list to force the query evaluation:

products = list(products)

At this point, I would think that the Products and the related objects that I have pre-fetched should have been fetched from the DB. In the logs, just after the list() call, I get this:

02/08/22 15:21:08 DEBUG DEFAULT: (0.019) SELECT "products_fourprod"."id", "products_fourprod"."fk_produit_id", "products_fourprod"."fk_fournis_id", "products_fourprod"."choisi", "products_fourprod"."code_four", "products_fourprod"."prix", "products_fourprod"."comment", "products_fournis"."id", "products_fournis"."fk_user_create_id", "products_fournis"."nom", "products_fournis"."adresse", "products_fournis"."ville", "products_fournis"."tel", "products_fournis"."fax", "products_fournis"."contact", "products_fournis"."note", "products_fournis"."pays", "products_fournis"."province", "products_fournis"."postal", "products_fournis"."monnaie", "products_fournis"."tel_long", "products_fournis"."inactif", "products_fournis"."inuse", "products_fournis"."par", "products_fournis"."fk_langue", "products_fournis"."NOTE2" FROM "products_fourprod" LEFT OUTER JOIN "products_fournis" ON ("products_fourprod"."fk_fournis_id" = "products_fournis"."id") WHERE ("products_fourprod"."choisi" AND "products_fourprod"."fk_produit_id" IN (... all Product.id meeting the conditions...)

But then, the list comprehension using the products takes forever to complete:

rows = [[p.id, p.fourprod.first().id, p.desuet, p.no_prod, ... ] for p in products]  

With apparently each single call to p.fourprod resulting in a DB hit:

02/08/22 15:26:19 DEBUG DEFAULT: (0.000) SELECT "products_fourprod"."id", "products_fourprod"."fk_produit_id", "products_fourprod"."fk_fournis_id", "products_fourprod"."choisi", "products_fourprod"."code_four", "products_fourprod"."prix", "products_fourprod"."comment", "products_fournis"."id", "products_fournis"."fk_user_create_id", "products_fournis"."nom", "products_fournis"."adresse", "products_fournis"."ville", "products_fournis"."tel", "products_fournis"."fax", "products_fournis"."contact", "products_fournis"."note", "products_fournis"."pays", "products_fournis"."province", "products_fournis"."postal", "products_fournis"."monnaie", "products_fournis"."tel_long", "products_fournis"."inactif", "products_fournis"."inuse", "products_fournis"."par", "products_fournis"."fk_langue", "products_fournis"."NOTE2" FROM "products_fourprod" LEFT OUTER JOIN "products_fournis" ON ("products_fourprod"."fk_fournis_id" = "products_fournis"."id") WHERE ("products_fourprod"."choisi" AND "products_fourprod"."fk_produit_id" = 1185) ORDER BY "products_fourprod"."id" ASC LIMIT 1; args=(1185,)
02/08/22 15:26:19 DEBUG DEFAULT: (0.000) SELECT "products_fourprod"."id", (.... more similar db hits... )

If I remove all the uses of related objects, then the list() call has actually forced the db hit already and the query executes quickly.

So.... if simply calling products = list(products) does not force the db to be queried for the prefetched objects as well, is there any ways I can make django's orm do so?

CodePudding user response:

From the docs:

Remember that, as always with QuerySets, any subsequent chained methods which imply a different database query will ignore previously cached results, and retrieve data using a fresh database query.

first() implies a database query, so that will cause your query to not use the prefetched values.

Try to use p.fourprod.all()[0] instead to access the first related fourprod instead.

  • Related