django queryset filter with back reference

Time:12-07

I'm a C developer and new to Python, and I'm following the Django tutorial.
I want to know how to filter a queryset by information about its back references (reverse relations).
Below are my models.

# models.py

import datetime
from django.db import models
from django.utils import timezone

class Question(models.Model):
  pub_date = models.DateTimeField('date published')

class Choice(models.Model): 
  question = models.ForeignKey(Question, on_delete=models.CASCADE)

Now I want a queryset of Question objects whose pub_date is in the past AND that are referenced by at least one Choice. The second condition is what causes my problem.
Below is what I tried.

# First method
question_queryset = Question.objects.filter(pub_date__lte=timezone.now())
for q in question_queryset.iterator():
  if Choice.objects.filter(question=q.pk).count() == 0: 
    print(q)
    # It works. Only @Question which is not referenced 
    # by any @Choice is printed. 
    # But how can I exclude @q in @question_queryset?
    
# Second method
question_queryset = Question.objects.filter(pub_date__lte=timezone.now()
  & Choice.objects.filter(question=pk).count()>0) # Not works.
# NameError: name 'pk' is not defined
# How can I use @pk as rvalue in @Question.objects.filter context?

Is it because I'm not familiar with Python syntax? Or is my approach to the data itself wrong? Do you have any good ideas for solving my problem without changing the model?

=======================================

Edit: I just found a way to make the first method work.

# First method
question_queryset = Question.objects.filter(pub_date__lte=timezone.now())
for q in question_queryset.iterator():
  if Choice.objects.filter(question=q.pk).count() == 0: 
    question_queryset = question_queryset.exclude(pk=q.pk)

A new concern arises: if the number of @Question rows is n and the number of @Choice rows is m, the method above takes O(n * m) time, right? Is there any way to improve performance? Is my way of handling the data the problem, or is it the structure of the data?

CodePudding user response:

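The Django documentation describes how to follow relationships backwards. In filters and annotations, the reverse side of the ForeignKey is referred to by the lowercase model name, here 'choice'. The following query yields the same result:

from django.db.models import Count
from django.utils import timezone

queryset = (Question.objects
                    .filter(pub_date__lte=timezone.now())
                    .annotate(num_choices=Count('choice'))
                    .filter(num_choices__gt=0))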

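It is probably better to rely on the Django ORM than to write your own filter: the query above runs as a single database statement, while the loop issues one extra query per question. I believe that even in the best case the loop's time complexity only matches it.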
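
To make the cost comparison concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django, so it runs without a project. The table names question and choice and the helper functions build_db, per_row_loop and single_query are hypothetical stand-ins for the models above, and the single-statement version only approximates the kind of SQL an annotate-and-filter query tends to produce; it is not Django's exact output.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def build_db():
    """Create an in-memory database with hypothetical question/choice tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE question (id INTEGER PRIMARY KEY, pub_date TEXT NOT NULL);
        CREATE TABLE choice (
            id INTEGER PRIMARY KEY,
            question_id INTEGER NOT NULL REFERENCES question(id)
        );
        """
    )
    now = datetime.now(timezone.utc)
    past = (now - timedelta(days=1)).isoformat(timespec="seconds")
    future = (now + timedelta(days=1)).isoformat(timespec="seconds")
    # Question 1: past, two choices. Question 2: past, no choices.
    # Question 3: future, one choice.
    conn.executemany(
        "INSERT INTO question (id, pub_date) VALUES (?, ?)",
        [(1, past), (2, past), (3, future)],
    )
    conn.executemany(
        "INSERT INTO choice (question_id) VALUES (?)", [(1,), (1,), (3,)]
    )
    return conn, now.isoformat(timespec="seconds")


def per_row_loop(conn, now):
    """One query for the questions, then one COUNT query per question."""
    queries = 1
    ids = []
    rows = conn.execute(
        "SELECT id FROM question WHERE pub_date <= ? ORDER BY id", (now,)
    ).fetchall()
    for (qid,) in rows:
        queries += 1
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM choice WHERE question_id = ?", (qid,)
        ).fetchone()
        if count > 0:
            ids.append(qid)
    return ids, queries


def single_query(conn, now):
    """Join, group, and filter inside the database in one statement."""
    rows = conn.execute(
        """
        SELECT q.id
        FROM question AS q
        LEFT JOIN choice AS c ON c.question_id = q.id
        WHERE q.pub_date <= ?
        GROUP BY q.id
        HAVING COUNT(c.id) > 0
        ORDER BY q.id
        """,
        (now,),
    ).fetchall()
    return [qid for (qid,) in rows], 1


conn, now = build_db()
loop_ids, loop_queries = per_row_loop(conn, now)
join_ids, join_queries = single_query(conn, now)
print(loop_ids, loop_queries)
print(join_ids, join_queries)
```

Both functions return the same question ids, but with n matching questions the loop sends n + 1 statements while the joined version sends one. How cheap each per-question lookup is depends on the foreign key column being indexed, which Django does by default for ForeignKey fields.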

Regarding the design: this kind of relationship can lead to duplicates in your database, since different questions sometimes have the same answer. I would probably go with a many-to-many relationship instead.
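
As a rough illustration of the many-to-many idea, the sketch below uses Python's built-in sqlite3 module with hypothetical tables. The text column and the question_choice junction table are assumptions made for the example and are not part of the models in the question. A junction table lets one choice row be linked to several questions, so the same answer text is stored only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE question (id INTEGER PRIMARY KEY);
    CREATE TABLE choice (id INTEGER PRIMARY KEY, text TEXT NOT NULL UNIQUE);
    -- Junction table: one row per (question, choice) link.
    CREATE TABLE question_choice (
        question_id INTEGER NOT NULL REFERENCES question(id),
        choice_id INTEGER NOT NULL REFERENCES choice(id),
        PRIMARY KEY (question_id, choice_id)
    );
    INSERT INTO question (id) VALUES (1), (2), (3);
    INSERT INTO choice (id, text) VALUES (1, 'Yes'), (2, 'No');
    -- 'Yes' is shared by questions 1 and 2; question 3 has no choices.
    INSERT INTO question_choice VALUES (1, 1), (1, 2), (2, 1);
    """
)

# Questions that are linked to at least one choice.
linked = [
    qid
    for (qid,) in conn.execute(
        "SELECT DISTINCT question_id FROM question_choice ORDER BY question_id"
    )
]
# Number of stored choice rows, even though there are three links.
choice_rows = conn.execute("SELECT COUNT(*) FROM choice").fetchone()[0]
print(linked, choice_rows)
```

In Django this layout would correspond to a ManyToManyField. Whether it is worth the extra table depends on whether choices really are shared between questions; the one-to-many layout from the question is simpler when each choice belongs to a single question.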

CodePudding user response:

That's not how querysets are supposed to work. Iterating over a queryset already iterates over every row the database returns for it, so you don't need to call iterator().

question_queryset = Question.objects.filter(pub_date__lte=timezone.now())
for q in question_queryset:
    if Choice.objects.filter(question=q.pk).count() == 0: 
        print(q)

I didn't test it. But this should work.
