When filtering a queryset on a nullable foreign key, I can filter by an ID value (foo_id=123
) or by None (foo_id=None
). However, if I attempt to filter by a list (foo_id__in=[123, None]
) the None
is ignored.
Why is this happening, and what is the best workaround for filtering on a foreign key using a list which includes None?
Example:
from django.db import models
class Foo(models.Model):
name = models.CharField(max_length=100)
class Bar(models.Model):
foo = models.ForeignKey(Foo, on_delete=models.PROTECT,
blank=True, null=True)
foo = Foo.objects.create(name='myfoo')
Bar.objects.create(foo=foo)
Bar.objects.create(foo=None)
Bar.objects.count() # 2
Bar.objects.filter(foo_id=foo.id).count() # 1
Bar.objects.filter(foo_id=None).count() # 1
Bar.objects.filter(foo_id__in=[foo.id, None]).count() # 1 - Expected 2!
CodePudding user response:
The key insight is that in SQL, NULL
represents an unknown value, which cannot be compared using normal operators:
SELECT NULL = NULL;
-- => NULL
(This syntax does not work in all DB engines - for example, SQL Server - and in those you would have to write something like SELECT CASE WHEN NULL = NULL THEN 't' ELSE 'f' END
, but the result is same: NULL = NULL
evaluates as NULL
, which is falsy.)
The reasoning is, for example, if you have two people whose last name you don't know, you'd mark them as NULL
- and just because they are both NULL
, you can't conclude that they have the same last name (just like you can't conclude they have a different one - you just don't know one way or the other).
So, a NULL
is not equal to another NULL
... but it's also not different than another NULL
: NULL <> NULL
also returns NULL
. In fact, NULL
infects all operators: 1 NULL
, 1 < NULL
, 1 >= NULL
... all result in NULL
. If you do anything with an unknown value, the result is an unknown value.
There is basically only one operator that avoids this infectious nature of NULL
, and that is IS NULL
:
SELECT NULL IS NULL;
-- => t
In the same vein as x = NULL
, x IN (NULL)
is also using the same equality comparison, which can never evaluate as true:
SELECT 2 IN (1, NULL);
-- => NULL
SELECT NULL IN (1, NULL);
-- => NULL
SELECT 1 IN (1, NULL);
-- => t
SELECT 2 NOT IN (1, NULL);
-- => NULL
SELECT NULL NOT IN (1, NULL);
-- => NULL
SELECT 1 NOT IN (1, NULL);
-- => t
Is 2
there? Maaaaybe; I have a value that I don't know what it is, so I can't say whether 2
is in there or not, because it just might match that unknown one. How about this other thing that I have no idea about? Beats me, might be 1
, or might be equal to the other unknown thing - or it might be something different altogether. How about 1
? Well, as to that, I can see 1
in there, regardless of what the unknown thing is or isn't.
So instead of plain IN
, you have to check for NULL
explicitly:
SELECT * WHERE foo_id IN (1, 2) OR foo_id IS NULL;
In django terms:
from django.db.models import Q
Bar.objects.filter(Q(foo_id=foo.id) | Q(foo_id__isnull=True)).count()
or if you have several values
Bar.objects.filter(Q(foo_id__in=[1, 2]) | Q(foo_id__isnull=True)).count()
CodePudding user response:
I don't know why the None
in foo_id__in=[123, None]
is ignored, but the best workaround I've come up with is the following:
Bar.objects.filter(Q(foo_id=foo.id) | Q(foo_id=None)).count()