Let's say I have the following Django model:
class Team(models.Model):
name = models.CharField(max_length=255)
created_at = models.DateTimeField(auto_now_add=True)
I want to write a query to fetch the latest N number of records per team name.
If N=1, the query is very easy (assuming I'm using postgres because it's the only DB that support distinct(*fields)
):
Team.objects.order_by("name", "-created_at").distinct("name")
If N is greater than 1 (let's say 3), then it gets tricky. How can I write this query in Django?
CodePudding user response:
Not sure how you can get duplicate names per team, since you have unique=True
. But if you plan to remove that to support non-unique names, you can use subqueries like this:
top_3_per_team_name = Team.objects.filter(
name=OuterRef("name")
).order_by("-created_at")[:3]
Team.objects.filter(
id__in=top_3_per_team_name.values("id")
)
Although this can be a bit slow, so make sure you have the indexes setup.
Also to note, ideally this can be solved by using Window
..[Django-doc] functions using DenseRank
..[Django-doc] but unfortunately the latest django version can't filter on windows:
from django.db.models import F
from django.db.models.expressions import Window
from django.db.models.functions import DenseRank
Team.objects.annotate(
rank=Window(
expression=DenseRank(),
partition_by=[F('name'),],
order_by=F('created_at').desc()
),
).filter(rank__in=[range(1,4)]) # 4 is N 1 if N = 3
With the above you get:
NotSupportedError: Window is disallowed in the filter clause.
But there is a plan to support this on Django 4.2 so theoretically the above should work once that is released.
CodePudding user response:
I'm assuming you'll be getting your N from a get
request or something, but as long as you have a number you can try limiting your queryset:
Team.objects.order_by("name", "-created_at").distinct("name")[:3] # for N = 3