I have a model that register some kind of event and the date in which it occurs. I need to calculate: 1) the count of events for each date, and 2) the cumulative count of events over time.
My model looks something like this:
class Event(models.Model):
date = models.DateField()
...
Calculating 1) is pretty straightforward, but I'm having trouble calculating the cumulative sum. I tried something like this:
query_set = Event.objects.values("date") \
.annotate(count=Count("date")) \
.annotate(cumcount=Window(Sum("count"), order_by="date"))
But I'm getting this error:
Cannot compute Sum('count'): 'count' is an aggregate
Edit: Ideally, I'd like to have a query set equivalent to this SQL query, using Django's ORM:
SELECT date,
COUNT(date) as count,
SUM(COUNT(date)) OVER(ORDER BY date) acc_count
FROM event_event
GROUP BY date
CodePudding user response:
It's surprisingly common to see developers wanting to convert from SQL query to Django QuerySet.
In this case, as OP already knows SQL, OP might be better off just performing raw SQL query.
There's different ways one can go about doing it, like executing custom SQL directly.
from django.db import connection
def my_custom_sql(self):
with connection.cursor() as cursor:
cursor.execute("SELECT date, COUNT(date) as count, SUM(COUNT(date)) OVER(ORDER BY date) acc_count
FROM event_event
GROUP BY date")
Then, call cursor.fetchone()
or cursor.fetchall()
to return the resulting rows.
CodePudding user response:
In some cases perform an aggregate of an aggregate are not valid in SQL, whether you're using the ORM or not, for example: MAX(SUM(...))
.
In your case you can do it by a raw query (as already mentioned in the other answer(s) and in your query).
Or using the ORM as following:
subquery = (
Event.objects.filter(date=OuterRef("date")) # we need this for the join
.values("date") # this to create the group by
.annotate(subcount=Count("date")) # the aggregate function
)
Event.objects.values("date").annotate(count=Count("date")).annotate(
sumcount=Window(Sum(subquery.values("subcount")), order_by="date")
# above we can use the Sum with the subquery
# we can also replace it for any aggregation functions that we want
).values("date", "count", "cumcount")
It will generate the following SQL:
SELECT
"app_event"."date",
COUNT("app_event"."date") AS "count",
SUM((SELECT
COUNT(U0."date") AS "subcount"
FROM
"app_event" U0
WHERE
U0."date" = ("app_event"."date")
GROUP BY
U0."date"
)) OVER ( ORDER BY "app_event"."date")
AS "cumcount"
FROM
"app_event"
GROUP BY
"app_event"."date"