I have an annotation like this: which displays the month wise count of a field
bar = Foo.objects.annotate(
item_count=Count('item')
).order_by('-item_month', '-item_year')
and this produces output like this: html render
I would like to show the change in item_count when compared with the previous month item_count for each month (except the first month). How could I achieve this using annotations or do I need to use pandas?
Thanks
Edit: In SQL this becomes easy with LAG function, which is similar to
SELECT item_month, item_year, COUNT(item),
LAG(COUNT(item)) OVER (ORDER BY item_month, item_year)
FROM Foo
GROUP BY item_month, item_year
(PS: item_month and item_year are date fields)
Do Django ORM have similar to LAG in SQL?
CodePudding user response:
For these types of Query you need to use Window functions in django Orm
For Lag you can take the help of
https://docs.djangoproject.com/en/4.0/ref/models/database-functions/#lag
Working Query in Orm will look like this :
#models.py
class Review(models.Model):
user = models.ForeignKey(User, on_delete=models.CASCADE, related_name='review_user', db_index=True)
review_text = models.TextField(max_length=5000)
rating = models.SmallIntegerField(
validators=[
MaxValueValidator(10),
MinValueValidator(1),
],
)
date_added = models.DateTimeField(db_index=True)
review_id = models.AutoField(primary_key=True, db_index=True)
This is just a dummy table to show you the use case of Lag
and Window
function in django
Because examples are not available for Lag function on Django Docs.
from django.db.models.functions import Lag, ExtractYear
from django.db.models import F, Window
print(Review.objects.filter().annotate(
num_likes=Count('likereview_review')
).annotate(item_count_lag=Window(expression=Lag(expression=F('num_likes')),order_by=ExtractYear('date_added').asc())).order_by('-num_likes').distinct().query)
Query will look like
SELECT DISTINCT `temp_view_review`.`user_id`, `temp_view_review`.`review_text`, `temp_view_review`.`rating`, `temp_view_review`.`date_added`, `temp_view_review`.`review_id`, COUNT(`temp_view_likereview`.`id`) AS `num_likes`, LAG(COUNT(`temp_view_likereview`.`id`), 1) OVER (ORDER BY EXTRACT(YEAR FROM `temp_view_review`.`date_added`) ASC) AS `item_count_lag` FROM `temp_view_review` LEFT OUTER JOIN `temp_view_likereview` ON (`temp_view_review`.`review_id` = `temp_view_likereview`.`review_id`) GROUP BY `temp_view_review`.`review_id` ORDER BY `num_likes` DESC
Also if you don't want to order_by on extracted year of date then you can use F
expressions like this
print(Review.objects.filter().annotate(
num_likes=Count('likereview_review')
).annotate(item_count_lag=Window(expression=Lag(expression=F('num_likes')),order_by=[F('date_added')])).order_by('-num_likes').distinct().query)
Query for this :
SELECT DISTINCT `temp_view_review`.`user_id`, `temp_view_review`.`review_text`, `temp_view_review`.`rating`, `temp_view_review`.`date_added`, `temp_view_review`.`review_id`, COUNT(`temp_view_likereview`.`id`) AS `num_likes`, LAG(COUNT(`temp_view_likereview`.`id`), 1) OVER (ORDER BY `temp_view_review`.`date_added`) AS `item_count_lag` FROM `temp_view_review` LEFT OUTER JOIN `temp_view_likereview` ON (`temp_view_review`.`review_id` = `temp_view_likereview`.`review_id`) GROUP BY `temp_view_review`.`review_id` ORDER BY `num_likes` DESC