Hi stackoverflow community, my question is about django annotate.
Basically what I am trying to do is to find duplicated value with same values from two different fields in two different tables.
This is my models.py
class Order(models.Model):
id_order = models.AutoField(primary_key=True)
class OrderDelivery(models.Model):
order = models.ForeignKey(Order, on_delete=models.SET_NULL, null=True, blank=True)
delivery_address = models.TextField()
class OrderPickup(models.Model):
order = models.ForeignKey(Order, on_delete=models.SET_NULL, null=True, blank=True)
pickup_date = models.DateField(blank=True, null=True)
This is my current code:
dup_job = Order.objects.filter(
orderpickup__pickup_date__range=(start_date, end_date)
).values(
'orderdelivery__delivery_address',
'orderpickup__pickup_date',
).annotate(
duplicated=Count('orderdelivery__delivery_address')
).filter(
duplicated__gt=1
)
Based on what I have, I am getting result like this (delivery_address is omitted for privacy purpose):
{'orderdelivery__delivery_address': '118A', 'orderpickup__pickup_date': datetime.date(2022, 3, 9), 'duplicated': 2}
{'orderdelivery__delivery_address': '11', 'orderpickup__pickup_date': datetime.date(2022, 3, 2), 'duplicated': 6}
{'orderdelivery__delivery_address': '11 A ', 'orderpickup__pickup_date': datetime.date(2022, 3, 3), 'duplicated': 5}
{'orderdelivery__delivery_address': '21', 'orderpickup__pickup_date': datetime.date(2022, 3, 10), 'duplicated': 3}
{'orderdelivery__delivery_address': '642', 'orderpickup__pickup_date': datetime.date(2022, 3, 7), 'duplicated': 2}
{'orderdelivery__delivery_address': '642', 'orderpickup__pickup_date': datetime.date(2022, 3, 8), 'duplicated': 2}
{'orderdelivery__delivery_address': 'N/A,5', 'orderpickup__pickup_date': datetime.date(2022, 3, 8), 'duplicated': 19}
Is there a way to get the id_order of those 'duplicated'? I have tried include id_order in .values() but the output will not be accurate as the annotation is grouping by the id_order instead of delivery_address.
Thank you in advance
CodePudding user response:
You can get the smallest (or largest) item with a Min
[Django-doc] (or Max
) aggregate:
from django.db.models import Min
dup_job = Order.objects.filter(
orderpickup__pickup_date__range=(start_date, end_date)
).values(
'orderdelivery__delivery_address',
'orderpickup__pickup_date',
).annotate(
min_id_order=Min('id_order')
duplicated=Count('orderdelivery__delivery_address')
).filter(
duplicated__gt=1
)
or for postgresql, you can make use of the ArrayAgg
[Django-doc] to generate a list:
# PostgreSQL only
from django.contrib.postgres.aggregates import ArrayAgg
dup_job = Order.objects.filter(
orderpickup__pickup_date__range=(start_date, end_date)
).values(
'orderdelivery__delivery_address',
'orderpickup__pickup_date',
).annotate(
min_id_order=ArrayAgg('id_order')
duplicated=Count('orderdelivery__delivery_address')
).filter(
duplicated__gt=1
)