How to write the most efficient Django ORM query for the following scenario? I need to get items based on a list of accountIds, but it will return duplicate records with the same accountId because accountId is not the primary key. Then I will need to remove the duplicates by only returning the last created record in the queryset. I can use a for loop to loop through the list of accountIds and filter by each accountId and then order by the created date and get the latest one. However, with this approach, I will be calling the database so many times. There are more than 200 account Ids.
Are there better ways of doing this?
CodePudding user response:
This could be useful
Model.objects.order_by('date_created').distinct()
docs: distinct in django queryset if you are using postgres, it would be much useful and efficient
CodePudding user response:
If using PostgreSQL you can add a field name to distinct()
to create a SELECT DISTINCT ON (foo)
query that returns the first unique value for that field. In your case if you order by account_id
and then descending created_date
you will get a single row per account_id
that has the latest created_date
Item.objects.filter(
account_id__in=account_ids
).order_by(
'account_id', '-created_date'
).distinct('account_id')