get multiple times or filter id__in on a queryset which is more efficient in django?-CodePudding

Which of the following is more efficient in django if the length of ids is not big and the length of queryset might be too large?

items = [item for item in queryset.filter(id__in=ids)]

items = [queryset.get(id=id) for id in ids]

N.B: it's guaranteed that there is no such id that will mismatch. Also there might be several cases where length of ids is 1

CodePudding user response：

The query that uses .filter(…) will be more efficient: this will perform one query on the database, whereas the solution with .get(…) will run n queries on the database, with n the number of elements in ids.

While that .filter(…) query might take a bit longer, constructing the query, sending the query to the database, decoding the query, constructing an execution plan, and sending the result back will all be done once with .filter(…), whereas for the .get(…) solution, this will be done n times.

Note that the two are semantically not equivalent: the .filter(…) will retrieve all items for which an id exists, the .get(…) solution on the other hand will error in case it can not find an id.

In case the number of items to return is huge, you should not store these in a list, but work with an .iterator(…) [Django-doc] to process the items in chunks, so:

for item in queryset.filter(id__in=ids).iterator():
    # do something …

If you would store the items in a list, that would defeat the purpose of the iterator, since then all objects are all "alive" at the same time, and thus there should be enough memory to keep these in memory.

CodePudding user response：

If we want to measure the efficiency then we have to think about the query at first. Because both are doing list comprehension and we have nothing to do with that.

Now if we come to the queryset.filter and queryset.get and try to understand the equivalent SQL query.

SQL query count: For filter there will be only one query but for get there will be n number of query. Even if the n is 2, filter is more efficient than get.
Execution efficiency: This one also relay on SQL query. Because as the get implementation says it will create query and perform it each time. But filter will create a single query and perform it for one time. And transnational execution(Making a query to DB) is always slower than doing it in the code.

You can see the SQL query of queryset.filter by printing the query attribute of it.

I think filter is more efficient than get in this case.