Elasticsearch Track total hits alternative with approximation

Time:01-06

According to this article (link), setting the track_total_hits property to true has serious performance implications.

We currently use it to get the number of matching documents after a user runs a search. The user can then page through the results. The number of matching documents for such a search usually ranges from 10k to 5M.

Example of a user workflow:

  1. The user performs a search which matches 150,000 documents.
  2. We show them the first 200 results, which they can scroll through, and we also show the total number of documents found by the search.

Since we always show the number of matching documents, and those numbers can often be quite high, we need some way to get that count. I'm not sure, but since we almost always perform paginated searches, I would assume a lot of the data is already in memory? Maybe this actually affects us less than the article suggests?

An approximation instead of an exact count would be OK for us if it improved performance.

Is there an option in Elasticsearch to get an approximate count on search requests?

CodePudding user response:

There is no option to get an approximate count, but you may want to consider setting track_total_hits to an integer threshold k instead of true, which is a good compromise from a performance standpoint (see https://www.elastic.co/guide/en/elasticsearch/reference/master/search-your-data.html#track-total-hits). Hits are then counted accurately only up to k; beyond that, the reported total is a lower bound.

That way, you can show users that there are at least k results - but there could be more.
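
As a rough sketch of what that could look like: the helpers below only build the request body and format the total for display, so they run without a cluster. The function names, the threshold of 10,000, and the page size of 200 are illustrative assumptions, and the response shape (hits.total as an object with value and relation) assumes Elasticsearch 7 or later.

```python
def build_search_body(query, size=200, track_total_hits=10000):
    """Build a search request body that counts hits accurately only up to a threshold.

    track_total_hits accepts true, false, or an integer threshold.
    """
    return {
        "query": query,
        "size": size,
        "track_total_hits": track_total_hits,
    }


def format_total(response):
    """Turn hits.total into a label for the UI.

    relation is "eq" when the count is exact and "gte" when the threshold
    was reached and the value is only a lower bound.
    """
    total = response["hits"]["total"]
    if total["relation"] == "gte":
        return f"at least {total['value']:,} results"
    return f"{total['value']:,} results"
```

With a threshold of 10,000, a search matching 150,000 documents would come back with relation "gte" and be shown as "at least 10,000 results", while a search matching 1,234 documents would still show the exact number.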

Also, try using search_after (if you are not using it already) for pagination.
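
A minimal sketch of how search_after pagination could be wired up, again only building request bodies. The field names "timestamp" and "id" are hypothetical; the important part is that the sort ends in a unique tiebreaker field so that the ordering is deterministic, and that each following page passes the sort values of the last hit from the previous page.

```python
def first_page_body(query, size=200):
    """First page: a normal search with a deterministic sort."""
    return {
        "query": query,
        "size": size,
        # Hypothetical fields: sort by time, then by a unique id as tiebreaker.
        "sort": [{"timestamp": "desc"}, {"id": "asc"}],
    }


def next_page_body(previous_body, last_hit):
    """Next page: same query and sort, resuming after the last hit seen."""
    body = dict(previous_body)
    # search_after replaces offset-based paging, so drop any "from" value.
    body.pop("from", None)
    # Each hit of a sorted search carries its sort values in the "sort" key.
    body["search_after"] = last_hit["sort"]
    return body
```

Because each page resumes from the last sort values instead of skipping an offset, the cost of fetching a deep page does not grow with the page number the way from/size paging does.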
