Does reading an elastic document by _id count as a search for the `refresh_interval`

Time:08-14

In its write tuning section, Elastic recommends increasing the refresh interval.

We are ingesting documents, and during ingestion we may also read them, essentially like this:

GET /my-index/_doc/mydocumentid

That is, we read the document by its _id rather than running a search. Some descriptions suggest that the document ID is simply added to the Lucene index like any other field. Does this mean that a read by _id would still force a refresh instead of letting the index wait for the full refresh_interval?

CodePudding user response:

This is actually a tricky one:

You are correct that a GET on an _id works right away (unlike a multi-document operation such as a search, which needs to wait for an explicit ?refresh from you or for the refresh_interval to elapse). But the underlying implementation changed twice:

  1. Initially, a GET on an _id read the data straight from the translog, so it didn't need a refresh or the creation of a segment.
  2. That code was complex, so in 5.0 we changed it to read from a segment, with a GET on an _id automatically triggering a _refresh. It looked the same from the outside and the code was simpler.
  3. But for use cases that did a lot of GETs on _id this was expensive, since it creates lots of tiny segments. So in 7.6 we changed it back to read from the translog again.

So if you are using a current version (7.6 or later), a GET on an _id doesn't trigger a _refresh.
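To make the difference between the two behaviours concrete, here is a minimal toy model in Python. It is not Elasticsearch code: `ToyShard`, `get_mode` and `run` are hypothetical names invented for this sketch, and the `unrefreshed` dict only stands in for the translog. It just mimics the visibility rules described above, assuming each refresh with pending documents produces exactly one new segment.

```python
class ToyShard:
    """Toy model of one shard. NOT Elasticsearch code, only the visibility rules."""

    def __init__(self, get_mode):
        # "translog": GET by _id reads unrefreshed documents directly
        #             (the behaviour described for pre-5.0 and for 7.6+).
        # "refresh":  GET by _id forces a refresh first
        #             (the behaviour described for 5.0 up to 7.6).
        if get_mode not in ("translog", "refresh"):
            raise ValueError(f"unknown get_mode: {get_mode}")
        self.get_mode = get_mode
        self.unrefreshed = {}  # stand-in for the translog: indexed, not yet searchable
        self.segments = []     # each refresh with pending docs adds one segment

    def index(self, doc_id, source):
        self.unrefreshed[doc_id] = source

    def refresh(self):
        # Turn all pending documents into one new searchable segment.
        if self.unrefreshed:
            self.segments.append(dict(self.unrefreshed))
            self.unrefreshed.clear()

    def get(self, doc_id):
        if doc_id in self.unrefreshed:
            if self.get_mode == "translog":
                return self.unrefreshed[doc_id]  # no refresh needed
            self.refresh()  # realtime GET forces a refresh in this mode
        for segment in reversed(self.segments):  # newest segment wins
            if doc_id in segment:
                return segment[doc_id]
        return None

    def search_ids(self):
        # A search only sees documents that have reached a segment.
        ids = set()
        for segment in self.segments:
            ids.update(segment)
        return sorted(ids)


def run(get_mode, n=3):
    """Index n documents, reading each one back by id right after indexing."""
    shard = ToyShard(get_mode)
    for i in range(n):
        shard.index(f"doc-{i}", {"n": i})
        assert shard.get(f"doc-{i}") == {"n": i}  # GET by id always works right away
    return len(shard.segments), shard.search_ids()


if __name__ == "__main__":
    print(run("translog"))  # (0, [])
    print(run("refresh"))   # (3, ['doc-0', 'doc-1', 'doc-2'])
```

In both modes the GET returns the document immediately. In the "refresh" mode every GET of a fresh document leaves a tiny segment behind, which is the cost mentioned in point 3. In the "translog" mode no segment is created and a search still sees nothing until a refresh actually happens.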

CodePudding user response:

A GET on the _id is not a search, so no.
