Home > OS >  Why is my elasticsearch update_by_query timing out?
Why is my elasticsearch update_by_query timing out?

Time:04-26

I am using the Javascript library and and trying to perform a rather large update on documents. I have changed the timeout value on the query itself as well as the client object and am still getting a timeout that seems to be 2 minutes. Below is the query.

const arrayOfStrings = [...]; // This array could be between 10-12k elements in size
await elasticClient.updateByQuery({
        index: "main-index",
        refresh: true,
        conflicts:"proceed",
        script: {
            lang: "painless",
            source: "ctx._source.status = \"1\"; ctx._source.entities = [];  for(term in params.array) ctx._source.entities.add(term);",
            params: {
                "array": arrayOfStrings
            }
        },
        query: { 
            "term": {
                "parent_id": 'dgd39gd3-db3dg23879-d893gdg38-e23ed' // There could be upwards of 5000 documents that match this criteria
            }
        },
        timeout: "5m"
});

Also on the elastic client, I have requestTimeout set to '5m' as well. I can understand why this particular query may be timing out as its applying a 12k element array to a field on 5k documents. However I dont understand why the query is timing out after only 2 minutes when I have timeout values set that are longer.

CodePudding user response:

What you need to do here is to run the update by query asynchronously

await elasticClient.updateByQuery({
    index: "main-index",
    refresh: true,
    conflicts:"proceed",
    waitForCompletion: false              <---- add this setting

And then you can follow the progress of the task running asynchronously.

  • Related