Opensearch update by query

Time:01-25

I am new to Elasticsearch. I have a large index with around 50k documents, and I need to update all of them. When I run update_by_query, it throws this error:

File "E:\ApplicationsRunning\Lib\site-packages\opensearchpy\connection\http_urllib3.py", line 254, in perform_request
    raise ConnectionTimeout("TIMEOUT", str(e), e)
opensearchpy.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10)

How can I resolve this error or how can I update all the documents in the index?

query = {
    "script": {
        "inline": "ctx._source.name='srujan'"
    },
    "query": {
        "match_all": {}
    }
}
response = client.update_by_query(
    body=query, index=_index, wait_for_completion=True)

CodePudding user response:

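You're hitting a client-side read timeout: updating every document takes longer than the 10 seconds the client waits for a response (the "read timeout=10" in your error), so the client gives up before the server answers.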

You can increase the client timeout, as Musab shows in his comment.

Alternatively, you can set wait_for_completion=False. The call then returns immediately with the ID of an asynchronous task that runs in the background.
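Both options can be sketched in Python roughly as follows. This is a minimal sketch under a few assumptions, not a tested recipe for your cluster: `client` is assumed to be an `opensearchpy.OpenSearch` instance like the one in the question, the function names and the 300-second value are made up for illustration, and the script uses the `source` key with `params` in place of the question's `inline` string. `request_timeout` overrides the timeout for a single call; passing `timeout=...` when you construct the client changes the default for every call instead.

```python
def build_update_body(new_name):
    """Build an update_by_query body that sets `name` on every document."""
    return {
        "script": {
            "lang": "painless",
            # Pass the value through params instead of embedding it in the script text.
            "source": "ctx._source.name = params.name",
            "params": {"name": new_name},
        },
        "query": {"match_all": {}},
    }


def update_with_longer_timeout(client, index, new_name, timeout_seconds=300):
    """Option 1: stay synchronous, but let this one request wait longer.

    `client` is assumed to be an opensearchpy.OpenSearch instance;
    300 seconds is an illustrative value, not a recommendation.
    """
    return client.update_by_query(
        index=index,
        body=build_update_body(new_name),
        wait_for_completion=True,
        request_timeout=timeout_seconds,  # per-request client read timeout
    )


def start_update_in_background(client, index, new_name):
    """Option 2: return at once with the ID of a background task."""
    response = client.update_by_query(
        index=index,
        body=build_update_body(new_name),
        wait_for_completion=False,
    )
    # With wait_for_completion=False the response carries the task ID
    # under the "task" key, in the form "<node_id>:<task_number>".
    return response["task"]
```

With option 2 the request itself no longer runs long enough to hit the read timeout, because the update continues on the server after the call returns.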

You can then check whether the task has completed in Kibana Dev Tools (OpenSearch Dashboards Dev Tools if you are on OpenSearch), using

GET _tasks/<task_id>
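If you would rather poll from Python than from Dev Tools, something like the sketch below should work. It assumes `client` is an `opensearchpy.OpenSearch` instance and uses `client.tasks.get`, the client counterpart of `GET _tasks/<task_id>`, whose response includes a `completed` flag. The function name, polling interval, and maximum wait are hypothetical choices for illustration.

```python
import time


def wait_for_task(client, task_id, poll_seconds=5.0, max_wait_seconds=600.0):
    """Poll the Tasks API until the task reports completion.

    `client` is assumed to be an opensearchpy.OpenSearch instance and
    `task_id` the value returned by update_by_query when it is called
    with wait_for_completion=False.
    """
    deadline = time.monotonic() + max_wait_seconds
    while True:
        status = client.tasks.get(task_id=task_id)
        if status.get("completed"):
            # For a finished task, the operation's result is expected
            # under the "response" key of the returned document.
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not complete in {max_wait_seconds}s")
        time.sleep(poll_seconds)
```

Each poll is a short request, so it stays well under the client's read timeout however long the update itself takes.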