I am new to Elasticsearch. I have a large index with around 50k documents, and I need to update all of them. When I run the update_by_query function, it throws this error:
File "E:\ApplicationsRunning\Lib\site-packages\opensearchpy\connection\http_urllib3.py", line 254, in perform_request
    raise ConnectionTimeout("TIMEOUT", str(e), e)
opensearchpy.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10)
How can I resolve this error or how can I update all the documents in the index?
query = {
    "script": {
        "inline": "ctx._source.name='srujan'"
    },
    "query": {
        "match_all": {}
    }
}
response = client.update_by_query(
    body=query, index=_index, wait_for_completion=True)
CodePudding user response:
You're hitting a read timeout: the update takes longer than the client's timeout, which is 10 seconds here (see "read timeout=10" in the traceback).
You can increase the timeout as shown by Musab in his comment, or you can set wait_for_completion=False. The call then returns immediately with the ID of an asynchronous task that runs in the background.
You can then check the completion of this task in Kibana Dev Tools, using
GET _tasks/<task_id>
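Both options can also be driven from Python. The following is only a rough sketch, assuming client is an opensearchpy.OpenSearch instance like the one in the question; the helper names, the 300-second timeout, and the polling interval are made up for illustration. It relies on the per-request request_timeout parameter and on client.tasks.get, which is the Python counterpart of GET _tasks/<task_id>.

```python
import time


def update_with_longer_timeout(client, index, body, timeout_seconds=300):
    """Option 1: keep the call synchronous but allow it more time.

    request_timeout overrides the client's read timeout for this call only.
    The 300-second default is an arbitrary illustrative value.
    """
    return client.update_by_query(
        index=index,
        body=body,
        wait_for_completion=True,
        request_timeout=timeout_seconds,
    )


def start_update(client, index, body):
    """Option 2: start the update as a background task and return its ID."""
    response = client.update_by_query(
        index=index,
        body=body,
        wait_for_completion=False,
    )
    return response["task"]


def wait_for_task(client, task_id, poll_seconds=5.0, max_polls=120):
    """Poll the tasks API until the task reports completed=True."""
    for _ in range(max_polls):
        status = client.tasks.get(task_id=task_id)
        if status.get("completed"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not complete in time")
```

With the names from the question, usage would look like task_id = start_update(client, _index, query) followed by wait_for_task(client, task_id). The background variant avoids holding an HTTP connection open for the whole update, so it tends to suit larger indices better than raising the timeout.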