I am new to Elasticsearch. I am trying to do a bulk insert with Python into an Elasticsearch index that uses an NLP model through an ingest pipeline to convert text into embeddings. However, not all of the documents are getting inserted: only about 2,000 out of 40k documents make it into the index.
Elasticsearch version: 8.3
Below is the exception I get when calling the bulk insert command:
{'index': {'_index': 'index_name', '_id': '40962', 'status': 500, 'error': {'type': 'exception', 'reason': 'org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: inference process queue is full. Unable to execute command', 'caused_by': {'type': 'es_rejected_execution_exception', 'reason': 'inference process queue is full. Unable to execute command'}}}},
Please advise.
CodePudding user response:
This is due to inference requests queueing up faster than the model can process them, so the excess items get rejected. This can happen when many items are ingested through a model that takes a while to infer on each one.
The solution here is to do one of the following:
- Increase the inference deployment queue size to match your bulk ingest size (the queue_capacity query parameter in the start trained model deployment API), or
- Reduce your bulk request size to the default queue size (1024) and wait for each bulk request to finish before sending the next one.
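The second option can be sketched in Python with the official elasticsearch client. This is a minimal sketch, not your exact code: the index name, host URL, and document shape are placeholder assumptions you will need to adapt.

```python
# Sketch: send 40k documents in batches no larger than the default
# inference queue size (1024), waiting for each bulk call to complete
# before sending the next, so the inference queue is never overrun.
QUEUE_CAPACITY = 1024  # default inference deployment queue size


def chunked(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_index(es, index_name, documents):
    # Imported here so the batching helper above is usable standalone.
    from elasticsearch import helpers

    for batch in chunked(documents, QUEUE_CAPACITY):
        actions = [
            {"_index": index_name, "_source": doc}  # placeholder doc shape
            for doc in batch
        ]
        # helpers.bulk blocks until this batch is acknowledged, so only
        # one batch at a time reaches the inference pipeline.
        helpers.bulk(es, actions)


# Usage (placeholder host and index name):
# from elasticsearch import Elasticsearch
# es = Elasticsearch("http://localhost:9200")
# bulk_index(es, "index_name", documents)
```

Because each helpers.bulk call is synchronous, the batches arrive at the inference pipeline one at a time instead of flooding its queue.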
Some relevant documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.3/start-trained-model-deployment.html
Example of starting a deployment with a specific queue capacity:
POST _ml/trained_models/elastic__distilbert-base-uncased-finetuned-conll03-english/deployment/_start?wait_for=started&queue_capacity=2000