How can I send multiple documents to Elasticsearch datastream using Python?


I am trying to bulk-index a large number of documents into Elasticsearch from Python. The documentation refers to this example. (Edit: I am using that exact code, with only the index name changed to a data stream.)

The example works fine when indexing into a regular index. However, when I try to index into a data stream, even a brand-new one that accepts dynamic content, I get this error:

Traceback (most recent call last):
  File "/Users/Downloads/elasticsearch-py-main/examples/bulk-ingest/bulk-ingest.py", line 111, in <module>
    main()
  File "/Users/Downloads/elasticsearch-py-main/examples/bulk-ingest/bulk-ingest.py", line 102, in main
    for ok, action in bulk(
  File "/opt/homebrew/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 524, in bulk
    for ok, item in streaming_bulk(
  File "/opt/homebrew/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 438, in streaming_bulk
    for data, (ok, info) in zip(
  File "/opt/homebrew/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 355, in _process_bulk_chunk
    yield from gen
  File "/opt/homebrew/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 274, in _process_bulk_chunk_success
    raise BulkIndexError(f"{len(errors)} document(s) failed to index.", errors)
elasticsearch.helpers.BulkIndexError: 2 document(s) failed to index.

I cannot find any information on this error. How can I bulk-index my data into a data stream using the Elasticsearch Python client?

CodePudding user response:

This is probably because, when sending documents to a data stream, you need to use the create action instead of the default index action:

{ "create": {"_id": "123"}}
{ "field": "value" }

With the Python bulk helpers, the equivalent is to explicitly set '_op_type': 'create' on each bulk action.
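A minimal sketch of what that looks like. The data stream name, URL, and document fields are placeholders; the actual bulk() call is commented out because it needs a running cluster:

```python
def make_actions(index, docs):
    # For data streams, the bulk op_type must be "create" rather than the
    # default "index"; otherwise every document is rejected.
    return [
        {"_op_type": "create", "_index": index, "_source": doc}
        for doc in docs
    ]

# Data streams require an @timestamp field in each document.
docs = [
    {"@timestamp": "2024-01-01T00:00:00Z", "message": "first event"},
    {"@timestamp": "2024-01-01T00:01:00Z", "message": "second event"},
]
actions = make_actions("my-data-stream", docs)  # placeholder stream name

# With a running cluster (placeholder URL) you would then call:
# from elasticsearch import Elasticsearch
# from elasticsearch.helpers import bulk
# client = Elasticsearch("http://localhost:9200")
# bulk(client, actions)
```

Each action dict carries the metadata that would otherwise go on the NDJSON action line, so the helper emits `{"create": ...}` instead of `{"index": ...}` for every document.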
