I'm not sure whether this is possible inside _bulk, and I don't know the exact syntax,
but I'd like to create an _id that is a hash of a combination of a few fields from the document.
Something like this (note the _id attribute):
POST /_bulk
{"index":{"_index":"eocs-technical-2022.08.24","_type":"_doc", "_id": "${hash(doc['@timestamp'] + doc['message'] + doc['instance_id'])}"}}
{"@timestamp":"2022-08-24T13:49:34.428+0200","message":"This is testing message","hostname":"testcomputer.local","ip":"-","service_name":"test-service","instance_id":"c0","build.version":"master-d723731300570fd1b2d241c4849b223673d1c8d8","source":"com.example.ELKTest","level":"DEBUG","thread_name":"scheduler-1"}
Is that possible?
Thanks
CodePudding user response:
It's possible using an ingest pipeline with a fingerprint processor, like this:
PUT _ingest/pipeline/id-hasher
{
  "processors": [
    {
      "fingerprint": {
        "target_field": "_id",
        "fields": [
          "@timestamp",
          "message",
          "instance_id"
        ]
      }
    }
  ]
}
You can then simply reference that pipeline in your bulk call:
POST /_bulk?pipeline=id-hasher
{"index":{"_index":"eocs-technical-2022.08.24","_type":"_doc", "_id": "dummy"}}
{"@timestamp":"2022-08-24T13:49:34.428+0200","message":"This is testing message","hostname":"testcomputer.local","ip":"-","service_name":"test-service","instance_id":"c0","build.version":"master-d723731300570fd1b2d241c4849b223673d1c8d8","source":"com.example.ELKTest","level":"DEBUG","thread_name":"scheduler-1"}
The generated id for the sample message above will be A3t5JHZE4ejqYoxEkfrnyTKBfFY
CodePudding user response:
TL;DR
This is not possible using the bulk API itself.
The _id should be computed by the application uploading the data.
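As a rough illustration of computing the _id in the uploading application, here is a minimal Python sketch. The helper names (make_doc_id, build_bulk_body), the choice of SHA-1, and the length-prefixed encoding are assumptions made for this example; the resulting ids will not match what an Elasticsearch fingerprint processor would produce for the same fields.

```python
import hashlib
import json

# Fields combined into the _id, taken from the question.
ID_FIELDS = ("@timestamp", "message", "instance_id")


def make_doc_id(doc, fields=ID_FIELDS):
    """Return a SHA-1 hex digest over the selected field values (hypothetical helper)."""
    digest = hashlib.sha1()
    for name in fields:
        value = str(doc[name]).encode("utf-8")
        # Length-prefix each value so ("ab", "c") and ("a", "bc") hash differently.
        digest.update(len(value).to_bytes(4, "big"))
        digest.update(value)
    return digest.hexdigest()


def build_bulk_body(index, docs):
    """Build an NDJSON _bulk body with a computed _id per document (hypothetical helper)."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_id": make_doc_id(doc)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The string returned by build_bulk_body can be sent as the body of POST /_bulk with the Content-Type application/x-ndjson. Because the _id depends only on the three chosen fields, re-sending the same log line overwrites the existing document instead of creating a duplicate.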