Upsert documents in Elasticsearch using custom ID field


I am trying to load/ingest data from log files that are almost a replica of the data stored in a third-party vendor's DB. The data is pipe-separated key=value pairs, and I am able to split it up using the kv filter plugin in Logstash.

Sample data:

1.) TABLE="TRADE"|TradeID="1234"|Qty=100|Price=100.00|BuyOrSell="BUY"|Stock="ABCD Inc."

If we receive a modification of the above record:

2.) TABLE="TRADE"|TradeID="1234"|Qty=120|Price=101.74|BuyOrSell="BUY"|Stock="ABCD Inc."

We need to update the record that was created by the first entry. So I need to use TradeID as the document ID and upsert the records, so that there is never more than one document for the same TradeID.
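In other words, after both sample lines have been processed, trade-index should contain exactly one document for TradeID 1234, holding the latest values. Roughly (a sketch; kv emits every value as a string, so a mutate/convert step or explicit index mappings would be needed if Qty and Price must be numeric):

{
  "TABLE": "TRADE",
  "TradeID": "1234",
  "Qty": "120",
  "Price": "101.74",
  "BuyOrSell": "BUY",
  "Stock": "ABCD Inc."
}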

My logstash.conf looks something like this:

input {
  file {
    path => "some path"
  }
}

filter {
  kv {
    source => "message"
    field_split => "\|"
    value_split => "="
  }
}

output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    cacert => "path of .cert file"
    ssl => true
    ssl_certificate_verification => true
    index => "trade-index"
    user => "elastic"
    password => ""
  }
}
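With this filter in place, the first sample line should parse into an event whose relevant fields look roughly like the following (a sketch; the kv filter strips the surrounding double quotes from quoted values):

{
  "TABLE": "TRADE",
  "TradeID": "1234",
  "Qty": "100",
  "Price": "100.00",
  "BuyOrSell": "BUY",
  "Stock": "ABCD Inc."
}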

Answer:

You need to update your elasticsearch output as shown below:

output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    cacert => "path of .cert file"
    ssl => true
    ssl_certificate_verification => true
    index => "trade-index"
    user => "elastic"
    password => ""

    # add the following to make it work as an upsert
    action => "update"
    document_id => "%{TradeID}"
    doc_as_upsert => true
  }
}

So when Logstash reads the first trade, the document with ID 1234 does not exist yet and is upserted (i.e. created). When the second trade is read, the document already exists and is simply updated.
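Under the hood, each event is then sent to Elasticsearch as an update request with doc_as_upsert set. Conceptually (a sketch; Logstash actually batches these through the _bulk API), the second trade becomes:

POST /trade-index/_update/1234
{
  "doc": {
    "TABLE": "TRADE",
    "TradeID": "1234",
    "Qty": "120",
    "Price": "101.74",
    "BuyOrSell": "BUY",
    "Stock": "ABCD Inc."
  },
  "doc_as_upsert": true
}

You can verify that only one document exists per trade by fetching it directly by ID, e.g.:

curl --cacert "path of .cert file" -u elastic "https://localhost:9200/trade-index/_doc/1234?pretty"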
