I am trying to load/ingest data from some log files that are almost a replica of the data stored in a 3rd-party vendor's DB. The data consists of pipe-separated key-value pairs, and I am able to split it up using the kv filter plugin in Logstash.
Sample data -
1.) TABLE="TRADE"|TradeID="1234"|Qty=100|Price=100.00|BuyOrSell="BUY"|Stock="ABCD Inc."
If we receive a modification of the above record:
2.) TABLE="TRADE"|TradeID="1234"|Qty=120|Price=101.74|BuyOrSell="BUY"|Stock="ABCD Inc."
We need to update the record that was created by the first entry. So, I need to use TradeID as the document id field and upsert the records so that there is no duplication of records with the same TradeID.
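In other words, after both lines above are processed the index should contain a single document keyed by TradeID, roughly like this (field names as produced by the kv split; values are shown as strings since that is what kv emits, the exact mapping is only an assumption):

{
  "TABLE": "TRADE",
  "TradeID": "1234",
  "Qty": "120",
  "Price": "101.74",
  "BuyOrSell": "BUY",
  "Stock": "ABCD Inc."
}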
My logstash.conf looks somewhat like below -
input {
  file {
    path => "some path"
  }
}
filter {
  kv {
    source => "message"
    field_split => "\|"
    value_split => "="
  }
}
output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    cacert => "path of .cert file"
    ssl => true
    ssl_certificate_verification => true
    index => "trade-index"
    user => "elastic"
    password => ""
  }
}
Answer:
You need to update your elasticsearch output like below:
output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    cacert => "path of .cert file"
    ssl => true
    ssl_certificate_verification => true
    index => "trade-index"
    user => "elastic"
    password => ""
    # add the following to make it work as an upsert
    action => "update"
    document_id => "%{TradeID}"
    doc_as_upsert => true
  }
}
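Under the hood, with these settings Logstash sends an update-with-upsert request to Elasticsearch for each event, roughly equivalent to the following sketch (the CA path and the ES_PWD variable are placeholders for your actual cert path and password; the document body is just the parsed fields of the second sample line):

# placeholders: /path/to/ca.crt = your CA cert, $ES_PWD = the elastic password
curl -u "elastic:$ES_PWD" --cacert /path/to/ca.crt \
  -H 'Content-Type: application/json' \
  -X POST "https://localhost:9200/trade-index/_update/1234" \
  -d '{
    "doc": {
      "TABLE": "TRADE",
      "TradeID": "1234",
      "Qty": "120",
      "Price": "101.74",
      "BuyOrSell": "BUY",
      "Stock": "ABCD Inc."
    },
    "doc_as_upsert": true
  }'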
So when Logstash reads the first trade, the document with ID 1234 will not exist yet and will be upserted (i.e. created). When the second trade is read, the document already exists and will simply be updated.
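To check that no duplicates are created, you can fetch the document by its TradeID once both lines have been ingested (same placeholder cert path and password variable as above):

# placeholders: /path/to/ca.crt = your CA cert, $ES_PWD = the elastic password
curl -u "elastic:$ES_PWD" --cacert /path/to/ca.crt \
  "https://localhost:9200/trade-index/trade-index/_doc/1234"

After the second line is processed, the returned _source should show Qty 120 and Price 101.74, and a search on the index should return only one document for TradeID 1234.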