Do you use Elastic and Metricbeat for process monitoring and alerting? How did you configure your data gathering and alerting?
I am currently trying to set this up, and running into some basic issues. These issues are making me question whether Elastic is a suitable tool for alerting. Here is my planned setup:
- Use Metricbeat to gather process data
- Create an Elastic dashboard/lens for certain processes
- If the `process.cpu.start_time` from Metricbeat is very young (e.g. the process has only been running for under 5 minutes), alert!
I have been working my way through this using the following approach:
- From Metricbeat, the processes include `process.cpu.start_time`, as a text string in ISO date format. Elastic lens queries are very limited with dates.
  - Workaround: use Logstash to create a filter field `process.cpu.start_epoch`, which is an integer - the Unix epoch: "seconds since January 1, 1970".
- Create a dashboard lens, querying only my process, and only the `last` metric. This works and gives me "the time that the process started, as a Unix epoch".
- I next need to calculate the time difference between `now` and that integer. However, I don't see anything in the lens documentation about doing date math. So I'm stuck.
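For reference, the date math this last step needs is small. Here is a minimal Python sketch of it, outside of Elastic, assuming the start time arrives as an ISO 8601 string that carries a UTC marker or offset. The function names, the sample timestamps and the threshold constant are illustrative only and are not part of Metricbeat or Lens:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold taken from the "under 5 minutes" rule above.
YOUNG_PROCESS_THRESHOLD = timedelta(minutes=5)


def process_age(start_time_iso: str, now: datetime) -> timedelta:
    """Return how long a process has been running, given its ISO 8601 start time."""
    # Older Python versions reject a trailing "Z" in fromisoformat, so normalise it.
    start = datetime.fromisoformat(start_time_iso.replace("Z", "+00:00"))
    return now - start


def is_young(start_time_iso: str, now: datetime) -> bool:
    """True when the process started less than YOUNG_PROCESS_THRESHOLD ago."""
    return process_age(start_time_iso, now) < YOUNG_PROCESS_THRESHOLD


if __name__ == "__main__":
    # Fixed "now" so the example is reproducible; use datetime.now(timezone.utc) in practice.
    now = datetime(2022, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    print(is_young("2022-01-01T11:57:30.000Z", now))
    print(is_young("2022-01-01T09:00:00.000Z", now))
```

The open question is therefore only where to run this subtraction inside Elastic, since the Logstash epoch field alone does not provide a "now" to compare against.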
The difficulties I am encountering are making me wonder if I am "doing it wrong". Is Elastic/Metricbeat a suitable tool for what I am trying to achieve?
Answer: find the right hammer!
What I needed is called "Elastic runtime fields". There's a step-by-step writeup here: https://elastic-content-share.eu/elastic-runtime-field-example-repository/
Summary:
- open index
- click the "dots"
- choose "add field to index pattern"
- set output field name as desired
  - for me this is `process.cpu.start.age`
- set output type
  - for me this is "long"
- write your script in "painless"
  - for me this is `emit(new Date().getTime() - doc['process.cpu.start'].value.toEpochMilli());`
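As far as I can tell, the same runtime field can also be declared per request with `runtime_mappings` in an Elasticsearch search body, which is handy for testing the alert condition outside the UI. The Python sketch below only builds and prints such a body and sends nothing. The function name and the threshold constant are illustrative, and the script is written with `new Date()`, the Painless way to construct an object. Both `getTime()` and `toEpochMilli()` return milliseconds, so the age is in milliseconds and 5 minutes is 300000:

```python
import json

# 5 minutes in milliseconds, matching the units the Painless script emits.
FIVE_MINUTES_MS = 5 * 60 * 1000

# Painless source for the age field, using the field names from the steps above.
AGE_SCRIPT = "emit(new Date().getTime() - doc['process.cpu.start'].value.toEpochMilli());"


def build_young_process_search(max_age_ms: int = FIVE_MINUTES_MS) -> dict:
    """Build a search body that defines the age field at query time and filters on it."""
    return {
        "runtime_mappings": {
            "process.cpu.start.age": {
                "type": "long",
                "script": {"source": AGE_SCRIPT},
            }
        },
        # Match only documents whose process is younger than the threshold.
        "query": {"range": {"process.cpu.start.age": {"lt": max_age_ms}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_young_process_search(), indent=2))
```

The printed JSON can be pasted into a `_search` request against the Metricbeat index to check which processes would trigger the alert.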
PS: I deleted my Logstash filters, because they were superfluous.