Fetch the details of events that occurred exactly x times in a desired duration

Time:06-02

In Elasticsearch, I need to fetch records only if the event name occurred exactly x times within n days or some other specific duration.

Sample index data is as below:

{"event":{"name":"event1"},"timestamp":"2010-06-20"}

I'm able to get the records when the desired event name occurs at least a minimum number of times in a particular duration. But instead of a minimum, I want an exact match on the count. Here's what I tried:

{
  "_source": true,
  "size": 0, 
  "query": { 
    "bool": {
      "filter":
      {
        "range": { "timestamp": { "gte": "2010", "lte": "2016" }}
      },
      "must":
      [
        { "match": { "event.name.keyword": "event1" }}
      ]
    }
  },
  "aggs": {
    "occurrence": {
      "terms": {
        "field": "event.name.keyword",
        "min_doc_count": 5,
        "size": 10
      }
    }
  }
}

Another way to get the count is a value_count aggregation. But here as well, I'm unable to add a condition that matches an exact number of occurrences.

{
  "_source": true,
  "size": 0, 
  "query": { 
    "bool": {
      "filter":
      {
        "range": { "timestamp": { "gte": "2010", "lte": "2016" }}
      },
      "must":
      [
        { "match": { "event.name.keyword": "event1" }}
      ]
    }
  },
  "aggs": {
    "occurrence": {
      "value_count": {
        "field": "event.name.keyword"
      }
    }
  }
}

It produces the following output (the rest is removed for brevity):

  "aggregations" : {
    "occurrence" : {
      "value" : 2
    }
  }

But I need to add a condition on the output of the aggregation (occurrence here) that matches the count exactly, so that I get the records only if the event occurred exactly x times.

Can some ES experts help me with this?

CodePudding user response:

You can use a Bucket Selector Aggregation and add a condition on the count, as shown below. The query below returns only the events that occur exactly 5 times in total. You can add a query clause for whatever filter you want to apply, such as a date range or an event name.

{
  "size": 0,
  "aggs": {
    "count": {
      "terms": {
        "field": "event.name.keyword",
        "size": 10
      },
      "aggs": {
        "val_count": {
          "value_count": {
            "field": "event.name.keyword"
          }
        },
        "selector": {
          "bucket_selector": {
            "buckets_path": {
              "my_var1": "val_count"
            },
            "script": "params.my_var1 == 5"
          }
        }
      }
    }
  }
}

You will get a result like the one below:

"aggregations" : {
    "count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "event1",
          "doc_count" : 5,
          "val_count" : {
            "value" : 5
          }
        },
        {
          "key" : "event8",
          "doc_count" : 5,
          "val_count" : {
            "value" : 5
          }
        }
      ]
    }
  }
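
To restrict the count to a duration and also return the matching records, the bucket selector could be combined with the range filter from the question, roughly as in the sketch below. This is only an illustration under a few assumptions: the aggregation names doc_total and sample_docs are made up for this example, the built-in _count path (the bucket's document count) stands in for the separate value_count aggregation, and the top_hits size of 5 is chosen to match the expected count.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "range": { "timestamp": { "gte": "2010", "lte": "2016" } } }
      ]
    }
  },
  "aggs": {
    "count": {
      "terms": {
        "field": "event.name.keyword",
        "size": 10
      },
      "aggs": {
        "sample_docs": {
          "top_hits": {
            "size": 5,
            "_source": ["event.name", "timestamp"]
          }
        },
        "selector": {
          "bucket_selector": {
            "buckets_path": {
              "doc_total": "_count"
            },
            "script": "params.doc_total == 5"
          }
        }
      }
    }
  }
}
```

For a rolling window of n days, the range bounds could use date math instead, for example "gte": "now-7d/d". Keep in mind that bucket_selector only filters the buckets that the terms aggregation actually returns, so with "size": 10 an event outside the ten most frequent names would not show up even if it occurred exactly 5 times; a larger size may be needed when there are many distinct event names.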