Home > Back-end >  How to get the sum of a value using an id condition over a date range with Elasticsearch?
How to get the sum of a value using an id condition over a date range with Elasticsearch?

Time:10-29

I'm trying to write a query to get the sum of a value per month of documents with a particular Id. To do this I'm trying:

query = {
    "size": 0,
    "aggs" : {
        "articles_over_time" : {
            "date_histogram" : {
                "field" : "timestamp",
                "interval" : "month"
            }
        },
        "value": {
            "sum": {
                "field": "generatedTotal"
            }
        }
    }
}  

This query will give me the sum of generatedTotal per month, but it is giving me the sum of generatedTotal for all documents. How can I specify to get the sum of generatedTotal per month for a particular generatorId?

Example of a document in the Elasticsearch index:

{'id': 0, 'timestamp': '2018-01-01', 'generatorId': '150', 'generatedTotal': 2166.8759558092734}

CodePudding user response:

If you do it separately like that, it counts as 2 different aggregations. You first need to query for the specific generatorId that you want, then do the second aggs within the first aggs:

{
    "size": 0,
    "query": {
        "term": {
            "generatorId": "150"
        }
    },
    "aggs": {
        "articles_over_time": {
            "date_histogram": {
                "field": "timestamp",
                "interval": "month"
            },
            "aggs": {
                "monthlyGeneratedTotal": {
                    "sum": {
                        "field": "generatedTotal"
                    }
                }
            }
        }
    }
}

4 sample documents (1 with different generatorId, and not be counted in the aggregations)

{"timestamp": "2018-02-01", "generatedTotal": 3, "generatorId": "150"}
{"timestamp": "2018-01-01", "generatedTotal": 1, "generatorId": "150"}
{"timestamp": "2018-01-01", "generatedTotal": 2, "generatorId": "150"}
{"timestamp": "2018-01-01", "generatedTotal": 2, "generatorId": "160"}

Then you will have the aggregations as follow:

{
"aggregations": {
    "articles_over_time": {
      "buckets": [
        {
          "key_as_string": "2018-01-01T00:00:00.000Z",
          "key": 1514764800000,
          "doc_count": 2,
          "monthlyGeneratedTotal": {
            "value": 3.0
          }
        },
        {
          "key_as_string": "2018-02-01T00:00:00.000Z",
          "key": 1517443200000,
          "doc_count": 1,
          "monthlyGeneratedTotal": {
            "value": 3.0
          }
        }
      ]
    }
  }
}

I hope this answers your question.

  • Related