Home > Enterprise >  ElasticSearch - how to get aggregation of aggregation
ElasticSearch - how to get aggregation of aggregation

Time:03-24

I am very new to elasticsearch I work for a dating website that has data as follows: Single - with fields: name, signUpDate, state, and other data fields.

Encounter - with fields: state, encounterDate, singlesInvolved, and other data fields.

These are my 2 indexes

Now I have to write a query that returns as follows:

For every state, how many singles, how many encounters, the longest time a single has been part of our website, and the average time a single has been part of our website

And also return one result that is that same average for all states Like this example:

[
 { //this one is the average of all states
    "singles": 45,
    "dates": 18,
    "minWaitingTime": 1644677979530,
    "avgWaitingTime": 15603
  },
  { //these are the averages of each state
    "state": "MA",
    "singles": 50,
    "dates": 23,
    "minWaitingTime": 1644677979530,
    "avgWaitingTime": 15603
  },
  {
    "state": "NY",
    "singles": 39,
    "dates": 13,
    "minWaitingTime": 1644850558872,
    "avgWaitingTime": 6033
  }
]

I've been working on the query for each state individually but i dont know how to get an average of all states

so far what i have is this:

GET /single,encounter/_search 
{
  "size": 0, 
  "aggs": {
    "bystate": {
      "terms": {
        "field": "state",
        "size": 59
      },
      "aggs": {
        "group-by-index": {
          "terms": {
            "field": "_index"
          }
        },
        "min_date": {
          "min": {
            "field": "signedUpAt"
          }
        },
        "avg_date": {
          "avg": {
            "field": "signedUpAt"
          }
        }
      }
    }
  }
}

I don't know if there is a better way to do this, likewise I don't know how to calculate the average (singles, encounters, min_date and average_date average) for all states using this result Every result of the previous query looks like this:

{
      "key" : "MA",
      "doc_count" : 164,
      "avg_date" : {
        "value" : 1.6457900076508965E12,
        "value_as_string" : "2022-02-25T11:53:27.650"
      },
      "min_date" : {
        "value" : 1.64467797953E12,
        "value_as_string" : "2022-02-12T14:59:39.530"
      },
      "group-by-index" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "single",
            "doc_count" : 135
          },
          {
            "key" : "encounter",
            "doc_count" : 29
          }
        ]
      }
    },

I would really appreciate help on this one

Addition: index mapping. Encounter:

{
  "encounter" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        },
        "avgAge" : {
          "type" : "integer",
          "index" : false,
          "doc_values" : false
        },
        "application" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "createdAt" : {
          "type" : "date",
          "format" : "date_hour_minute_second_millis"
        },
        "encounterId" : {
          "type" : "keyword"
        },
        "locationType" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        },
        "singleOneId" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        },
        "singleTwoId" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        },
        "serviceLine" : {
          "type" : "keyword"
        },
        "state" : {
          "type" : "keyword"
        },
        "rating" : {
          "type" : "keyword"
        }
      }
    },
    "settings" : {
      "index" : {
        "refresh_interval" : "1s",
        "number_of_shards" : "1",
        "provided_name" : "encounter",
        "creation_date" : "1643704661932",
        "number_of_replicas" : "1",
        "uuid" : "MliXQL_bRBKDN7_d8G_BYw",
        "version" : {
          "created" : "7100299"
        }
      }
    }
  }
}

And Single:

{
  "single" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        },
        "id" : {
          "type" : "keyword"
        },
        "singleId" : {
          "type" : "keyword"
        },
        "state" : {
          "type" : "keyword"
        },
        "preferedGender" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "refresh_interval" : "1s",
        "number_of_shards" : "1",
        "provided_name" : "single",
        "creation_date" : "1643704662136",
        "number_of_replicas" : "1",
        "uuid" : "Js_tqZfRRx-IxbjVRRN4wQ",
        "version" : {
          "created" : "7100299"
        }
      }
    }
  }
}

CodePudding user response:

You can use avg bucket aggregation, where you can provide bucket_path and based on value it will calculate avg of entire aggregation.

Below is sample query:

{
  "size": 0,
  "aggs": {
    "bystate": {
      "terms": {
        "field": "state",
        "size": 59
      },
      "aggs": {
        "group-by-index": {
          "terms": {
            "field": "_index"
          }
        },
        "min_date": {
          "min": {
            "field": "signedUpAt"
          }
        },
        "avg_date": {
          "avg": {
            "field": "signedUpAt"
          }
        }
      }
    },
    "avg_all_state": {
      "avg_bucket": {
        "buckets_path": "bystate>avg_date"
      }
    }
  }
}
  • Related