Distinct query in ElasticSearch

Time:11-30

I have an index in which one field (category) holds a list of values. I want to fetch all the distinct category lists in the index.

Here is an example:

Doc1 - 
{
    "category": [1,2,3,4]
}


Doc2 - 
{
    "category": [5,6]
}


Doc3 - 
{
    "category": [1,2,3,4]
}


Doc4 - 
{
    "category": [1,2,7]
}

My output should be

[1,2,3,4]
[5,6]
[1,2,7]

I am using the query below:

GET /products/_search
{
"size": 0,
"aggs" : {
    "category" : {
        "terms" : { "field" : "category",  "size" : 1500 }
    }
}}

This returns [1], [2], [3], [4], [5], [6], [7]. I don't want the individual unique items from my list field; I'm looking for the unique complete lists.

What am I missing in the above query? I'm using ElasticSearch v7.10
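
To illustrate the difference between the two behaviours, here is a small plain-Python sketch (not Elasticsearch code) that mimics per-value bucketing versus per-list bucketing on the four sample documents. The variable names `per_value` and `per_list` are made up for this illustration.

```python
from collections import Counter

# Sample documents copied from the question.
docs = [
    {"category": [1, 2, 3, 4]},
    {"category": [5, 6]},
    {"category": [1, 2, 3, 4]},
    {"category": [1, 2, 7]},
]

# Per-value bucketing: every element of the list gets its own bucket,
# which is the result the plain terms aggregation produces.
per_value = Counter(v for d in docs for v in d["category"])

# Per-list bucketing: each complete list is treated as a single key,
# which is the result the question asks for.
per_list = Counter(tuple(d["category"]) for d in docs)

print(sorted(per_value))  # [1, 2, 3, 4, 5, 6, 7]
print(list(per_list))     # [(1, 2, 3, 4), (5, 6), (1, 2, 7)]
```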

CodePudding user response:

You can use a terms aggregation with a script:

{
  "size": 0,
  "aggs": {
    "category":{
      "terms": {
       "script": {
         "source": """
         def cat = "";
         for (int i = 0; i < doc['category'].length; i++) {
           cat += doc['category'][i];
         }
         return cat;
         """
       }
      }
    }
  }
}

The above query will return a result like the following:

"aggregations": {
    "category": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "1234",
          "doc_count": 2
        },
        {
          "key": "127",
          "doc_count": 1
        },
        {
          "key": "56",
          "doc_count": 1
        }
      ]
    }
  }
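
One caveat with keys built this way: because the values are concatenated without a separator, different lists can produce the same key (for example, [1, 2, 3] and [1, 23] would both become "123"). The plain-Python sketch below mimics the key building to show the collision and how a separator avoids it. `bucket_key` is a hypothetical helper, not an Elasticsearch API, and the `sorted()` call reflects the assumption that numeric doc values are read in ascending order. In the script itself, the same idea would mean appending a delimiter such as "," between values inside the loop.

```python
from collections import Counter


def bucket_key(values, sep=""):
    """Join one document's values into a single string key (hypothetical helper).

    sorted() mimics the assumption that numeric doc values come back
    in ascending order.
    """
    return sep.join(str(v) for v in sorted(values))


# Category lists of the four sample documents from the question.
docs = [[1, 2, 3, 4], [5, 6], [1, 2, 3, 4], [1, 2, 7]]

# Same keys and counts as in the aggregation result shown above.
buckets = Counter(bucket_key(d) for d in docs)
print(dict(buckets))  # {'1234': 2, '56': 1, '127': 1}

# Without a separator, two different lists collapse into one key.
print(bucket_key([1, 2, 3]) == bucket_key([1, 23]))            # True

# With a separator, the keys stay distinct.
print(bucket_key([1, 2, 3], ",") == bucket_key([1, 23], ","))  # False
```

If the category values can have more than one digit, a separator keeps the bucket keys unambiguous and also makes them easy to split back into a list on the client side.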