I am looking for elastic search aggregation mapping
that will return the most common list for a certain field.
For example for docs:
{"ToneCurvePV2012": [1,2,3]}
{"ToneCurvePV2012": [1,5,6]}
{"ToneCurvePV2012": [1,7,8]}
{"ToneCurvePV2012": [1,2,3]}
I wish for the aggregation result: [1,2,3] (since it appears twice).
so far any aggregation that i made would return: 1
CodePudding user response:
This is not possible with default terms aggregation. You need to use terms aggregation with script. Please note that this might impact your cluster performance.
Here, i have used script which will create string from array and used it for aggregation. so if you have array value like [1,2,3]
then it will create string representation of it like '[1,2,3]'
and that key will be used for aggregation.
Below is sample query you can use to generate aggregation as you expected:
POST index1/_search
{
"size": 0,
"aggs": {
"tone_s": {
"terms": {
"script": {
"source": "def value='['; for(int i=0;i<doc['ToneCurvePV2012'].length;i ){value= value doc['ToneCurvePV2012'][i] ',';} value = ']'; value = value.replace(',]', ']'); return value;"
}
}
}
}
}
Output:
{
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"tone_s" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "[1,2,3]",
"doc_count" : 2
},
{
"key" : "[1,5,6]",
"doc_count" : 1
},
{
"key" : "[1,7,8]",
"doc_count" : 1
}
]
}
}
}
PS: key will be come as string and not as array in aggregation response.