I have a question about aggregations.
I want to aggregate on a field that holds an array of values, treating the whole array as a single value instead of aggregating on each element.
I have the following index and documents:
PUT value-list-index
{
  "mappings": {
    "properties": {
      "server": {
        "type": "keyword"
      },
      "users": {
        "type": "keyword",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
PUT value-list-index/_doc/1
{
  "server": "server1",
  "users": ["user1"]
}
PUT value-list-index/_doc/2
{
  "server": "server2",
  "users": ["user1", "user2"]
}
PUT value-list-index/_doc/3
{
  "server": "server3",
  "users": ["user2", "user3"]
}
PUT value-list-index/_doc/4
{
  "server": "server4",
  "users": ["user1", "user2", "user3", "user4"]
}
PUT value-list-index/_doc/5
{
  "server": "server5",
  "users": ["user2", "user3", "user4"]
}
PUT value-list-index/_doc/6
{
  "server": "server6",
  "users": ["user3", "user4"]
}
PUT value-list-index/_doc/7
{
  "server": "server7",
  "users": ["user1", "user2", "user3", "user4"]
}
PUT value-list-index/_doc/8
{
  "server": "server8",
  "users": ["user1", "user2", "user3", "user4"]
}
PUT value-list-index/_doc/9
{
  "server": "server9",
  "users": ["user1", "user2", "user3", "user4"]
}
This is the query I am running:
GET value-list-index/_search
{
  "size": 0,
  "aggs": {
    "words": {
      "terms": {
        "field": "users"
      },
      "aggs": {
        "total": {
          "value_count": {
            "field": "users"
          }
        }
      }
    }
  }
}
I want a result like the following, with one bucket per distinct array:
"aggregations" : {
  "words" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "user1",
        "doc_count" : 1,
        "total" : {
          "value" : xx
        }
      },
      {
        "key" : "user1","user2",
        "doc_count" : 1,
        "total" : {
          "value" : xx
        }
      },
      {
        "key" : "user1","user2","user3","user4",
        "doc_count" : 4,
        "total" : {
          "value" : xx
        }
      }
    ]
  }
}
But the query returns one bucket per individual element, like this:
"aggregations" : {
  "words" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "user2",
        "doc_count" : 7,
        "total" : {
          "value" : 23
        }
      },
      {
        "key" : "user3",
        "doc_count" : 7,
        "total" : {
          "value" : 23
        }
      },
      {
        "key" : "user1",
        "doc_count" : 6,
        "total" : {
          "value" : 19
        }
      },
      {
        "key" : "user4",
        "doc_count" : 6,
        "total" : {
          "value" : 21
        }
      }
    ]
  }
}
Is the aggregation I want possible?
Answer:
Maybe this aggregation can help you: the Frequent items aggregation.
But be careful with its performance.
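A request for it could look roughly like the sketch below. The parameter values (minimum_set_size, minimum_support, size) are my assumptions for this nine-document sample, not necessarily the settings behind any particular result, and older 8.x releases may expose the aggregation under the name frequent_items instead of frequent_item_sets.

```
# Sketch only: the three numeric parameters are assumed values.
GET value-list-index/_search
{
  "size": 0,
  "aggs": {
    "words": {
      "frequent_item_sets": {
        "minimum_set_size": 1,
        "minimum_support": 0.4,
        "size": 10,
        "fields": [
          { "field": "users" }
        ]
      }
    }
  }
}
```

Raising minimum_support or lowering size reduces the number of item sets the aggregation has to track, which is the main lever for keeping it cheap on larger indices.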
Look at these results:
"aggregations": {
  "words": {
    "buckets": [
      {
        "key": {
          "users": ["user2"]
        },
        "doc_count": 7,
        "support": 0.7777777777777778
      },
      {
        "key": {
          "users": ["user2", "user3"]
        },
        "doc_count": 6,
        "support": 0.6666666666666666
      },
      {
        "key": {
          "users": ["user3", "user4"]
        },
        "doc_count": 6,
        "support": 0.6666666666666666
      },
      {
        "key": {
          "users": ["user1"]
        },
        "doc_count": 6,
        "support": 0.6666666666666666
      },
      {
        "key": {
          "users": ["user2", "user3", "user4"]
        },
        "doc_count": 5,
        "support": 0.5555555555555556
      },
      {
        "key": {
          "users": ["user2", "user1"]
        },
        "doc_count": 5,
        "support": 0.5555555555555556
      },
      {
        "key": {
          "users": ["user2", "user3", "user4", "user1"]
        },
        "doc_count": 4,
        "support": 0.4444444444444444
      }
    ]
  }
}
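Note that these buckets count every document that contains the item set, so the set with only user2 gets 7 even though no document holds exactly that array. If you only want exact whole-array matches, one alternative (my own suggestion, not something the output above demonstrates) is a terms aggregation on a script that joins the values into a single key. This is an untested sketch: it assumes the Painless join helper is available on doc values in your version, and it relies on keyword doc values being returned sorted and de-duplicated, so arrays with the same members in a different order end up in the same bucket.

```
# Sketch only: groups documents by the joined contents of "users".
# The "," separator is an arbitrary choice; pick one that cannot occur in a user name.
GET value-list-index/_search
{
  "size": 0,
  "aggs": {
    "words": {
      "terms": {
        "script": {
          "source": "doc['users'].join(',')"
        }
      }
    }
  }
}
```

Script-based terms aggregations run the script for every matching document, so on a large index it is probably cheaper to store the joined value in its own keyword field at index time and aggregate on that field instead.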