I have a following format which has duplicate ID field.
ID PDE_ID Curency
1 21 USD 35
1 23 USD 34
2 25 CAD 43
3 26 INR 33
When there is a duplicate ID field , we need to pick the latest record by PDE_ID column, and with the result we need to do aggregation like Sum, Min, Max, value_count.
I tried top_result and max, but both doesn't support sub-aggregation. So i can do distinct and latest record, but cannot do any aggregation(sum/min/max/count) top of it.
Any help is much appreciated.
CodePudding user response:
Whoever is stuck, pls find the below query:
GET dispute-1-2022-04-*/_search?size=0
{
"aggs": {
"Duplicates": {
"terms": {
"field": "PDE.keyword"
},
"aggs": {
"div_id": {
"terms": {
"field": "PDE_DETAIL_ID.keyword",
"order": {
"_term": "desc"
},
"size": 1
},
"aggs": {
"individual_sum": {
"sum": {
"field": "DIV_ID"
}
}
}
},
"max_cal": {
"max_bucket": {
"buckets_path": "div_id>individual_sum"
}
}
}
},
"total_min": {
"max_bucket": {
"buckets_path": "Duplicates>max_cal"
}
}
}
}