In my index, I have documents like this:
{
  "name": "name",
  "createdAt": 1.6117508295E12
}
{
  "name": "name1",
  "createdAt": 1.6117508296E12
}
{
  "name": "name",
  "createdAt": 1.6117508297E12
}
I want to write a query that compares the name field between documents and returns only unique results, one document per name. The result should look like this:
{
  "name": "name1",
  "createdAt": 1.6117508296E12
}
{
  "name": "name",
  "createdAt": 1.6117508297E12
}
I am also using from and size in my Elasticsearch query. I have tried using collapse, but that returns fewer results than the requested size.
I am using Elasticsearch 7.15.2.
Answer:
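You can use a terms aggregation on the name field with a top_hits sub-aggregation (size=1, sorted by createdAt). Below is a working query on the sample data you provided. It sorts in asc order, which returns the oldest document for each name; to get the newest document for each name, as in your expected output, change the order to desc.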
{
  "size": 0,
  "aggs": {
    "unique": {
      "terms": {
        "field": "name.keyword"
      },
      "aggs": {
        "unique_names": {
          "top_hits": {
            "sort": [
              {
                "createdAt": {
                  "order": "asc"
                }
              }
            ],
            "size": 1
          }
        }
      }
    }
  }
}
And the search result:
"aggregations": {
  "unique": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "name",
        "doc_count": 2,
        "unique_names": {
          "hits": {
            "total": {
              "value": 2,
              "relation": "eq"
            },
            "max_score": null,
            "hits": [
              {
                "_index": "71625371",
                "_id": "1",
                "_score": null,
                "_source": {
                  "name": "name",
                  "createdAt": 1.6117508295E12
                },
                "sort": [
                  1.61175083E12
                ]
              }
            ]
          }
        }
      },
      {
        "key": "name1",
        "doc_count": 1,
        "unique_names": {
          "hits": {
            "total": {
              "value": 1,
              "relation": "eq"
            },
            "max_score": null,
            "hits": [
              {
                "_index": "71625371",
                "_id": "2",
                "_score": null,
                "_source": {
                  "name": "name1",
                  "createdAt": 1.6117508296E12
                },
                "sort": [
                  1.61175083E12
                ]
              }
            ]
          }
        }
      }
    ]
  }
}
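If you build the request from application code, the same idea can be sketched in Python as below. This is only an illustration under a few assumptions: the helper names build_unique_query and unique_docs and the max_names parameter are hypothetical, the order defaults to desc so that the newest document per name is picked, and sending the body to the cluster is left out. Keep in mind that the top-level from and size do not page through aggregation buckets; the size inside the terms aggregation (10 by default) caps how many unique names come back.

```python
from typing import Any, Dict, List


def build_unique_query(field: str = "name.keyword",
                       sort_field: str = "createdAt",
                       order: str = "desc",
                       max_names: int = 10) -> Dict[str, Any]:
    """Build a terms + top_hits request body (hypothetical helper)."""
    return {
        "size": 0,  # skip regular hits; only the aggregation is needed
        "aggs": {
            "unique": {
                # one bucket per distinct name, at most max_names buckets
                "terms": {"field": field, "size": max_names},
                "aggs": {
                    "unique_names": {
                        # keep a single document per bucket
                        "top_hits": {
                            "sort": [{sort_field: {"order": order}}],
                            "size": 1,
                        }
                    }
                },
            }
        },
    }


def unique_docs(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the _source of the single top hit in every bucket."""
    buckets = response["aggregations"]["unique"]["buckets"]
    return [b["unique_names"]["hits"]["hits"][0]["_source"] for b in buckets]


# Trimmed copy of the response shown above (produced with asc order).
sample_response = {
    "aggregations": {
        "unique": {
            "buckets": [
                {
                    "key": "name",
                    "doc_count": 2,
                    "unique_names": {"hits": {"hits": [
                        {"_id": "1",
                         "_source": {"name": "name",
                                     "createdAt": 1.6117508295E12}}
                    ]}},
                },
                {
                    "key": "name1",
                    "doc_count": 1,
                    "unique_names": {"hits": {"hits": [
                        {"_id": "2",
                         "_source": {"name": "name1",
                                     "createdAt": 1.6117508296E12}}
                    ]}},
                },
            ]
        }
    }
}

if __name__ == "__main__":
    print(build_unique_query())
    print(unique_docs(sample_response))
```

The first function only produces the request body, so it can be passed to whichever client you already use; the second flattens the bucketed response back into a plain list with one document per name.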