I am using the Java API for Elasticsearch and I am trying to get only the latest version (which is a custom field) of each document when executing a search.
For example:
{ id: 1, name: "John Greenwood", version: 1}
{ id: 1, name: "John Greenwood", version: 2}
{ id: 2, name: "John Underwood", version: 1}
When searching for John, I want this result:
{ id: 1, name: "John Greenwood", version: 2}
{ id: 2, name: "John Underwood", version: 1}
Apparently I am supposed to use aggregations, but I'm not sure how to use them with the Java API. Also, how can I group the documents by ID? I only want the latest version for each ID.
CodePudding user response:
TL;DR
Yes, you are on the right track.
You will want to aggregate on the id of each user, then get the top hit per id with regard to the version.
Solution
The first aggregation, per_id, groups users by their id. Inside this aggregation we perform another one, lastest_version, which selects the best hit with regard to the version. I set size: 1 to get the top 1 per group.
GET 74550367/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "per_id": {
      "terms": {
        "field": "id"
      },
      "aggs": {
        "lastest_version": {
          "top_hits": {
            "sort": [
              {
                "version": {
                  "order": "desc"
                }
              }
            ],
            "size": 1
          }
        }
      }
    }
  }
}
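Since the question asks about Java, here is a minimal sketch of sending the same query from Java. It uses only the JDK's built-in java.net.http client (Java 15+ for the text block) rather than the Elasticsearch client library. The class name, the base URL http://localhost:9200, and the absence of authentication are assumptions for illustration. The _search endpoint accepts POST as well as GET.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class LatestVersionSearch {

    // Same request body as the query above.
    static final String QUERY = """
        {
          "query": { "match_all": {} },
          "aggs": {
            "per_id": {
              "terms": { "field": "id" },
              "aggs": {
                "lastest_version": {
                  "top_hits": {
                    "sort": [ { "version": { "order": "desc" } } ],
                    "size": 1
                  }
                }
              }
            }
          }
        }
        """;

    // baseUrl is an assumption, e.g. "http://localhost:9200" for an unsecured local node.
    static HttpRequest buildRequest(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(QUERY))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildRequest("http://localhost:9200", "74550367"),
                HttpResponse.BodyHandlers.ofString());
        // The latest document per id is found under
        // aggregations.per_id.buckets[*].lastest_version.hits.hits
        System.out.println(response.body());
    }
}
```

The official Java client offers typed builders for the same aggregations. This sketch sends the raw JSON instead so that no library-specific call has to be assumed. To search for a name, the match_all clause can be swapped for a match query on the name field.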
To Reproduce
POST _bulk
{ "index": {"_index":"74550367"}}
{ "id": 1, "name": "John Greenwood", "version": 1}
{ "index": {"_index":"74550367"}}
{ "id": 1, "name": "John Greenwood", "version": 2}
{ "index": {"_index":"74550367"}}
{ "id": 2, "name": "John Underwood", "version": 1}
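To check what the aggregation should return for these three documents, the same logic (group by id, keep the highest version) can be mimicked in plain Java. This is only an in-memory illustration, not Elasticsearch code. The Doc record and the latestPerId helper are hypothetical names introduced for this sketch (Java 16+ for records).

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

final class LatestVersionDemo {

    // Hypothetical in-memory stand-in for an indexed document.
    record Doc(int id, String name, int version) {}

    // Group by id (like the terms aggregation) and keep the document with
    // the highest version in each group (like top_hits with size 1).
    static Map<Integer, Doc> latestPerId(List<Doc> docs) {
        return docs.stream().collect(Collectors.toMap(
                Doc::id,
                d -> d,
                BinaryOperator.maxBy(Comparator.comparingInt(Doc::version)),
                TreeMap::new));
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc(1, "John Greenwood", 1),
                new Doc(1, "John Greenwood", 2),
                new Doc(2, "John Underwood", 1));
        latestPerId(docs).values().forEach(System.out::println);
    }
}
```

For the sample data this keeps version 2 for id 1 and version 1 for id 2, which is what the lastest_version bucket of each per_id group should contain.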