Elasticsearch sorting with specific importance for field values

Time:10-20

I am using Java with Spring Data and the Elasticsearch 6.8.14 API to communicate with Elasticsearch. I have an index that returns data like the following (I am including this search result to show the mapping structure as well):

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "rgt",
        "_type" : "carindexeddata",
        "_id" : "6020354",
        "_score" : 1.0,
        "_source" : {
          "id" : "4441",
          "version" : null,
          "carId" : "1263",
          "mark" : "ford",
          "colour" : "green",
          "status" : "Approved",
        
......

So basically I store cars. Now I need to sort them before returning them to the user. They have to be sorted by:

 - mark
 - colour (within the same mark, the colour order matters)
 - status

For status, the sort order should be as follows:

1. BOUGHT
2. IN PRODUCTION
3. IN TESTS
4. APPROVED

So the following order of cars would be OK:

1. Ford Black Bought
2. Ford Black Approved
3. Ford White Bought
4. GMC White Bought
5. GMC White Approved

Which mechanism in Elasticsearch could I use to sort items that way? Is it possible to implement? Can you show an example? Sorting by the fields mark, colour, status as they are is not correct, because status needs custom logic: it is not alphabetical sorting but sorting by weight. How can I assign specific weights to specific statuses in Elasticsearch? Should I store a numeric field for each status in Elasticsearch and sort by that field instead of sorting by the status field directly?

CodePudding user response:

For the status field, you can use a script sort:

{
  "sort": [
    { "mark.keyword": "asc" },
    { "colour.keyword": "asc" },
    {
      "_script": {
        "type": "number",
        "script": {
          "lang": "painless",
          "source": """
            // Map each status to a numeric weight; APPROVED and any other value get 4.
            String status = doc['status.keyword'].value.toUpperCase();
            if (status == "BOUGHT") return 1;
            else if (status == "IN PRODUCTION") return 2;
            else if (status == "IN TESTS") return 3;
            else return 4;
          """
        },
        "order": "asc"
      }
    }
  ]
}
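
To sanity-check the intended ordering outside Elasticsearch, the same three-key comparison can be sketched in plain Java. This is only an illustration: the CarOrderCheck class and its Car holder are hypothetical names, the comparison on mark and colour is a plain case-sensitive string comparison, and unknown statuses are assumed to sort last.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Illustrative only: reproduces the mark -> colour -> status-weight ordering in memory.
final class CarOrderCheck {
    // The position in this list is the status weight; the order is taken from the question.
    static final List<String> STATUS_ORDER =
            Arrays.asList("BOUGHT", "IN PRODUCTION", "IN TESTS", "APPROVED");

    // Hypothetical holder for the three fields that take part in sorting.
    static final class Car {
        final String mark;
        final String colour;
        final String status;

        Car(String mark, String colour, String status) {
            this.mark = mark;
            this.colour = colour;
            this.status = status;
        }

        @Override
        public String toString() {
            return mark + " " + colour + " " + status;
        }
    }

    // Case-insensitive lookup of the weight; unknown statuses sort after all known ones.
    static int weight(Car car) {
        int index = STATUS_ORDER.indexOf(car.status.toUpperCase(Locale.ROOT));
        return index < 0 ? STATUS_ORDER.size() : index;
    }

    // First by mark, then by colour, then by status weight, all ascending.
    static final Comparator<Car> ORDER = Comparator
            .comparing((Car car) -> car.mark)
            .thenComparing(car -> car.colour)
            .thenComparingInt(CarOrderCheck::weight);

    // Returns the cars as strings in sorted order without modifying the input list.
    static List<String> sorted(List<Car> cars) {
        List<Car> copy = new ArrayList<>(cars);
        copy.sort(ORDER);
        List<String> result = new ArrayList<>();
        for (Car car : copy) {
            result.add(car.toString());
        }
        return result;
    }
}
```

Feeding the five example cars from the question into sorted(...) in any order should give back the order listed there, which is a quick way to confirm the weights before relying on the script.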

Scripts are slow: a script sort is evaluated for every matching document at query time.

Elasticsearch works best when data is preprocessed. If you can store a numeric field that represents the status weight, sorting on that field will perform better. Check which approach works best for your case.
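
A minimal sketch of the preprocessing approach, assuming the weight is computed in Java just before the document is indexed. The StatusRank class, the withRank helper and the "statusRank" field name are hypothetical, not part of Spring Data or the Elasticsearch client; unknown or missing statuses are assumed to get rank 5 so they sort last, which differs slightly from the script above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: turns a status string into a numeric rank that can be
// stored next to the status when a car document is indexed.
final class StatusRank {
    // Assumed rank for unknown or missing statuses, so they sort after APPROVED.
    static final int UNKNOWN = 5;

    // Case-insensitive mapping; the order is the one given in the question.
    static int of(String status) {
        if (status == null) {
            return UNKNOWN;
        }
        switch (status.trim().toUpperCase(Locale.ROOT)) {
            case "BOUGHT":
                return 1;
            case "IN PRODUCTION":
                return 2;
            case "IN TESTS":
                return 3;
            case "APPROVED":
                return 4;
            default:
                return UNKNOWN;
        }
    }

    // Copies the document source and adds the hypothetical "statusRank" field.
    static Map<String, Object> withRank(Map<String, Object> source) {
        Map<String, Object> copy = new HashMap<>(source);
        Object status = source.get("status");
        copy.put("statusRank", of(status == null ? null : status.toString()));
        return copy;
    }
}
```

With such a field mapped as an integer, the sort would become three plain field sorts (mark.keyword, colour.keyword, then statusRank, all ascending) and no script would be needed. Documents that are already indexed would have to be reindexed or updated to receive the new field.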
