Home > Back-end >  Elasticsearch: order by length of array properties
Elasticsearch: order by length of array properties

Time:09-29

im using Kibana and ES, i have an index with objects and an array called "reviews" that inside has properties called positive_comment, negative_comment, reviewer_name and more.

There are some reviews that have the field positive_comment empty (not null, just empty).

I need to order by the length of the field positive_comment first, so reviews with empty positive_comment comes last. The results are the same when ordering in SQL with LEN() property.

This is my query. I also tried: .value.size() in the script, or type "string" but no results.

{
   "_source":[
      "reviews.positive_comment"
   ],
   "query":{
      "bool":{
         "filter":[
            {
               "term":{
                  "id":214
               }
            }
         ]
      }
   },
   "sort":{
      "_script":{
         "script":"doc['reviews.positive_comment'].value.length()",
         "type":"number",
         "order":"asc"
      }
   }
}

This is my result:

{
   "_source":{
      "reviews":[
         {
            "positive_comment":"Great"
         },
         {
            "positive_comment":"Really good product"
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":""
         },
         {
            "positive_comment":"Awesome"
         }
      ]
   },
   "sort":[
      "0"
   ]
}

CodePudding user response:

Elasticsearch doesn't support counting array elements, unless you use a script.

However, running a script for every matching document is going to degrade performance for every search query.

A better solution would be to count the values once, at index-time, and store the counts in dedicated fields (positive_comments_count, negative_comments_count, etc.) and use these fields for sorting.

CodePudding user response:

Try following. It worked for me on ES 5.6.3. So, should work on higher versions too.

GET test/test/_search?filter_path=hits.hits
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "type": "number",
      "script": {
        "lang": "painless",
        "source": "doc['reviews.positive_comment'].value.length()"
      },
      "order": "asc"
    }
  }
}

I have tested on ES 7.1 version too.

Mapping

PUT test
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "somefield": {
        "type": "keyword"
      },
      "reviews": {
        "properties": {
          "positive_comment": {
            "type": "keyword"
          },
          "item_id": {
            "type": "double"
          }
        }
      }
    }
  }
}

Query:

GET test/_search
{
  "query": {
    "match_all": {}
  },
  "_source": "reviews.positive_comment", 
  "sort": {
    "_script": {
      "type": "number",
      "script": {
        "lang": "painless",
        "source": "doc['reviews.positive_comment'].value.length() % 100"
      },
      "order": "asc"
    }
  }
}

output

{
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "test5",
        "_score" : null,
        "_source" : {
          "reviews" : [
            {
              "positive_comment" : ""
            }
          ]
        },
        "sort" : [
          0.0
        ]
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "test1",
        "_score" : null,
        "_source" : {
          "reviews" : [
            {
              "positive_comment" : "Awesome"
            }
          ]
        },
        "sort" : [
          7.0
        ]
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "test3",
        "_score" : null,
        "_source" : {
          "reviews" : [
            {
              "positive_comment" : "What a product"
            }
          ]
        },
        "sort" : [
          14.0
        ]
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "test2",
        "_score" : null,
        "_source" : {
          "reviews" : [
            {
              "positive_comment" : "What a product.. amazing"
            }
          ]
        },
        "sort" : [
          24.0
        ]
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "test4",
        "_score" : null,
        "_source" : {
          "reviews" : [
            {
              "positive_comment" : "Thats a great product.. "
            }
          ]
        },
        "sort" : [
          24.0
        ]
      }
    ]
  }
}
  • Related