Elastic search filter documents that contain array with empty string

Time:02-11

I have documents in Elasticsearch, and I want to select only the documents whose array field is empty or contains nothing but empty strings.

#doc 1
{
  "_index": "my-index-000001",
  "_type": "_doc",
  "_id": "0",
  "_source": {
    "doc":{
        "field": ["",""]
    }
  }
}

#doc 2
{
  "_index": "my-index-000001",
  "_type": "_doc",
  "_id": "0",
  "_source": {
    "doc":{
        "field": []
    }
  }
}

#doc 3
{
  "_index": "my-index-000001",
  "_type": "_doc",
  "_id": "0",
  "_source": {
    "doc":{
        "field": ["hello",""]
    }
  }
}

From the above documents, is it possible to return only doc 1 and doc 2? For these two, "field" is either an empty array or contains only empty string(s).

CodePudding user response:

The query below returns only the documents whose array is empty or contains nothing but empty strings.

The first should clause matches documents whose array contains an empty string. The second should clause matches documents where the field does not exist, which also covers an empty array, because Elasticsearch indexes no values for it. The must_not clause with the wildcard ?* then removes every document that has at least one non-empty element in the array.

{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "city.keyword": {
              "value": ""
            }
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "exists": {
                  "field": "city.keyword"
                }
              }
            ]
          }
        }
      ],
      "must_not": [
        {
          "wildcard": {
            "city.keyword": "?*"
          }
        }
      ]
    }
  }
}
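
The query above targets the "city" field of my test index. For the documents in the question, the same query might look like the sketch below. The field path doc.field.keyword is an assumption: it presumes the default dynamic mapping, where a string field is indexed as text with a .keyword sub-field. The explicit minimum_should_match only spells out what the bool query already defaults to when it has no must or filter clause.

```json
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "doc.field.keyword": {
              "value": ""
            }
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "exists": {
                  "field": "doc.field.keyword"
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1,
      "must_not": [
        {
          "wildcard": {
            "doc.field.keyword": "?*"
          }
        }
      ]
    }
  }
}
```

If the field is mapped directly as keyword, the path would be doc.field without the .keyword suffix.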

Below are the sample documents in my index:

{
"hits" : [
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "4g3P2H4BrzeQ9ErqJwUL",
        "_score" : 1.0,
        "_source" : {
          "city" : [
            "",
            ""
          ]
        }
      },
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "4w3P2H4BrzeQ9ErqXgWT",
        "_score" : 1.0,
        "_source" : {
          "city" : [ ]
        }
      },
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "5A3P2H4BrzeQ9ErqhwUI",
        "_score" : 1.0,
        "_source" : {
          "city" : [
            "hello",
            ""
          ]
        }
      },
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "5Q3q2H4BrzeQ9ErqOAXW",
        "_score" : 1.0,
        "_source" : {
          "city" : [
            "hello",
            "sagar"
          ]
        }
      }
    ]
}

Sample output after executing the above query:

{
"hits" : [
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "4g3P2H4BrzeQ9ErqJwUL",
        "_score" : 0.5619608,
        "_source" : {
          "city" : [
            "",
            ""
          ]
        }
      },
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "4w3P2H4BrzeQ9ErqXgWT",
        "_score" : 0.0,
        "_source" : {
          "city" : [ ]
        }
      }
    ]
}
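
The two hits come back with different scores (0.5619608 and 0.0) because only the term clause contributes to scoring. If ranking does not matter, a shorter variant may be enough. A document passes the must_not wildcard only when it has no non-empty value, and in that case one of the two should clauses matches anyway, so the should array appears to be redundant for matching. This is a sketch that was not run against the sample index:

```json
{
  "query": {
    "bool": {
      "must_not": [
        {
          "wildcard": {
            "city.keyword": "?*"
          }
        }
      ]
    }
  }
}
```

With only a must_not clause, the relevance scores carry no meaning, so sort explicitly if the order of the results is important.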