Search for flattened field existence in ElasticSearch

Time:10-13

I'm storing an arbitrary nested object as a flattened field "_meta" which contains various information related to a product. Here is the mapping for that field:

"mappings": {
        "dynamic": "strict",
        "properties": {
            "_meta": {
                "type": "flattened"
            },
            ...

So when trying to search for:

{
    "query": {
        "exists": {
            "field": "_meta.user"
        }
    }
}

I'm expecting to retrieve all documents that have that field populated. I get zero hits, although if I search for a particular document, I can see that at least one document has that field populated:

"user": {
  "origin_title": "some title",
  "origin_title_en": "some other title",
  "address": "some address",
  "performed_orders_count": 0,
  "phone": "some phone",
  "name": "some name",
  "tariff": null,
  "proposal_image_background_color": null
},

So how exactly does searching through a flattened field work? Why am I not getting any results?

CodePudding user response:

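TL;DR

It is because of the way flattened fields work: only leaf values are indexed, so an object such as "_meta.user" has no entry of its own for an exists query to match.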

In your case:

{
    "_meta":{
        "user": {
            "name": "some name"
        }
    }
}

The representations Elasticsearch makes available for searching are:

{
    "_meta": ["some name"],
    "_meta.user.name": "some name"
}
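
One rough way to picture this is a small Python model of the flattening. It is only a sketch of the idea, not Elasticsearch's actual implementation; the helper name flatten_leaves is hypothetical, and leaf values are simply turned into strings with str() as a simplification:

```python
def flatten_leaves(field, obj):
    """Toy model of a flattened field: return (root_values, keyed_values)."""
    root_values = []   # what a query on the root field ("_meta") can match
    keyed_values = {}  # what a query on a keyed path ("_meta.user.name") can match

    def walk(path, value):
        if value is None:
            return  # null leaves are treated as missing (no null_value configured)
        if isinstance(value, dict):
            for key, child in value.items():
                walk(f"{path}.{key}", child)
        elif isinstance(value, list):
            for item in value:
                walk(path, item)
        else:
            text = str(value)  # simplification: leaves are kept as plain strings
            root_values.append(text)
            keyed_values.setdefault(path, []).append(text)

    walk(field, obj)
    return root_values, keyed_values


root, keyed = flatten_leaves("_meta", {"user": {"name": "some name"}})
print(root)
print(keyed)
```

In this model there is an entry for "_meta.user.name" but none for "_meta.user", which is why a query on the intermediate object finds nothing.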

To reproduce

For the setup:

PUT /74025685/
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "_meta":{
        "type": "flattened"
      }
    }
  }
}

POST /_bulk
{"index":{"_index":"74025685"}}
{"_meta":{"user": "some user"}}
{"index":{"_index":"74025685"}}
{"_meta":{"user": null, "age": 10}}
{"index":{"_index":"74025685"}}
{"_meta":{"user": ""}}
{"index":{"_index":"74025685"}}
{"_meta":{"user": {"username": "some user"}}}

This query on the root field is going to find 2 records, the first and the last, because both contain the leaf value "some user":

GET 74025685/_search
{
  "query": {
    "term": {
      "_meta": {
        "value": "some user"
      }
    }
  }
}

This one is only going to match the first document:

GET 74025685/_search
{
  "query": {
    "term": {
      "_meta.user": {
        "value": "some user"
      }
    }
  }
}

The same applies to the exists query:

This one will only return the last doc.

GET 74025685/_search
{
  "query": {
    "exists": {
      "field": "_meta.user.username"
    }
  }
}

Whereas this one is going to return the 1st and 3rd:

GET 74025685/_search
{
  "query": {
    "exists": {
      "field": "_meta.user"
    }
  }
}
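
For the document in the question, where "_meta.user" is an object, one option is to test its leaf keys instead. Below is a minimal sketch in Python, assuming the leaf keys are known in advance. The helper name exists_any_leaf is hypothetical, and the paths are taken from the sample document in the question. It only builds the request body and does not call any client:

```python
import json


def exists_any_leaf(leaf_paths):
    """Build a search body matching documents where at least one leaf path exists."""
    return {
        "query": {
            "bool": {
                # One exists clause per known leaf key under the object.
                "should": [{"exists": {"field": path}} for path in leaf_paths],
                "minimum_should_match": 1,
            }
        }
    }


# Leaf keys assumed from the sample "user" object in the question.
body = exists_any_leaf(["_meta.user.name", "_meta.user.phone", "_meta.user.address"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. Keys whose value is null in a document (such as "tariff" in the sample) are not indexed, so they cannot make that document match.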