Elasticsearch - search for multiple values in a list of nested values

Time:02-01

I'm trying to get the documents that match all the items in a list. The field I'm searching on is inside a list of nested objects:

Mapping of my index:

PUT testindex1
{
  "mappings": {
    "properties": {
    "patients": {
      "type": "nested",
      "properties": {
        "name": {
          "type": "keyword"
        },
        "age": {
          "type": "keyword"
        }
      }
    }
  }
  }
}

Documents

PUT testindex1/_doc/1
{ 
  "patients": [ 
    {"name" : "1", "age" : "1"},
    {"name" : "1", "age" : "2"},
    {"name" : "1", "age" : "3"}
  ] 
}

PUT testindex1/_doc/2
{ 
  "patients": [ 
    {"name" : "1", "age" : "1"},
    {"name" : "1", "age" : "2"},
    {"name" : "1", "age" : "3"}
  ] 
}

PUT testindex1/_doc/3
{ 
  "patients":[ 
    {"name" : "1", "age" : "2"},
    {"name" : "1", "age" : "5"},
    {"name" : "1", "age" : "4"}
  ] 
}

What I'm trying to get is all the documents whose patients' ages include every value in the list ["2", "1"], in this case only documents 1 and 2. I know that I can update the mapping by using this approach

But this would mean that I would have to reprocess the entire dataset.

Goal: get the documents whose patients include both ages "1" and "2" (only documents 1 and 2).

CodePudding user response:

You can use a nested query in your search request to search within the nested "patients" field.

Here's an example using a bool query with a must clause and a nested query:

GET testindex1/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "patients",
            "query": {
              "bool": {
                "should": [
                  {
                    "term": {
                      "patients.age": "1"
                    }
                  },
                  {
                    "term": {
                      "patients.age": "2"
                    }
                  }
                ],
                "minimum_should_match": 1
              }
            }
          }
        }
      ]
    }
  }
}

This query returns documents 1, 2 and 3, because every document has at least one patient with age 1 or 2. So I updated document 3 as follows.

PUT testindex1/_doc/3
{ 
  "patients":[ 
    {"name" : "1", "age" : "3"},
    {"name" : "1", "age" : "5"},
    {"name" : "1", "age" : "4"}
  ] 
}

After the update, the above query returned only 2 documents.

I hope I understand your question correctly.
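
To make the difference between "any of the ages" and "all of the ages" concrete, here is a small Python sketch that mimics the two matching rules on the sample documents from the question (document 3 in its original form). It does not call Elasticsearch; the helper names match_any and match_all are made up for illustration and only approximate what a should-based nested query and one nested clause per value would select.

```python
# Sample documents from the question, keyed by _id (document 3 before the update).
DOCS = {
    "1": [{"name": "1", "age": "1"}, {"name": "1", "age": "2"}, {"name": "1", "age": "3"}],
    "2": [{"name": "1", "age": "1"}, {"name": "1", "age": "2"}, {"name": "1", "age": "3"}],
    "3": [{"name": "1", "age": "2"}, {"name": "1", "age": "5"}, {"name": "1", "age": "4"}],
}


def match_any(patients, ages):
    """Hypothetical stand-in for a single nested query with `should`:
    the document matches if some patient has any of the ages."""
    return any(p["age"] in ages for p in patients)


def match_all(patients, ages):
    """Hypothetical stand-in for one nested clause per age:
    the document matches if, for every age, some patient has it."""
    return all(any(p["age"] == age for p in patients) for age in ages)


wanted = ["1", "2"]
any_hits = sorted(doc_id for doc_id, pts in DOCS.items() if match_any(pts, wanted))
all_hits = sorted(doc_id for doc_id, pts in DOCS.items() if match_all(pts, wanted))

print(any_hits)  # ['1', '2', '3']
print(all_hits)  # ['1', '2']
```

Document 3 shows up under the "any" rule because one of its patients has age 2, which is why the should query only gives the expected result after document 3 is changed.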

CodePudding user response:

I've found the answer here: Search a nested field for multiple values on the same field with elasticsearch

Basically, you need one nested query per value, combined in an outer bool filter so that every value has to match:

GET testindex1/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "path": "patients",
            "query": {
              "bool": {
                "filter": [
                  {
                    "match": {
                      "patients.age": "2"
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "patients",
            "query": {
              "bool": {
                "filter": [
                  {
                    "match": {
                      "patients.age": "1"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

This returns only the documents that contain both a patient with age 1 and a patient with age 2, giving the following output:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "testindex1",
        "_id" : "1",
        "_score" : 0.0,
        "_source" : {
          "patients" : [
            {
              "name" : "1",
              "age" : "1"
            },
            {
              "name" : "1",
              "age" : "2"
            },
            {
              "name" : "1",
              "age" : "3"
            }
          ]
        }
      },
      {
        "_index" : "testindex1",
        "_id" : "2",
        "_score" : 0.0,
        "_source" : {
          "patients" : [
            {
              "name" : "1",
              "age" : "1"
            },
            {
              "name" : "1",
              "age" : "2"
            },
            {
              "name" : "1",
              "age" : "3"
            }
          ]
        }
      }
    ]
  }
}
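
The query above repeats one nested clause per age, so for a longer list of values the body can be generated instead of written by hand. The following Python sketch builds the same shape of request body; build_all_values_query is a hypothetical helper, not part of any Elasticsearch client, and it uses a term query in place of match, on the assumption that the field is mapped as keyword as in the question.

```python
import json


def build_all_values_query(path, field, values):
    """Build a search body that requires every value to appear in some
    nested object under `path` (hypothetical helper, for illustration)."""
    clauses = [
        {
            "nested": {
                "path": path,
                # term is used here because the field is assumed to be a keyword
                "query": {"term": {f"{path}.{field}": value}},
            }
        }
        for value in values
    ]
    # Clauses in a bool filter must all match, which gives the AND semantics.
    return {"query": {"bool": {"filter": clauses}}}


body = build_all_values_query("patients", "age", ["1", "2"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after GET testindex1/_search in the console, or passed as the request body by whatever client is in use.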