Schema to accept array with object or string

Time:09-15

I need to store array data in Elasticsearch. The array can contain either strings or objects, and I need to accept both. I can handle each case by defining a mapping for that specific array type, but I need a generic solution that accepts strings or objects in the same array.

Object to store in Elasticsearch:

{
  "anotherData": [
    {
      "someData": {
        "testingName": [
          {
            "alternateName": "Another data",
            "areaCategory": "some data"
          }
        ]
      }
    }
  ]
}

Elasticsearch schema for storing the above data:

{
  ....
  "mappings": {
    "properties": {
      "testingName": {
        "properties": {
          "alternateName": {
            "type": "keyword"
          },
          "areaCategory": {
            "type": "keyword"
          }
        }
      },
      ....
    }
  }
}

Another example of an object to store:

{
  "anotherData": [
    {
      "someData": {
        "testingName": [
          "this is only a array string"
        ]
      }
    }
  ]
}

For the above object, the Elasticsearch schema would be:

{
  ....
  "mappings": {
    "properties": {
      "testingName": {
        "type": "keyword"
      },
      ....
    }
  }
}

I need to combine both in the schema as an "any of" condition, so that I can store either of the two objects above, where testingName is an array of strings or an array of objects. Please share an Elasticsearch schema that works for both types.

Answer:

The only way to store either objects or strings in the same field is to map it as a disabled object, using the following mapping:

  "testingName": {
    "type": "object",
    "enabled": false
  }

This means that your source documents can have a testingName field containing either a string or an object, but the big drawback is that you will not be able to query that field.

In terms of data design, it makes little sense to have such "bicephalic" fields; at the very least, ES doesn't support them. You would be better off having two different fields, one for strings and another for objects.
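As a rough sketch of that two-field alternative, the requests below use a hypothetical index name (test-split) and hypothetical field names (testingNameText and testingNameObjects). None of these names come from the question, and the indexing client is assumed to decide which field each array goes into:

```
# Hypothetical index and field names, chosen only for illustration
PUT test-split
{
  "mappings": {
    "properties": {
      "testingNameText": {
        "type": "keyword"
      },
      "testingNameObjects": {
        "properties": {
          "alternateName": {
            "type": "keyword"
          },
          "areaCategory": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

# An array of strings goes into the keyword field
PUT test-split/_doc/1
{
  "testingNameText": [
    "this is only a array string"
  ]
}

# An array of objects goes into the object field
PUT test-split/_doc/2
{
  "testingNameObjects": [
    {
      "alternateName": "Another data",
      "areaCategory": "some data"
    }
  ]
}

# Both fields remain searchable, for example by exact keyword match
GET test-split/_search
{
  "query": {
    "term": {
      "testingNameObjects.areaCategory": "some data"
    }
  }
}
```

With this layout both shapes stay queryable, at the cost of splitting the values on the client side before indexing.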

UPDATE:

Here is a full recreation of the solution:

PUT test
{
  "mappings": {
    "properties": {
      "testingName": {
        "type": "object",
        "enabled": false
      }
    }
  }
}

PUT test/_doc/1
{
  "testingName": [
    "test"
  ]
}

PUT test/_doc/2
{
  "testingName": [
    {
      "alternateName": "test",
      "areacategory": "test"
    }
  ]
}

GET test/_search

Returns

"hits" : [
  {
    "_index" : "test",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 1.0,
    "_source" : {
      "testingName" : [
        "test"
      ]
    }
  },
  {
    "_index" : "test",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 1.0,
    "_source" : {
      "testingName" : [
        {
          "alternateName" : "test",
          "areacategory" : "test"
        }
      ]
    }
  }
]
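
To illustrate the drawback mentioned above, a query against the disabled field can be tried on the same test index. This is only a sketch: because the contents of testingName are kept in _source but not parsed or indexed, a search like the following would be expected to return no hits, even though document 2 contains that value.

```
# The sub-field is never indexed because the parent object is disabled
GET test/_search
{
  "query": {
    "term": {
      "testingName.alternateName": "test"
    }
  }
}
```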