Is it possible to extract the stored value of a keyword field when _source is disabled in Elasticsea

Time:02-02

I have the following index:

    {
      "articles_2022" : {
        "mappings" : {
          "_source" : {
            "enabled" : false
          },
          "properties" : {
            "content" : {
              "type" : "text",
              "norms" : false
            },
            "date" : {
              "type" : "date"
            },
            "feed_canonical" : {
              "type" : "boolean"
            },
            "feed_id" : {
              "type" : "integer"
            },
            "feed_subscribers" : {
              "type" : "integer"
            },
            "language" : {
              "type" : "keyword",
              "doc_values" : false
            },
            "title" : {
              "type" : "text",
              "norms" : false
            },
            "url" : {
              "type" : "keyword",
              "doc_values" : false
            }
          }
        }
      }
    }

I have a very specific one-time need and I want to extract the stored values from the url field for all documents. Is this possible with Elasticsearch 7? Thanks!

CodePudding user response:

In your index mapping, the url field is a keyword with "doc_values": false, so you cannot run a terms aggregation on it.

As far as I understand your question, you only need to get the value of the url field for several documents. For that you can use an exists query. Note that the example index below keeps _source enabled (the default), which is why the response can return url from _source.

Here is a working example:

Index Mapping:

PUT idx1
{
  "mappings": {
    "properties": {
      "url": {
        "type": "keyword",
        "doc_values": false
      }
    }
  }
}

Index Data:

POST idx1/_doc/1
{
  "url":"www.google.com"
}

POST idx1/_doc/2
{
  "url":"www.youtube.com"
}

Search Query:

POST idx1/_search
{
  "_source": [
    "url"
  ],
  "query": {
    "exists": {
      "field": "url"
    }
  }
}

Search Response:

"hits" : [
      {
        "_index" : "idx1",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "url" : "www.google.com"
        }
      },
      {
        "_index" : "idx1",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "url" : "www.youtube.com"
        }
      }
    ]

CodePudding user response:

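Since your mapping has

"_source" : { "enabled" : false }

you can add "store": true to the mapping of the field whose value you want to extract. As far as I know, store cannot be changed on an existing field, so it has to be set when the index is created.

For example: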

PUT indexExample2
{
  "mappings": {
    "_source": {
      "enabled": false
    }, 
    "properties": {
      "url": {
        "type": "keyword",
        "doc_values": false,
        "store": true
      }
    }
  }
}

Now index some data (thanks to @ESCoder for the example):

POST indexExample2/_doc/1
{
  "url":"www.google.com"
}

POST indexExample2/_doc/2
{
  "url":"www.youtube.com"
}

You can then retrieve only the stored field in your search queries, even though _source is disabled:

POST indexExample2/_search
{
  "query": {
    "exists": {
      "field": "url"
    }
  },
  "stored_fields": ["url"]
} 

This returns:

"hits" : [
  {
    "_index" : "indexExample2",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 1.0,
    "fields" : {
      "url" : [
        "www.google.com"
      ]
    }
  },
  {
    "_index" : "indexExample2",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 1.0,
    "fields" : {
      "url" : [
        "www.youtube.com"
      ]
    }
  }
]
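
Since the goal is to collect the url values for all documents, here is a minimal Python sketch of how the response above could be flattened into a plain list. The helper names build_stored_fields_query and extract_stored_values are hypothetical (they are not part of any Elasticsearch client), the size of 1000 is an arbitrary assumption, and the sample response is trimmed to the keys shown above. Sending the request and paging through a large index (for example with the scroll API) is left out.

```python
from typing import Any, Dict, List


def build_stored_fields_query(field: str, size: int = 1000) -> Dict[str, Any]:
    """Build a search body that returns only the stored values of `field`.

    Hypothetical helper: it mirrors the search request shown above.
    """
    return {
        "size": size,  # assumed page size, adjust as needed
        "query": {"exists": {"field": field}},
        "stored_fields": [field],
    }


def extract_stored_values(response: Dict[str, Any], field: str) -> List[str]:
    """Flatten the per-hit `fields` arrays for `field` into one list."""
    values: List[str] = []
    for hit in response.get("hits", {}).get("hits", []):
        # A hit that lacks the stored field has no entry for it under "fields".
        values.extend(hit.get("fields", {}).get(field, []))
    return values


# Sample response shaped like the output above, trimmed to the relevant keys.
sample_response = {
    "hits": {
        "hits": [
            {"_index": "indexExample2", "_id": "1",
             "fields": {"url": ["www.google.com"]}},
            {"_index": "indexExample2", "_id": "2",
             "fields": {"url": ["www.youtube.com"]}},
        ]
    }
}

urls = extract_stored_values(sample_response, "url")
print(urls)
```

Stored field values always come back as arrays, even for a single value, which is why the helper extends the list rather than appending.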