I have the following index:
{
  "articles_2022" : {
    "mappings" : {
      "_source" : {
        "enabled" : false
      },
      "properties" : {
        "content" : {
          "type" : "text",
          "norms" : false
        },
        "date" : {
          "type" : "date"
        },
        "feed_canonical" : {
          "type" : "boolean"
        },
        "feed_id" : {
          "type" : "integer"
        },
        "feed_subscribers" : {
          "type" : "integer"
        },
        "language" : {
          "type" : "keyword",
          "doc_values" : false
        },
        "title" : {
          "type" : "text",
          "norms" : false
        },
        "url" : {
          "type" : "keyword",
          "doc_values" : false
        }
      }
    }
  }
}
I have a very specific one-time need: I want to extract the stored values of the url field for all documents. Is this possible with Elasticsearch 7? Thanks!
CodePudding user response:
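In your index mapping, the url field is of type keyword and has "doc_values": false, so you cannot run a terms aggregation on it.
As far as I understand your question, you only need to get the value of the url field in several documents. For that you can use an exists query and restrict the returned _source to url. Note that this relies on _source being enabled, as it is in the example index here; in an index with "_source": { "enabled": false }, such as yours, hits come back without a _source.
Here is a working example: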
Index Mapping:
PUT idx1
{
  "mappings": {
    "properties": {
      "url": {
        "type": "keyword",
        "doc_values": false
      }
    }
  }
}
Index Data:
POST idx1/_doc/1
{
  "url": "www.google.com"
}

POST idx1/_doc/2
{
  "url": "www.youtube.com"
}
Search Query:
POST idx1/_search
{
  "_source": [
    "url"
  ],
  "query": {
    "exists": {
      "field": "url"
    }
  }
}
Search Response:
"hits" : [
  {
    "_index" : "idx1",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 1.0,
    "_source" : {
      "url" : "www.google.com"
    }
  },
  {
    "_index" : "idx1",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 1.0,
    "_source" : {
      "url" : "www.youtube.com"
    }
  }
]
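If you are consuming this response from a script, pulling the values out is a short loop. The following is a minimal Python sketch, not part of the original answer; the helper name urls_from_hits and the sample_hits variable are made up for illustration, and the sample data is just the hits array shown above.

```python
from typing import Any, Dict, List


def urls_from_hits(hits: List[Dict[str, Any]]) -> List[str]:
    """Collect the url value from each hit's _source, skipping hits without one."""
    urls = []
    for hit in hits:
        # A hit has no "_source" key when _source is disabled in the mapping.
        source = hit.get("_source") or {}
        if "url" in source:
            urls.append(source["url"])
    return urls


# The "hits" array from the search response above.
sample_hits = [
    {"_index": "idx1", "_type": "_doc", "_id": "1", "_score": 1.0,
     "_source": {"url": "www.google.com"}},
    {"_index": "idx1", "_type": "_doc", "_id": "2", "_score": 1.0,
     "_source": {"url": "www.youtube.com"}},
]

print(urls_from_hits(sample_hits))
```

Hits that carry no _source are skipped rather than raising an error, so running this against an index with _source disabled simply yields an empty list.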
CodePudding user response:
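Since your mapping has "_source" : { "enabled" : false }, you can set "store": true in the mapping of the field whose value you want to extract. The store parameter has to be set when the field is first mapped, so this applies to a new index and to the documents indexed into it. For example: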
PUT indexExample2
{
  "mappings": {
    "_source": {
      "enabled": false
    },
    "properties": {
      "url": {
        "type": "keyword",
        "doc_values": false,
        "store": true
      }
    }
  }
}
Now index some data (thanks to @ESCoder for the example):
POST indexExample2/_doc/1
{
  "url": "www.google.com"
}

POST indexExample2/_doc/2
{
  "url": "www.youtube.com"
}
You can then retrieve just the stored field in your search queries, even though _source is disabled:
POST indexExample2/_search
{
  "query": {
    "exists": {
      "field": "url"
    }
  },
  "stored_fields": ["url"]
}
This will output:
"hits" : [
  {
    "_index" : "indexExample2",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 1.0,
    "fields" : {
      "url" : [
        "www.google.com"
      ]
    }
  },
  {
    "_index" : "indexExample2",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 1.0,
    "fields" : {
      "url" : [
        "www.youtube.com"
      ]
    }
  }
]