How to retrieve elasticsearch data from index based on timestamp?

Time:11-02

I want to retrieve data from Elasticsearch based on a timestamp. The timestamp is stored in epoch_millis format, and I tried to retrieve the data like this:

{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "TimeStamp": {
              "gte": "1632844180",
              "lte": "1635436180"
            }
          }
        }
      ]
    }
  },
  "size": 10
}

But the response is this:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

How can I retrieve data for a given period of time from a certain index?

The data looks like this:


    {
        "_index" : "my-index",
        "_type" : "_doc",
        "_id" : "zWpMNXcBTeKmGB84eksSD",
        "_score" : 1.0,
        "_source" : {
          "Source" : "Market",
          "Category" : "electronics",
          "Value" : 20,
          "Price" : 45.6468,
          "Currency" : "EUR",
          "TimeStamp" : 1611506922000
        }
    }

Also, a plain _search on the index reports 10,000 hits. How can I access the remaining entries (more than 10,000 results) while still being able to choose the desired timestamp interval?

Answer:

For your first question, assume that the index has mappings like this:

{
    "mappings": {
        "properties": {
            "Source": {
                "type": "keyword"
            },
            "Category": {
                "type": "keyword"
            },
            "Value": {
                "type": "integer"
            },
            "Price": {
                "type": "float"
            },
            "Currency": {
                "type": "keyword"
            },
            "TimeStamp": {
                "type": "date"
            }
        }
    }
}

Then I indexed two sample documents (the second one is yours from above, whose timestamp falls well outside your range):

[{
    "Source": "Market",
    "Category": "electronics",
    "Value": 30,
    "Price": 55.6468,
    "Currency": "EUR",
    "TimeStamp": 1633844180000
},
{
    "Source": "Market",
    "Category": "electronics",
    "Value": 20,
    "Price": 45.6468,
    "Currency": "EUR",
    "TimeStamp": 1611506922000
}]

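The bounds in your range are epoch seconds, while the TimeStamp field stores epoch milliseconds, so the bounds are far smaller than any stored value and nothing matches. The simplest fix is to multiply both bounds by 1000 (1632844180000 and 1635436180000). If you really need to query using the second-based range above, you can instead expose the TimeStamp field in seconds (/1000) through a runtime field (available from Elasticsearch 7.11), then query on that field: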

{
    "runtime_mappings": {
        "secondTimeStamp": {
            "type": "long",
            "script": "emit(doc['TimeStamp'].value.millis/1000);"
        }
    },
    "query": {
        "bool": {
            "must": [
                {
                    "range": {
                        "secondTimeStamp": {
                            "gte": 1632844180,
                            "lte": 1635436180
                        }
                    }
                }
            ]
        }
    },
    "size": 10
}

This query returns only the first sample document.
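An alternative to the runtime field is to keep querying TimeStamp directly and convert the bounds to milliseconds on the client side. The following is a minimal sketch of that idea in Python, not a definitive implementation: the helper names `as_utc` and `build_range_query` are made up for illustration, and it assumes the field holds epoch milliseconds as in the sample documents.

```python
import json
from datetime import datetime, timezone


def as_utc(epoch_millis: int) -> datetime:
    """Interpret an epoch-milliseconds value as a UTC datetime."""
    return datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)


def build_range_query(field: str, start_seconds: int, end_seconds: int,
                      size: int = 10) -> dict:
    """Build a search body whose bounds are converted from seconds to milliseconds."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "range": {
                            field: {
                                "gte": start_seconds * 1000,
                                "lte": end_seconds * 1000,
                            }
                        }
                    }
                ]
            }
        },
        "size": size,
    }


# Same interval as in the question, given in epoch seconds.
body = build_range_query("TimeStamp", 1632844180, 1635436180)
print(json.dumps(body, indent=2))

# Read as milliseconds, the unconverted lower bound falls in 1970,
# far before any of the stored values; the converted one does not.
print(as_utc(1632844180).year, as_utc(1632844180 * 1000).year)
```

The printed body can be sent to _search as is. It needs no runtime field, so it should also work on versions older than 7.11.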

As for your second question: by default, the index.max_result_window setting is 10000, and a search cannot page past that limit with from + size. You can raise the limit by updating the index settings, but deep pages then use more memory:

PUT /index/_settings
{
   "index.max_result_window": 999999
}

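For deep pagination you should use the search_after parameter instead: sort the results, then pass the sort values of the last hit of each page as search_after in the next request.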
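A rough sketch of a search_after loop in Python follows. It is written against an injected `search` callable rather than a specific client, so `iter_hits`, `fake_search`, and the unique `seq` tiebreaker field are all assumptions made for illustration. In practice `search` would be a thin wrapper around whichever client call you use, and the tiebreaker would be a unique field from your own mapping.

```python
from typing import Any, Callable, Dict, Iterator, List

SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_hits(search: SearchFn, query: Dict[str, Any],
              sort: List[Dict[str, str]],
              page_size: int = 1000) -> Iterator[Dict[str, Any]]:
    """Yield every hit for `query`, fetching one page per call with search_after."""
    body: Dict[str, Any] = {"size": page_size, "query": query, "sort": sort}
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Resume after the sort values of the last hit on this page.
        body = {**body, "search_after": hits[-1]["sort"]}


# In-memory stand-in for a real search call, for demonstration only.
# "seq" is a hypothetical unique field used as the sort tiebreaker.
DOCS = [{"TimeStamp": 1633844180000 + i // 2, "seq": i} for i in range(25)]


def fake_search(body: Dict[str, Any]) -> Dict[str, Any]:
    bounds = body["query"]["range"]["TimeStamp"]
    rows = sorted(
        (d["TimeStamp"], d["seq"]) for d in DOCS
        if bounds["gte"] <= d["TimeStamp"] <= bounds["lte"]
    )
    after = body.get("search_after")
    if after is not None:
        # Keep only rows strictly after the previous page's last sort values.
        rows = [r for r in rows if r > tuple(after)]
    page = rows[: body["size"]]
    return {"hits": {"hits": [
        {"_source": {"TimeStamp": ts, "seq": seq}, "sort": [ts, seq]}
        for ts, seq in page
    ]}}


query = {"range": {"TimeStamp": {"gte": 1632844180000, "lte": 1635436180000}}}
sort = [{"TimeStamp": "asc"}, {"seq": "asc"}]
all_hits = list(iter_hits(fake_search, query, sort, page_size=10))
print(len(all_hits))
```

The tiebreaker matters because several documents can share the same TimeStamp. Sorting on the timestamp alone could skip or repeat documents at a page boundary.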
