Home > Back-end >  Elasticsearch range query date format for java not returning all the results
Elasticsearch range query date format for java not returning all the results

Time:10-05

I have an elasticsearch index that contains fields with a timestamp in the format "dd/MM/yyyy HH:mm:ss" and a customer name. I need to delete records added to the index before a certain timestamp. Using deletebyquery api for java I had the following code:

DeleteByQueryRequest request =
            new DeleteByQueryRequest(index);
    //request.setQuery(new TermQueryBuilder(customerKeywordField, customerName));
    BoolQueryBuilder query = QueryBuilders.boolQuery()
            .filter(QueryBuilders.termsQuery(customerKeywordField, customerName))
            .filter(QueryBuilders.rangeQuery("createdDate.keyword").lte(timestamp));
    request.setQuery(query);

    try {
        BulkByScrollResponse bulkResponse =
                restHighLevelClient.deleteByQuery(request, RequestOptions.DEFAULT);
        
        }
    } catch (Exception e) {
        //exception handling
    }

and this was working as intended, however now that the dates are "01/10/2021", it no longer returns records for "29/09/2021" or the like as part of results, so I assume it is taking date format as "MM/dd/yyyy" instead

I tried setting format as

QueryBuilders.rangeQuery("createdDate.keyword").lte(timestamp).format("*dd/MM/yyyy HH:mm:ss*")

but that did not work either. I have verified that the timestamp being passed is in the appropriate format and the timestamp on the record is in the correct format too, so I am at a loss. If someone can help, I would appreciate it.

EDIT: Mapping

{
  "Index_x" : {
    "mappings" : {
      "properties" : {
        //other fields
        "createdDate" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "customer" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

Sample document:

{
  "_index": "Index_x",
  "_type": "_doc",
  "_id": "1632381612786",
  "_score": 1,
  "_source": {
    "customer": "customer1",
    "createdDate": "23/09/2021 12:49:44",
    //other fields
  },
  "fields": {
    
    "customer.keyword": [
      "customer1"
    
    "createdDate": [
      "23/09/2021 12:49:44"
    ],
    "createdDate.keyword": [
      "23/09/2021 12:49:44"
    ]
    "customer": [
      "customer1"
    ]//other fields
    
  }
}

CodePudding user response:

The field createdDate.keyword is a keyword , not date as shown in your mapping (see mapping types doc):

"createdDate" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    }

You should change the field type to date.

If you are using dynamic mapping, during indexing, the string field createdDate is checked to see whether its contents match any of the date patterns in order to add it as a new date field. It seems that your format is not supported by default.

To solve this, you could customize the dynamic_date_formats to support your own date format.

Note that range queries on keyword fields are considered expensive and will not be executed by default, see docs.

  • Related