Aggregation query fails using ElasticSearch Python client

Time:11-08

Here is an aggregation query that works as expected when I run it in the Elasticsearch Dev Tools console:

   search_query = {
      "aggs": {
        "SHAID": {
          "terms": {
            "field": "identiferid",
            "order": {
              "sort": "desc"
            },
    #         "size": 100000
          },
          "aggs": {
            "update": {
              "date_histogram": {
                "field": "endTime",
                "calendar_interval": "1d"
              },
              "aggs": {
                "update1": {
                      "sum": {
                        "script": {
                          "lang": "painless",
                          "source":"""
                              if (doc['distanceIndex.att'].size() != 0) {
                                  return doc['distanceIndex.att'].value;
                              } else {
                                  if (doc['distanceIndex.att2'].size() != 0) {
                                      return doc['distanceIndex.att2'].value;
                                  }
                                  return null;
                              }
                              """
                        }
                      }
                    },
                "update2": {
                         "sum": {
                        "script": {
                          "lang": "painless",
                          "source":"""
                              if (doc['distanceIndex.att3'].size() != 0) {
                                  return doc['distanceIndex.att3'].value;
                              } else {
                                  if (doc['distanceIndex.att4'].size() != 0) {
                                      return doc['distanceIndex.att4'].value;
                                  }
                                  return null;
                              }
                              """
                        }
                      }
                  },
              }
            },
            "sort": {
              "sum": {
                "field": "time2"
              }
            }
          }
        }
      },
    "size": 0,
      "query": {
        "bool": {
          "filter": [
            {
              "match_all": {}
            },
            {
              "range": {
                "endTime": {
                  "gte": "2021-11-01T00:00:00Z",
                  "lt": "2021-11-03T00:00:00Z"
                }
              }
            }
          ]
        }
      }
    }

When I run the same aggregation with the Python Elasticsearch client (https://elasticsearch-py.readthedocs.io/en/v7.15.1/), I get this exception:

exception search() got multiple values for keyword argument 'size'

If I remove the attribute

"size": 0,

from the query, the exception is no longer thrown, but the aggregation does not run, because "size": 0 is required for an aggregation.

Is there a different query format I should use for performing aggregations using the Python ElasticSearch client ?

Update :

Here is code used to invoke the query :

import elasticsearch
from elasticsearch import Elasticsearch, helpers

es_client = Elasticsearch(
    ["https://test-elastic.com"],
    scheme="https",
    port=443,
    http_auth=("test-user", "test-password"),
    maxsize=400,
    timeout=120,
    max_retries=10,
    retry_on_timeout=True
)

query_response = helpers.scan(
    client=es_client,
    query=search_query,
    index="test_index",
    clear_scroll=False,
    request_timeout=1500,
)

rows = []
try:
    for row in query_response:
        rows.append(row)
except Exception as e:
    print('exception', e)
        

Using es_client :

es_client.search(index="test_index", query=search_query)

results in error :

/opt/oss/conda3/lib/python3.7/site-packages/elasticsearch/connection/base.py in _raise_error(self, status_code, raw_data)
    336 
    337         raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
--> 338             status_code, error_message, additional_info
    339         )
    340 

RequestError: RequestError(400, 'parsing_exception', 'unknown query [aggs]')

Is aggs valid for search api ?

CodePudding user response:

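According to its documentation, helpers.scan is a

"Simple abstraction on top of the scroll() api - a simple iterator that yields all hits as returned by underlining scroll requests."

It is meant for iterating over large result sets and comes with a default keyword argument of size=1000, which collides with the "size": 0 in your query body. It also yields only hits, so it is the wrong tool for reading aggregation results.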

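To run an aggregation, call es_client.search() directly and pass the whole dictionary as body rather than as query. The query argument is sent as the "query" section of the request, which is why Elasticsearch answers 'unknown query [aggs]' when it finds aggs inside it. Including "size": 0 in the body is fine.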
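
A minimal sketch of that call, not tested against a live cluster. The helper names run_aggregation and flatten_buckets are hypothetical, client is assumed to be the es_client from the question (elasticsearch-py 7.x), and the bucket names SHAID, update, update1 and update2 are taken from the query above.

```python
def run_aggregation(client, index, search_body):
    """Run an aggregation-only search and return the top-level "SHAID" buckets.

    `client` is assumed to be an elasticsearch.Elasticsearch 7.x instance;
    any object with a compatible .search() method works.
    """
    # Send the whole request (query + aggs + size) as `body`, not as `query`.
    response = client.search(index=index, body=search_body)
    return response["aggregations"]["SHAID"]["buckets"]


def flatten_buckets(buckets):
    """Flatten nested terms -> date_histogram buckets into one dict per day."""
    rows = []
    for term_bucket in buckets:
        for day_bucket in term_bucket["update"]["buckets"]:
            rows.append({
                "identiferid": term_bucket["key"],
                # Fall back to the raw key if no formatted date is present.
                "day": day_bucket.get("key_as_string", day_bucket["key"]),
                "update1": day_bucket["update1"]["value"],
                "update2": day_bucket["update2"]["value"],
            })
    return rows
```

With the objects from the question this would be called as rows = flatten_buckets(run_aggregation(es_client, "test_index", search_query)). Newer client versions deprecate body in favour of individual keyword arguments; there the same request would presumably be written as es_client.search(index="test_index", query=search_query["query"], aggs=search_query["aggs"], size=0), but check that against the version you have installed.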
