I need to retrieve all distinct values of a certain field in my Elasticsearch index. When I run a terms aggregation in the Elasticsearch dev console, I get the results as expected:
GET index/_search
{
"aggs" : {
"All_IDs" : {
"terms" : { "field" : "ID", "size":10000 }
}
},
"size" : 0
}
response:
"aggregations" : {
"All_IDs" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "XX05215",
"doc_count" : 4560
},
{
"key" : "XX05216",
"doc_count" : 3364
},
{
"key" : "E1004903",
"doc_count" : 2369
}....
That's good! But when I use the Elasticsearch client in Python, the response contains the aggregation, but I also get flooded with documents from the entire index, which is too much overhead:
from elasticsearch import Elasticsearch, RequestsHttpConnection

# host and awsauth are defined elsewhere
es = Elasticsearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)
query = {
"aggs" : {
"All_IDs" : {
"terms" : { "field" : "ID", "size":10000 }
}
},
"size" : 0
}
response = es.search( index='index', body=query, size=9999 )
How can I run the query in Python the same way as in the console and retrieve only the desired IDs?
Answer:
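The issue is the size parameter passed as a keyword argument to es.search, as in the call below. It is sent as a URL parameter and takes precedence over the "size": 0 in the query body, so up to 9999 hits are returned alongside the aggregation.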
es.search( index='index', body=query, size=9999 )
Once it is removed, the search uses the size parameter from the query body ("size": 0), so the response contains only the aggregation and no hits.
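
As a minimal sketch of the corrected call, assuming the same body= style client as in the question: the helper names build_ids_query and fetch_all_ids are hypothetical, introduced here only for illustration, and es is expected to be an already configured client like the one created above. The point is that no size keyword is passed to es.search, so the "size": 0 in the body applies and only the aggregation buckets come back.

```python
def build_ids_query(field="ID", max_buckets=10000):
    """Build a terms aggregation that returns no hits, only buckets."""
    return {
        "aggs": {
            "All_IDs": {
                "terms": {"field": field, "size": max_buckets}
            }
        },
        # size 0 in the body suppresses the document hits
        "size": 0,
    }


def fetch_all_ids(es, index="index", field="ID"):
    """Return the bucket keys of the All_IDs aggregation.

    `es` is assumed to be a client exposing search(index=..., body=...),
    as in the question. No size= keyword is passed, so the body's
    "size": 0 is what the server sees.
    """
    response = es.search(index=index, body=build_ids_query(field))
    buckets = response["aggregations"]["All_IDs"]["buckets"]
    return [bucket["key"] for bucket in buckets]
```

Calling fetch_all_ids(es) with the client from the question should then give a plain list of the ID values. Note that the terms aggregation still caps the result at its own size (10000 here), which is separate from the top-level size that controls hits.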