Elasticsearch use filter to filter out unwanted keywords

I have some data in the following form:

import pandas as pd

df = pd.DataFrame(
    [
        ['foo', 'some text', 1, 13],
        ['foo', 'Another text', 2, 4],
        ['foo', 'Third text', 3, 10],
        ['bar', 'Text1', 2, 25],
        ['bar', 'Long text', 1, 17],
        ['num', 'short text', 3, 0],
        ['num', 'fifth text', 3, 8],
    ],
    index=range(1, 8),
    columns=['category', 'text', 'label', 'count'],
)

I've put the documents into an Elasticsearch index and am trying to search for documents whose "count" is greater than 0 and less than 10 and whose "category" is not "foo".

I tried to use a "none" clause inside the "filter" clause of a bool query, but it fails with the error "no query registered for [none]".

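import json

text = "text"
data = json.dumps({
    "query": {
        "bool": {
            "should": [
                {
                    "match": {
                        "text": text
                    }
                }
            ],
            "filter": [
                {
                    # count strictly greater than 0 and strictly less than 10
                    "range": {
                        "count": {
                            "gt": 0,
                            "lt": 10
                        }
                    }
                },
                {
                    # "none" is not a registered query type; this clause triggers the error
                    "none": {
                        "term": {
                            "category.keyword": "foo"
                        }
                    }
                }
            ]
        }
    }
})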

So I am now using the "must_not" clause as below:

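import json

text = "text"
data = json.dumps({
    "query": {
        "bool": {
            "should": [
                {
                    "match": {
                        "text": text
                    }
                }
            ],
            "filter": [
                {
                    # count strictly greater than 0 and strictly less than 10
                    "range": {
                        "count": {
                            "gt": 0,
                            "lt": 10
                        }
                    }
                }
            ],
            # exclude every document whose category is exactly "foo"
            "must_not": [
                {
                    "term": {
                        "category.keyword": "foo"
                    }
                }
            ]
        }
    }
})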

Is there a way to use "none" in the "filter" clause so that the query runs more efficiently? Thank you!

CodePudding user response:

As the documentation states, scoring is ignored for the must_not clause, so the query you are using above loses nothing in performance by having must_not outside the filter clause. The documentation describes must_not as follows:

The clause (query) must not appear in the matching documents. Clauses are executed in filter context meaning that scoring is ignored and clauses are considered for caching. Because scoring is ignored, a score of 0 for all documents is returned.
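To make this concrete, here is a minimal sketch that builds the must_not version of the query and checks, locally and without a cluster, which of the sample rows would survive the non-scoring clauses. The names build_query and passes_non_scoring_clauses are made up for illustration, and the local check is only a stand-in for what Elasticsearch does, assuming "category.keyword" holds the exact category string. Because a filter clause is present, the should clause is expected to affect only the score, not which documents match.

```python
import json

# Sample rows copied from the question: (category, text, label, count).
ROWS = [
    ("foo", "some text", 1, 13),
    ("foo", "Another text", 2, 4),
    ("foo", "Third text", 3, 10),
    ("bar", "Text1", 2, 25),
    ("bar", "Long text", 1, 17),
    ("num", "short text", 3, 0),
    ("num", "fifth text", 3, 8),
]


def build_query(text):
    """Build the bool query from the question (must_not variant)."""
    return {
        "query": {
            "bool": {
                # Scoring clause: ranks documents by how well "text" matches.
                "should": [{"match": {"text": text}}],
                # Non-scoring clause: 0 < count < 10.
                "filter": [{"range": {"count": {"gt": 0, "lt": 10}}}],
                # Non-scoring clause: drop documents whose category is "foo".
                "must_not": [{"term": {"category.keyword": "foo"}}],
            }
        }
    }


def passes_non_scoring_clauses(row):
    """Hypothetical local stand-in for the filter and must_not clauses."""
    category, _text, _label, count = row
    return 0 < count < 10 and category != "foo"


if __name__ == "__main__":
    # Request body that would be sent to the _search endpoint.
    print(json.dumps(build_query("text"), indent=2))
    # Texts of the sample rows that satisfy both non-scoring clauses.
    print([row[1] for row in ROWS if passes_non_scoring_clauses(row)])
```

On the sample data only the "fifth text" row has a count strictly between 0 and 10 and a category other than "foo".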

Apart from this, there is no "none" query. There is only the match_none query, which matches no documents at all.
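If you still prefer to keep the exclusion inside the filter list, bool queries can be nested, so a negation can be expressed as an inner bool with its own must_not. The sketch below shows that shape next to a match_none body for comparison; the helper names negated and build_filter_only_query are hypothetical. Since must_not already runs in filter context, this variant should not be expected to be faster than the top-level must_not; it is only a different way to write the same condition.

```python
import json


def negated(query):
    """Wrap a query in a bool/must_not so it can sit inside a filter list."""
    return {"bool": {"must_not": [query]}}


def build_filter_only_query(text):
    """Same conditions as the question, with the exclusion nested in filter."""
    return {
        "query": {
            "bool": {
                "should": [{"match": {"text": text}}],
                "filter": [
                    {"range": {"count": {"gt": 0, "lt": 10}}},
                    negated({"term": {"category.keyword": "foo"}}),
                ],
            }
        }
    }


# match_none takes an empty body and matches no documents.
MATCH_NONE = {"query": {"match_none": {}}}

if __name__ == "__main__":
    print(json.dumps(build_filter_only_query("text"), indent=2))
    print(json.dumps(MATCH_NONE))
```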