There are a number of related questions on Stack Overflow, but they mostly suggest other ways to do wildcards. I'm trying to analyze an existing install, so alternatives are not useful.
I think the issue I ran into was that simple_query_string and wildcard queries do different things with infix *.
The query r*g expands to " msg:r msg:g" with simple_query_string:
GET /test/_validate/query?rewrite=true
{
  "query": {
    "simple_query_string": {
      "query": "r*g",
      "fields": ["msg"],
      "default_operator": "AND"
    }
  }
}
This returns:
{
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "valid": true,
  "explanations": [
    {
      "index": "test",
      "valid": true,
      "explanation": " msg:r msg:g"
    }
  ]
}
This shows that simple_query_string is not treating this as a wildcard at all, not even as r*. So it will not match "running", for example.
On the other hand, the wildcard query does handle an infix *:
GET /test/_validate/query?rewrite=true
{
  "query": {
    "wildcard": {
      "msg": {
        "value": "r*g",
        "case_insensitive": true
      }
    }
  }
}
This returns:
{
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "valid": true,
  "explanations": [
    {
      "index": "test",
      "valid": true,
      "explanation": """msg:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@78b3b2e7}"""
    }
  ]
}
While the automaton query could use better output, r*g as a wildcard query will match "running", but a simple_query_string will not.
So, am I correct that the same query string matches very different sets for simple_query_string vs. the wildcard query?
CodePudding user response:
Yes, you are right. simple_query_string only supports a wildcard at the end of a term. From the documentation:
"* at the end of a term signifies a prefix query"
So the infix * in your scenario is ignored, which the documentation also explains:
"The simple_query_string query does not return errors for invalid syntax. Instead, it ignores any invalid parts of the query string."
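To make the difference concrete, here is a small local sketch in Python. It does not talk to Elasticsearch. It assumes the msg field indexed the terms "running" and "fast" (hypothetical sample data), uses fnmatch as a rough stand-in for Lucene's wildcard matching, and models the rewritten simple_query_string as the two ANDed term clauses shown in the question's " msg:r msg:g" output. The function names are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical terms indexed for the msg field (assumed sample data).
indexed_terms = {"running", "fast"}

def wildcard_matches(pattern, terms):
    """Rough stand-in for a wildcard query: test the pattern against each whole term."""
    return any(fnmatchcase(term, pattern) for term in terms)

def term_clauses_match(clauses, terms):
    """Stand-in for ANDed term clauses like 'msg:r msg:g': each clause must be an indexed term."""
    return all(clause in terms for clause in clauses)

# Wildcard query with the infix pattern r*g.
wildcard_hit = wildcard_matches("r*g", indexed_terms)

# The rewrite reported for simple_query_string: two plain terms, r and g.
rewritten_hit = term_clauses_match(["r", "g"], indexed_terms)

print(wildcard_hit, rewritten_hit)
```

Under these assumptions the wildcard check finds "running", while the two-term check finds nothing because neither r nor g is an indexed term on its own.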
You can use the query_string query instead, but it is strict and validates the query syntax. From the documentation:
"You can use the query_string query to create a complex search that includes wildcard characters, searches across multiple fields, and more. While versatile, the query is strict and returns an error if the query string includes any invalid syntax."