I'm an Elastic beginner and I have trouble understanding how to find the most popular search terms used by my users.
Each time a user searches for something, Logstash enters a document such as this in Elastic:
{
  "_index" : "user_searches-2022.02.14",
  "_type" : "doc",
  "_id" : "xGQA-H4BVgDEPVU6QZPf",
  "_score" : 1.0,
  "_source" : {
    "message" : """[Large line in apache combined log format]""",
    "@timestamp" : "2022-02-14T11:31:13.395Z",
    "search_string": "hello world",
    "search_terms" : ["hello", "world"]
  }
},
The search_string is extracted from the URL; search_terms is the search_string split into words (only one of the two is needed, but I'm not yet certain which one).
I can't figure out what query can give me the counts of the search terms. I've had some success using "significant_text": {"field": "search_string"}, but it treats the whole string as a single term instead of splitting it into words. _termvectors, on the other hand, appears to work only on a single document, not on the entire index.
CodePudding user response:
I assume you want to count hello and world separately, and that the type of search_terms is text in your mapping. If so, and if you set fielddata to true for the search_terms field in your mapping, you can use a terms aggregation as below to get the count of each word.
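For instance, enabling fielddata on an existing index could look like this (the index name is taken from your example document; adjust it to match your index pattern):

PUT /user_searches-2022.02.14/_mapping
{
  "properties": {
    "search_terms": {
      "type": "text",
      "fielddata": true
    }
  }
}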
{
  "size": 0,
  "aggs": {
    "asd": {
      "terms": {
        "field": "search_terms",
        "size": 10
      }
    }
  }
}
Note that using fielddata=true on text fields can cause high memory usage.
If the search_terms field's type is keyword in the index mapping, you should be able to get the counts with the above query without setting fielddata.
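A minimal sketch of such a mapping (the field name follows the question; this assumes you control the index mapping or template when the index is created):

PUT /user_searches-2022.02.14
{
  "mappings": {
    "properties": {
      "search_terms": { "type": "keyword" }
    }
  }
}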
CodePudding user response:
Here's how I did it in the end, without changing anything else:
GET /user_searches-*/_search
{
  "size": 0,
  "aggs": {
    "search_term_count": {
      "terms": {
        "field": "search_terms.keyword"
      }
    }
  }
}
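This works because Elasticsearch's default dynamic mapping indexes string fields as text with a keyword sub-field, so search_terms.keyword holds each array element as a single unanalyzed term that a terms aggregation can count. The dynamically generated mapping presumably looks something like this (assumed, since the actual mapping wasn't posted):

"search_terms": {
  "type": "text",
  "fields": {
    "keyword": { "type": "keyword", "ignore_above": 256 }
  }
}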