How can I find the most common words in a single field of a single document in Elasticsearch? Let's say I have a document with a field "pdf_content" of type keyword containing:
"good polite nice good polite good"
I would like to get back something like:
{
  word: good,
  occurrences: 3
},
{
  word: polite,
  occurrences: 2
},
{
  word: nice,
  occurrences: 1
},
How can I do this with Elasticsearch 7.15?
I tried this in the Kibana console:
GET /pdf/_search
{
  "aggs": {
    "pdf_contents": {
      "terms": { "field": "pdf_content" }
    }
  }
}
But it only returns the list of PDFs I have indexed.
CodePudding user response:
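Have you tried term vectors? A keyword field is indexed as one single term, so a terms aggregation on it buckets whole field values across documents rather than counting the words inside one document.
With a text field that has term vectors enabled, you can do this: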
Mappings:
{
  "mappings": {
    "properties": {
      "pdf_content": {
        "type": "text",
        "term_vector": "with_positions_offsets_payloads"
      }
    }
  }
}
With your sample document:
POST /pdf/_doc/1
{
  "pdf_content": "good polite nice good polite good"
}
Then you can do:
GET /pdf/_termvectors/1
{
  "fields": ["pdf_content"],
  "offsets": false,
  "payloads": false,
  "positions": false,
  "term_statistics": false,
  "field_statistics": false
}
If you want to see the other information, set the corresponding options to true. Setting them all to false leaves only each term and its frequency in the document (term_freq), which is what you want.
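The response lists the terms keyed by word, not sorted by count, so you still need to reshape it on the client side. Here is a minimal Python sketch of that step. The function name word_counts and the sample_response dictionary are made up for illustration; the sample is hand-written and trimmed to the parts of the response that are used here, not captured from a real cluster.

```python
from typing import Any, Dict, List


def word_counts(response: Dict[str, Any], field: str) -> List[Dict[str, Any]]:
    """Turn a _termvectors response into rows sorted by frequency, highest first."""
    # Missing keys (e.g. document not found, field has no term vectors) yield [].
    terms = response.get("term_vectors", {}).get(field, {}).get("terms", {})
    rows = [
        {"word": word, "occurrences": info["term_freq"]}
        for word, info in terms.items()
    ]
    # Descending frequency, then alphabetical so ties come out in a stable order.
    return sorted(rows, key=lambda row: (-row["occurrences"], row["word"]))


# Hand-written sample for the document indexed above (assumption: the default
# analyzer, which splits on whitespace and lowercases).
sample_response = {
    "_index": "pdf",
    "_id": "1",
    "found": True,
    "term_vectors": {
        "pdf_content": {
            "terms": {
                "good": {"term_freq": 3},
                "nice": {"term_freq": 1},
                "polite": {"term_freq": 2},
            }
        }
    },
}

result = word_counts(sample_response, "pdf_content")
for row in result:
    print(row)
```

Note that the words you get back are the analyzed terms, so with a different analyzer (stemming, stop words) they may not match the raw text exactly.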