ElasticSearch - Search middle of words over multiple fields

Time:05-03

I'm trying to retrieve documents that contain a search term anywhere inside a word, not necessarily at the start, across multiple document fields.

For example, "ell" should match a field containing "hello", and the search should cover two fields.

I initially went with MultiMatch due to this SO answer. Here was my implementation:

QueryContainer &= Query<VeganItemEstablishmentSearchDto>.MultiMatch(c => c
    .Fields(f => f.Field(p => p.VeganItem.Name).Field(v => v.VeganItem.CompanyName))
    .Query(query)
    .MaxExpansions(2)
    .Slop(2)
    .Name("named_query")
);

But I found that it would only match "hello" if my search phrase began at the start of the word; for example, "ello" would not match.

So I then changed to QueryString due to this SO answer. My implementation was:

QueryContainer &= Query<VeganItemEstablishmentSearchDto>.QueryString(c => c
    .Fields(f => f.Field(p => p.VeganItem.Name).Field(v => v.VeganItem.CompanyName))
    .Query(query)
    .FuzzyMaxExpansions(2)
    .Name("named_query")
);

But that was even worse. It didn't search multiple fields, only p.VeganItem.Name, and "ello" still did not match "hello".

How do I use Nest to search for a term that can be in the middle of a word and over multiple document fields?

CodePudding user response:

You will need to use a wildcard query for this scenario. For more information, see the Elasticsearch documentation on wildcard queries and the NEST documentation on wildcard queries.

In NEST, a wildcard query on each of the two fields looks like this:

// One wildcard query per field; each call needs exactly one closing parenthesis.
new QueryContainer[]
{
    Query<VeganItemEstablishmentSearchDto>.Wildcard(w => w
        .Field(v => v.VeganItem.CompanyName)
        .Value(query)),
    Query<VeganItemEstablishmentSearchDto>.Wildcard(w => w
        .Field(p => p.VeganItem.Name)
        .Value(query))
}

You should add an asterisk (*) at the beginning and end of your query value, e.g. "*ell*".
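
As a minimal sketch of the advice above, the helper below wraps the search term in asterisks and ORs a wildcard query on each field. It assumes NEST 7.x. The DTO classes are hypothetical shapes inferred from the property paths in the question, and MiddleOfWordSearch, ToContainsPattern and Build are illustrative names. Lowercasing the term is also an assumption: wildcard queries are not analyzed, so it only helps if the fields are indexed with an analyzer that lowercases tokens (the default standard analyzer does).

```csharp
using Nest;

// Hypothetical DTO shapes, inferred from the property paths used in the question.
public class VeganItemDto
{
    public string Name { get; set; }
    public string CompanyName { get; set; }
}

public class VeganItemEstablishmentSearchDto
{
    public VeganItemDto VeganItem { get; set; }
}

public static class MiddleOfWordSearch
{
    // Wraps the term in asterisks so it can match anywhere inside an indexed token.
    // Lowercasing assumes the index-time analyzer lowercases tokens.
    public static string ToContainsPattern(string term) =>
        $"*{term.Trim().ToLowerInvariant()}*";

    public static QueryContainer Build(string term)
    {
        var pattern = ToContainsPattern(term);

        // "||" combines the two wildcard queries so a hit on either field matches.
        return Query<VeganItemEstablishmentSearchDto>.Wildcard(w => w
                   .Field(p => p.VeganItem.Name)
                   .Value(pattern))
               || Query<VeganItemEstablishmentSearchDto>.Wildcard(w => w
                   .Field(p => p.VeganItem.CompanyName)
                   .Value(pattern));
    }
}
```

The result can be combined with the existing container in the same way as in the question, e.g. QueryContainer &= MiddleOfWordSearch.Build(query);. Note that this sketch does not escape any * or ? characters the user types, so they would be treated as wildcards.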

Keep in mind that wildcard queries are expensive, especially with a leading asterisk, and you may be able to achieve the same result by using a different analyzer in your mapping.

CodePudding user response:

Wildcard queries are expensive. If you want to control how many characters in the middle of a word can be matched, you can use the n-gram tokenizer instead, which is less expensive at query time and gives you more flexibility.

I've also written a blog post on implementing autocomplete and the trade-offs between performance and functional requirements.
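
To illustrate why n-grams make middle-of-word matching cheap, here is a toy sketch in plain C#. It is not Elasticsearch's tokenizer, only an imitation of the idea: every substring of the configured lengths is stored as its own token at index time, so a search for "ell" becomes an exact token lookup rather than a scan. The names NGramDemo and NGrams, and the gram sizes 3 and 4, are chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class NGramDemo
{
    // Returns every substring of `word` whose length is between minGram and maxGram.
    public static List<string> NGrams(string word, int minGram, int maxGram)
    {
        var grams = new List<string>();
        for (var start = 0; start < word.Length; start++)
        {
            for (var length = minGram;
                 length <= maxGram && start + length <= word.Length;
                 length++)
            {
                grams.Add(word.Substring(start, length));
            }
        }
        return grams;
    }

    public static void Main()
    {
        // Tokens that would be indexed for "hello" with gram sizes 3 to 4.
        var grams = NGrams("hello", 3, 4);
        Console.WriteLine(string.Join(", ", grams)); // hel, hell, ell, ello, llo

        // A mid-word search term is now just an exact lookup among those tokens.
        var indexed = new HashSet<string>(grams);
        Console.WriteLine(indexed.Contains("ell"));  // True
        Console.WriteLine(indexed.Contains("ello")); // True
    }
}
```

The trade-off is a larger index, since each word is stored as many tokens. Search terms shorter than the minimum gram size, or longer than the maximum, will not be found as a single token, so the gram sizes should reflect the term lengths you expect users to type.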
