Home > Enterprise >  Word search in Laravel: returning the nearest matches first (MySql 5.7)
Word search in Laravel: returning the nearest matches first (MySql 5.7)

Time:07-05

I am building a sort of urban dictionary for the language of my home island. I have this kind of set up in the database:

table: words table: word_definitions (has a word_id column)

I am just using Eloquent to the pattern matching in a scope function on my Word.php model.

public function scopeWhereLike(Builder $query, string $searchString): Builder
{
    $lowercasedSearchString = Str::lower($searchString);

    return $query->where('word_definitions.definition', 'like', "%{$lowercasedSearchString}%")
        ->orWhere('word_definitions.example_sentence', 'like', "%{$lowercasedSearchString}%")
        ->orWhere('words.word', 'like', "%{$lowercasedSearchString}%");
}

I am not performing any kind of ordering on it at the moment.

The problem I am trying to solve is this:

If someone searches the word "wi" (meaning "we" in the language"), the first result isn't "wi", it returns other words that contains "wi" within their definitions first:

picture of my app

Is there a recognised pattern to be able to set up some kind of rules like "order 1: exact matching words, order 2: any other partial matched words, order 3: any other word definitions

CodePudding user response:

Looking at your requirement, the best possible approach is to use MySQL Full text search as already mentioned.


PHP Approach:

You can use levenshtein approach to sort the results, however since this uses a insertion, replacement and deletion cost, it can cause varying results where you wish to have partial matches above smaller strings which have smaller conversion costs.

Custom way:

  • We would go as per your column ordering of word, definition and then example_sentence for sort order.

  • We will split the search string into token of strings by exploding them based on space.

  • We will get the match count of how many tokens were found in the current string at hand and accordingly do the order by.

  • Assuming you have the array of results in $data variable and $search having the user input, then you can try the below:

Snippet:

<?php

// sort by match similarity of word, definition, example_sentence

usort($data, function($a, $b) use ($search){
    $a_word = $a['word'];
    $b_word = $b['word'];
    $a_word_count = matchCount($a_word, $search);
    $b_word_count = matchCount($b_word, $search);
    if($a_word_count !== $b_word_count) return $b_word_count <=> $a_word_count;


    $a_definition = $a['definition'];
    $b_definition = $b['definition'];
    $a_definition_count = matchCount($a_definition, $search);
    $b_definition_count = matchCount($b_definition, $search);
    if($a_definition_count !== $b_definition_count) return $b_definition_count <=> $a_definition_count;


    $a_example_sentence = $a['example_sentence'];
    $b_example_sentence = $b['example_sentence'];
    $a_example_sentence_count = matchCount($a_example_sentence, $search);
    $b_example_sentence_count = matchCount($b_example_sentence, $search);
    return $b_example_sentence_count <=> $a_example_sentence_count;
});

function matchCount($word, $search){
    $match_count = 0;
    $word = preg_replace("#[[:punct:]]#", "", $word);
    foreach(explode(" ", $search) as $w){
        if(stripos($word, $w) !== false)    $match_count  ; 
    }
    return $match_count;
}

print_r($data);

Example Demo of what I mean to illustrate.

  • Related