I am building a sort of urban dictionary for the language of my home island. I have this kind of set up in the database:
table: words table: word_definitions (has a word_id column)
I am just using Eloquent to the pattern matching in a scope function on my Word.php model.
public function scopeWhereLike(Builder $query, string $searchString): Builder
{
$lowercasedSearchString = Str::lower($searchString);
return $query->where('word_definitions.definition', 'like', "%{$lowercasedSearchString}%")
->orWhere('word_definitions.example_sentence', 'like', "%{$lowercasedSearchString}%")
->orWhere('words.word', 'like', "%{$lowercasedSearchString}%");
}
I am not performing any kind of ordering on it at the moment.
The problem I am trying to solve is this:
If someone searches the word "wi" (meaning "we" in the language"), the first result isn't "wi", it returns other words that contains "wi" within their definitions first:
Is there a recognised pattern to be able to set up some kind of rules like "order 1: exact matching words, order 2: any other partial matched words, order 3: any other word definitions
CodePudding user response:
Looking at your requirement, the best possible approach is to use MySQL Full text search
as already mentioned.
PHP Approach:
You can use levenshtein
approach to sort the results, however since this uses a insertion, replacement and deletion cost, it can cause varying results where you wish to have partial matches above smaller strings which have smaller conversion costs.
Custom way:
We would go as per your column ordering of
word
,definition
and thenexample_sentence
for sort order.We will split the search string into token of strings by exploding them based on space.
We will get the match count of how many tokens were found in the current string at hand and accordingly do the order by.
Assuming you have the array of results in
$data
variable and$search
having the user input, then you can try the below:
Snippet:
<?php
// sort by match similarity of word, definition, example_sentence
usort($data, function($a, $b) use ($search){
$a_word = $a['word'];
$b_word = $b['word'];
$a_word_count = matchCount($a_word, $search);
$b_word_count = matchCount($b_word, $search);
if($a_word_count !== $b_word_count) return $b_word_count <=> $a_word_count;
$a_definition = $a['definition'];
$b_definition = $b['definition'];
$a_definition_count = matchCount($a_definition, $search);
$b_definition_count = matchCount($b_definition, $search);
if($a_definition_count !== $b_definition_count) return $b_definition_count <=> $a_definition_count;
$a_example_sentence = $a['example_sentence'];
$b_example_sentence = $b['example_sentence'];
$a_example_sentence_count = matchCount($a_example_sentence, $search);
$b_example_sentence_count = matchCount($b_example_sentence, $search);
return $b_example_sentence_count <=> $a_example_sentence_count;
});
function matchCount($word, $search){
$match_count = 0;
$word = preg_replace("#[[:punct:]]#", "", $word);
foreach(explode(" ", $search) as $w){
if(stripos($word, $w) !== false) $match_count ;
}
return $match_count;
}
print_r($data);
Example Demo
of what I mean to illustrate.