Home > other >  SQL query regex to fetch only rows that match word in string
SQL query regex to fetch only rows that match word in string

Time:12-15

I am trying to construct a query that will find a test word or phrase in a string. I tried to use regexp as in the code below but that returns all the Rows in the table and probably only displays my lack of knowledge.

I thought it would only fetch the rows with "man" in the string.

pagedata table structure

id keywords
1 "man in the middle man on moon"
2 "walking in the rain"
3 "man overboard"
4 "manage"
$teststring = "man";
$query = "SELECT * FROM pagedata where keywords regexp '(^|[[:space:]])?([[:space:]]|$)'";
$stmt = $conn->prepare($query);
$stmt->execute(["%$teststring%"]);
$results = $stmt->fetchAll();

A code example would be much appreciated.

Thank you

Milo

CodePudding user response:

If you are using MySQL 8.0, test for begin/end of a "word" with \\b, not "space".

Also, be aware that your client may replace the placeholder, ?, by a quoted string. This would require the quotes to be in the text -- not what you want. This might solve that problem:

SELECT * FROM pagedata
    WHERE keywords REGEXP CONCAT('\\b', ?, '\\b');

mysql> SELECT "man in the middle man on moon" REGEXP CONCAT('\b', 'man', '\b'); ERROR 4031 (HY000): The client was disconnected by the server because of inactivity. See wait_timeout and interactive_timeout for configuring this behavior. No connection. Trying to reconnect... Connection id: 695095 Current database: covid

 -------------------------------------------------------------------- 
| "man in the middle man on moon" REGEXP CONCAT('\\b', 'man', '\\b') |
 -------------------------------------------------------------------- 
|                                                                  1 |
 -------------------------------------------------------------------- 
mysql> SELECT "manage" REGEXP CONCAT('\\b', 'man', '\\b');
 --------------------------------------------- 
| "manage" REGEXP CONCAT('\\b', 'man', '\\b') |
 --------------------------------------------- 
|                                           0 |
 --------------------------------------------- 

(0==false, non-zero==true)

But, even better would be to use a FULLTEXT(keywords) index, plus

    WHERE MATCH(keywords) AGAINST (? IN BOOLEAN MODE)

For large tables, MATCH is much faster. But FULLTEXT has limitations.

  • Related