I have data from twitter where I need to find if the word 'off' is present in the tweet but drop those rows where 'off' is present in any twitter handles. These mentions can be anywhere in the entire tweet string and cases like @offersforyou or @groffers are need to be dropped if 'off' is present anywhere in the mention, ie, anywhere in string between @ and immediate next blank space.
I am using mysql version 8.0.28 (ICU regex flavor).
CodePudding user response:
Copied from CLI:
Server version: 8.0.28 MySQL Community Server - GPL
Copyright (c) 2000, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> WITH cte AS ( SELECT '@offersforyou' txt UNION ALL
-> SELECT '@groffers' UNION ALL
-> SELECT 'offers' UNION ALL
-> SELECT 'off word' )
-> SELECT txt, REGEXP_LIKE(txt, '\\boff\\b') FROM cte;
--------------- -------------------------------
| txt | REGEXP_LIKE(txt, '\\boff\\b') |
--------------- -------------------------------
| @offersforyou | 0 |
| @groffers | 0 |
| offers | 0 |
| off word | 1 |
--------------- -------------------------------
4 rows in set (0.03 sec)
CodePudding user response:
"anywhere in string between @ and immediate next blank space" -->
REGEXP "@\\w*off\\w*\\s"