I'm very new to the concept or Regular Expressions and am looking for a wildcard search solution that allows 2 or fewer characters of the string to be wrong/missing/blank, in Snowflake. For example, if I have a table's column of basketball players' names such as 'lebron james', 'carmelo anthony', 'kobe bryant', below are the results I would like to have matched from another table (consumers' search queries) for 'lebron james':
'lebrn james' (missing 'o')
'lebronjames' (missing a space between fn and ln)
'lebrn jme' (missing 'o' and 'a')
'lebron james' (exact match)
Would anyone be so kind to provide some guidance?
CodePudding user response:
EDITDISTANCE is what you are asking for:
with input(str) as (
select * from values
('lebrn james'), ('lebronjames'), ('lebrn jme')
), targets(str) as (
select * from values
('lebron james'), ('carmelo anthony'), ('kobe bryant')
)
select i.str, t.str, editdistance(i.str, t.str)
from input i
cross join targets t;
gives:
STR | STR_2 | EDITDISTANCE(I.STR, T.STR) |
---|---|---|
lebrn james | lebron james | 1 |
lebrn james | carmelo anthony | 14 |
lebrn james | kobe bryant | 10 |
lebronjames | lebron james | 1 |
lebronjames | carmelo anthony | 13 |
lebronjames | kobe bryant | 10 |
lebrn jme | lebron james | 3 |
lebrn jme | carmelo anthony | 13 |
lebrn jme | kobe bryant | 9 |