Home > Enterprise >  SQL Server: exclude records whose name column matches values in a 2nd exclusions table
SQL Server: exclude records whose name column matches values in a 2nd exclusions table

Time:03-30

In SQL Server, I have a table that contains names of doctor's offices (Providers) that I want to return a list of. If any of these office names contain any keywords found in a 2nd table (Keywords), I want to exclude that Provider record from being returned in the list. I have a query that works fine, however my problem is...

I need to match WHOLE words in the Provider's name to the keyword, in order to exclude it. If a keyword is merely a portion of a word contained in the doctor's office name, that record shouldn't be excluded. I'm not sure how to do this.

Here's an example:

IF OBJECT_ID(N'tempdb..#Keywords') IS NOT NULL
BEGIN
     DROP TABLE #Keywords
END
GO

IF OBJECT_ID(N'tempdb..#Providers') IS NOT NULL
BEGIN
    DROP TABLE #Providers
END
GO

CREATE TABLE #Keywords (KeywordId INT, Keyword varchar(50))
CREATE TABLE #Providers (ProviderId INT, ProviderName varchar(100))

INSERT INTO #Keywords(KeywordId, Keyword) VALUES (1, 'ACUPUNCTURE')
INSERT INTO #Keywords(KeywordId, Keyword) VALUES (2, 'CHILD & ADOLESCENT PSYCHIATRY')
INSERT INTO #Keywords(KeywordId, Keyword) VALUES (3, 'Ent')
INSERT INTO #Keywords(KeywordId, Keyword) VALUES (4, 'VA')

INSERT INTO #Providers(ProviderId, ProviderName) VALUES (111,'Family Practice of Pennsylvania')
INSERT INTO #Providers(ProviderId, ProviderName) VALUES (222,'Dr. Smiths Acupuncture')
INSERT INTO #Providers(ProviderId, ProviderName) VALUES (333,'Patient First')
INSERT INTO #Providers(ProviderId, ProviderName) VALUES (444,'Chicago Child & Adolescent Psychiatry')
INSERT INTO #Providers(ProviderId, ProviderName) VALUES (555,'Manhattan Family Practice')
INSERT INTO #Providers(ProviderId, ProviderName) VALUES (666,'Dr. Levys Oncology VA')

SELECT * FROM #Keywords
select * FROM #Providers

SELECT p.* 
FROM #Providers p
WHERE NOT EXISTS (SELECT 1
                   FROM   #Keywords k
                   WHERE  p.ProviderName LIKE CONCAT('%', k.Keyword, '%'))

In the above sql, any provider's offices that contain any of those 4 values in the #Keywords table, should be excluded. Therefore, since ProviderId's 222, 444 & 666 have matching keywords, they should be excluded and I should return the following 3 records:

  • 111, Family Practice of Pennsylvania
  • 333, Patient First
  • 555, Manhattan Family Practice

My problem however, if you run that above script, you'll notice only ONE record is returned. This is because ProviderId 111 contains "VA" in "Pennsylvania" and ProviderId 333 contains "ENT" in "Patient".

How can I re-work my query to ensure my keywords match a WHOLE word from a Provider's office name rather than just a portion of one?

CodePudding user response:

You need to compare whole words only - so keywords like 'Ent' aren't matched within a word.

Does the following work for you by including a space either-side of the keyword and the same with the name:

where not exists (
  select 1 from #Keywords k
  where Concat(' ',p.ProviderName,' ') like Concat('% ', k.Keyword, ' %')
)

enter image description here

  • Related