Home > front end >  Exclude returned data when specific words or phrases exist
Exclude returned data when specific words or phrases exist

Time:01-01

An example follows of data that is being returned.

ID CensoredWord DescriptionSnippet
1 anus anus
2 anus manuscript submitted
3 anus tetanus vaccination
4 anus oceanus proposal
5 rere prerequisite includes

The Description Snippet contains the censored word within another word or within a phrase and could be multiple sentences long.

I'd like to exclude data from being returned when the word is anus and the snippet contains the word tetanus or manuscript or oceanus and likewise with the word rere and the snippet contains prerequisite.

I've attempted various methods around WHERE

CensoredWord = 'anus' and DescriptionSnippit NOT LIKE '%tetanus%'

OR CensoredWord = 'anus' and DescriptionSnippit NOT LIKE '%manuscript%'

OR CensoredWord = 'anus' and DescriptionSnippit NOT LIKE '%oceanus%'

OR CensoredWord = 'rere' and DescriptionSnippit NOT LIKE '%prerequisite%'

But I am coming up short. What should this look like?

CodePudding user response:

You can use a regexp that searched for description_snippets that have at least one letter before or after the censored_word.

select * from test where description_snippet regexp concat("[[:alpha:]]",censored_word,"|",censored_word,"[[:alpha:]]");

http://sqlfiddle.com/#!9/a471f3/1

CodePudding user response:

Assuming that you just don't want the sentence to contain the whole censored word.

Then using a regex with wordboundaries should work in Postgresql

where description_snippet !~ concat('\m(', censored_word, ')\M') 

Test

create table test (
 ID serial primary key, 
 censored_word varchar(30),
 description_snippet varchar(30)
);

insert into test (id, censored_word, description_snippet) values
  (1, 'anus', 'anus')
, (2, 'anus', 'manuscript submitted')
, (3, 'anus', 'tetanus vaccination') 
, (4, 'anus', 'oceanus proposal')
, (5, 'rere', 'prerequisite includes')
, (6, 'rere', 'no rere without anus')
select *
from test
where description_snippet !~ concat('\m(', censored_word, ')\M') 
id censored_word description_snippet
2 anus manuscript submitted
3 anus tetanus vaccination
4 anus oceanus proposal
5 rere prerequisite includes

db<>fiddle here

  • Related