Home > Mobile >  Update a column to remove stop words from another table SQL
Update a column to remove stop words from another table SQL

Time:02-24

I'm having problems with the following code and I was wondering if anyone could help me resolve this issue. I have two tables, tbl and stopwords which you can recreate with the following query :

CREATE TABLE stopwords(
  stopword char(100) NOT NULL
);
insert into stopwords values ('I');
insert into stopwords values ('in');
insert into stopwords values ('on');
insert into stopwords values ('of');
insert into stopwords values ('to');
insert into stopwords values ('a');
insert into stopwords values ('too');

CREATE TABLE tbl (
  id INT,
  message TEXT
);

INSERT INTO tbl VALUES
(1, 'in in in in of of of of on too we are on top of the world'),
(2, 'too bad the apple is rotten'),
(3, 'there is no I in team but a man thinks'),
(4, 'it matters where you go to a random');

I want to update the message column in tbh while removing each stopword from the stopwords table. However, I want to do this process iteratively, instead of manually writing out an update statement for each stopword.

I tried the following query and no errors occur but the message column does not update to anything and does not return any errors.

update tbl 
SET message = TRIM(
  REPLACE(CONCAT(' ', REPLACE(message, sw.stopword, sw.stopword), ' '), sw.stopword, '')
)
from stopwords as sw
where CONCAT(' ', message , ' ')  like CONCAT('%' ,sw.stopword , '%');

Can anyone help me with debugging this?

CodePudding user response:

You need to define a function. We start by adding a space before and after the line. We then loop through the words from stopwords with replace with space. We keep running each replace until the length after the remove is the same as before it. Finally we use TRIM to remove the spaces before and after the string.

CREATE TABLE stopwords(
  stopword char(100) NOT NULL
);
insert into stopwords values ('I');
insert into stopwords values ('in');
insert into stopwords values ('on');
insert into stopwords values ('of');
insert into stopwords values ('to');
insert into stopwords values ('a');
insert into stopwords values ('too');

CREATE TABLE tbl (
  id INT,
  message TEXT
);

INSERT INTO tbl VALUES
(1, 'in in in in of of of of on too we are on top of the world'),
(2, 'too bad the apple is rotten'),
(3, 'there is no I in team but a man thinks'),
(4, 'it matters where you go to a random');

1 rows affected

1 rows affected

1 rows affected

1 rows affected

1 rows affected

1 rows affected

1 rows affected

4 rows affected

create  function removeStop(raw_string varchar(100))
   returns varchar(100)
   language plpgsql
  as
$$
declare 
   raw_str varchar(100);
   pat varchar(10);
   l1 int; l2 int;
   tRow record;
-- variable declaration
begin
  raw_str = concat(' ',raw_string,' ');
  for tRow in
  select stopword s from stopwords
  loop
    pat = concat(' ',trim(from tRow.s),' ');
    l1=1000;l2=length(raw_str);
    while l1>l2 loop
      l1=l2;
      raw_str = replace(raw_str,pat,' ');
      l2 = length(raw_str);
    end loop;
  end loop;
  return trim(from raw_str);
  --return pat;
-- logic
end;
$$
select * from tbl;
id | message                                                  
-: | :--------------------------------------------------------
 1 | in in in in of of of of on too we are on top of the world
 2 | too bad the apple is rotten                              
 3 | there is no I in team but a man thinks                   
 4 | it matters where you go to a random                      
update tbl set message=removeStop(message);

4 rows affected

select * from tbl;
id | message                        
-: | :------------------------------
 1 | we are top the world           
 2 | bad the apple is rotten        
 3 | there is no team but man thinks
 4 | it matters where you go random 

db<>fiddle here

  •  Tags:  
  • sql
  • Related