I have two tables as follows :
CREATE TABLE keyword_tbl
(
WORDS VARCHAR(100),
TOPIC VARCHAR(100)
);
INSERT INTO keyword_tbl
VALUES ('leaf', 'nature'), ('leaves', 'nature'),
('wind', 'nature'), ('knife', 'utensils'),
('knives', 'utensils'), ('calf', 'animal'),
('calves', 'animal')
CREATE TABLE content
(
CONTENT_ID VARCHAR(100),
DESCRIPTION VARCHAR(100)
);
INSERT INTO content
VALUES ('uuid1', 'leaves fall in autumn like leafs'),
('uuid2', 'the calf is playing in the leaf, the knife' ),
('uuid3', 'knives cutting the wind'),
('uuid4', 'he says hi'),
('uuid5', 'the calves running through the wind')
I want to be able to count the occurrences of each word per topic. My ideal output would look as follows.
content_id | description | nature | utensils | animal |
---|---|---|---|---|
uuid1 | leaves fall in autumn like leafs | 2 | 0 | 0 |
uuid2 | the calf is playing in the leaf, the knife | 1 | 1 | 1 |
uuid3 | knives cutting the wind | 1 | 1 | 0 |
uuid4 | he says hi | 0 | 0 | 0 |
uuid5 | the calves running through the wind | 1 | 0 | 1 |
------------ | --------------------------------------------- | -------- | ---------- | -------- |
Explanation :
- For uuid1, we count
leaves
andleaf
hence nature has a value of 2, - For uuid2, we count
calf
,leaf
,knife
hence nature, utensils and animal have a count of 1, etc...
Is there a way for this to be done autonomously?
CodePudding user response:
Create Split
Function like this
CREATE FUNCTION [dbo].[Split]
(
@String varchar(8000), @Delimiter char(1)
)
returns @temptable TABLE (items varchar(8000))
as
begin
declare @idx int
declare @slice varchar(8000)
select @idx = 1
if len(@String)<1 or @String is null return
while @idx!= 0
begin
set @idx = charindex(@Delimiter,@String)
if @idx!=0
set @slice = left(@String,@idx - 1)
else
set @slice = @String
if(len(@slice)>0)
insert into @temptable(Items) values(@slice)
set @String = right(@String,len(@String) - @idx)
if len(@String) = 0 break
end
return
end
Use To get string count from your table
select CONTENT_ID,DESCRIPTION,
(select COUNT(1) from keyword_tbl where WORDS in (select items from Split(DESCRIPTION,' ')) AND TOPIC = 'nature') as animal,
(select COUNT(1) from keyword_tbl where WORDS in (select items from Split(DESCRIPTION,' ')) AND TOPIC = 'utensils') as nature,
(select COUNT(1) from keyword_tbl where WORDS in (select items from Split(DESCRIPTION,' ')) AND TOPIC = 'animal') as utensils from content
here is string split from ' '
your string in leafs
and leaf
are different that is not count.
CodePudding user response:
To handle "leaf", "leafs" the join condition needs to be altered:
-- substring
ON TRIM(s.value) ILIKE k.words|| '%'
-- only 's'
ON TRIM(s.value) ILIKE ANY (k.words, k.words|| 's')
Output: