Home > Enterprise >  String split more than one space for adding new column (keywords capturing)
String split more than one space for adding new column (keywords capturing)

Time:09-11

I'd like to capture keywords which are in 'Description' to know this job is what I am looking for.

I used string split to add one column from a table which is for keywords in job description.

Here's an example

Job (Table name)

ID      Code       Name                Description
1       001       Engineer     This job requires ms sql server.
2       002        CS          You will need to use python and C  .
3       003       Teacher      You must have certificated degree for this position.
4       004       Programmer   Master's degree and C   required.
5       005       Designer     Experience more than 3 years in specific retail industry.
.        .         .                              . 
.        .         .                              .

This

.
.
.

means I have more rows.

list (table name)

   key
ms sql server
Python
Certificated degree
C  
Master's degree
specific retail industry
.
.

This is what I have tried.

SELECT     ID, Code, Name, Description, String_Agg(v.Keywords, ', ') AS Keywords
FROM        Job as j
outer apply (
    select (Trim(value)) t,
        case when exists 
          (select * from list as list where lower(list.key)=lower(Trim(value))) 
        then Trim(value) end Keywords
        
    from String_Split(Description, ' ')
)v
group by J.ID, J.Code, J.Name, J.Description

Actual Output

  ID    Code   Name                Description                             Keywords
   2    002    CS            You will need to use python and C  .        Python, C  
   4    004    Programmer    Master's degree and C   required.            C  
   .     .         .                              .                    .
   .     .         .                              .                    .

If 'key' in list table has no space, it is captured well. However, it is not captured if keys have one space or more than one.

My designed output would be

ID   Code    Name                Description                                   Keywords
1    001    Engineer    This job requires Ms sql server.                       ms sql server
2    002    CS          You will need to use python and C  .                   Python, C   
3    003    Teacher     You must have certificated degree for this position.   Certificated degree
4    004    Programmer  Master's degree and C   required.                      Master's degree, C  
5    005    Designer  Experience more than 3 years in specific retail industry.specific retail industry
.        .         .                              . 
.        .         .                              .

Would someone please help me deal with keywords which have more than one space? If there is a better approach than string split by space, I would appreciate it as well!

CodePudding user response:

You can use patindex join to first find out whether a key-word exists, and then use your string_agg approach to combine them.

declare @MyTable table ([Description] nvarchar(max));
declare @MyList table ([Key] nvarchar(max));

insert into @MyTable ([Description])
values
('This job requires ms sql server.'),
('You will need to use python and C  .'),
('You must have certificated degree for this position.'),
('Master''s degree and C   required.'),
('Experience more than 3 years in specific retail industry.');

insert into @MyList ([Key])
values
('ms sql server'),
('Python'),
('Certificated degree'),
('C  '),
('Master''s degree'),
('specific retail industry');

select T.[Description], L.[Key]
from @MyTable T
inner join @MyList L on patindex('%' L.[Key] '%', T.[Description]) > -0;

Returns

Description Key
This job requires ms sql server. ms sql server
You will need to use python and C . Python
You must have certificated degree for this position. Certificated degree
You will need to use python and C . C
Master's degree and C required. C
Master's degree and C required. Master's degree
Experience more than 3 years in specific retail industry. specific retail industry
  • Related