Home > Enterprise >  Parse certain string values surrounded by brackets
Parse certain string values surrounded by brackets

Time:12-15

I'm trying to extract certain string values surrounded by special characters like brackets and create new columns using the extracted string values. For example, the original column looks like this.

         Column
Dec12 SKLEKF [[ABC]] <<JK>>
Jan13 KEJ [[FJ]] <<UHJ>>
FEB12 JKEJSKEJG [[GBC]] <<JUIDJ>>

I'm trying to create two new columns using the string values within the brackets. The new data table should look like this.

Code     ID
ABC      JK
FJ       UHJ
GBC      JUIDJ

I tried the following code but it is not working because the length of the string values that come before the brackets and within the brackets varies by row. Is there any way to parse the string values surrounded by the brackets regardless of their length?

select SUBSTRING(Column, 13, 3) AS Code
from table

Thank you.

CodePudding user response:

Using SUBSTRING operations:

SELECT
    col,
    SUBSTRING(col, CHARINDEX('[[', col)   2,
              CHARINDEX(']]', col) - CHARINDEX('[[', col) - 2) AS Code,
    SUBSTRING(col, CHARINDEX('<<', col)   2,
              CHARINDEX('>>', col) - CHARINDEX('<<', col) - 2) AS ID
FROM yourTable;

CodePudding user response:

Another method by using XML and XQuery.

The idea is to tokenize a string of tokens, and get individual tokens based on their position: [last()] and [last()-1].

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (seq INT identity PRIMARY KEY, tokens VARCHAR(100));
INSERT INTO @tbl (tokens) VALUES
('Dec12 SKLEKF [[ABC]] <<JK>>'),
('Jan13 KEJ [[FJ]] <<UHJ>>'),
('FEB12 JKEJSKEJG [[GBC]] <<JUIDJ>>');
-- DDL and sample data population, end

SELECT t.*
    , TRIM('[]' FROM c.value('(/root/r[last()-1]/text())[1]', 'VARCHAR(20)')) AS Code
    , TRIM('<>' FROM c.value('(/root/r[last()]/text())[1]', 'VARCHAR(20)')) AS ID
FROM @tbl AS t
    CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA['   
      REPLACE(tokens, SPACE(1), ']]></r><r><![CDATA[')   
      ']]></r></root>' AS XML)) AS t1(c);

Output

 ----- ----------------------------------- ------ ------- 
| seq |              tokens               | Code |  ID   |
 ----- ----------------------------------- ------ ------- 
|   1 | Dec12 SKLEKF [[ABC]] <<JK>>       | ABC  | JK    |
|   2 | Jan13 KEJ [[FJ]] <<UHJ>>          | FJ   | UHJ   |
|   3 | FEB12 JKEJSKEJG [[GBC]] <<JUIDJ>> | GBC  | JUIDJ |
 ----- ----------------------------------- ------ ------- 
  • Related