Home > Back-end >  Parsing through free text field in SQL
Parsing through free text field in SQL

Time:01-19

I have a field in my table that has different types of data in it which can be translated into different columns or rows.

Free Text Field TypeID
{Visit Info}[<Visit Date>2022-01-01</Visit Date><Visit Type>Clinical</Visit Type>]{/Visit Info}{Costs}[<Laboratory>30.91</Laboratory><Encounter>15.00</Encounter>]{/Costs} 1
{Index Events}[<Date>2022-03-04</Date><Diagnosis>I10</Diagnosis>]{/Index Events} 2
{Visit Info}[<Visit Date>2022-10-12</Visit Date><Visit Type>Administrative</Visit Type>]{/Visit Info}{Costs}[<Consultation>25.00</Consultation>]{/Costs} 1

The idea is that the data is enclosed in {Category}, <Subcategory> and then the data related to the <Subcategory>. I need to write a query that returns the data like in the following result set:

TypeID Category Subcategory Result
1 Visit Info Visit Date 2022-01-01
1 Visit Info Visit Type Clinical
1 Costs Laboratory 30.91
1 Costs Encounter 15.00
2 Index Events Date 2022-03-04
2 Index Events Diagnosis I10
1 Visit Info Visit Date 2022-10-12
1 Visit Info Visit Type Administrative
1 Costs Consultation 25.00

I'd appreciate it if I could be pointed to the right direction in terms of what functions to use.

I'm using SSMS on SQL Server 2016

CodePudding user response:

If your data looked like this then it would be valid XML and would be easy to parse with the XML parser

Free Text Field TypeID
<Visit Info><Visit Date>2022-01-01</Visit Date><Visit Type>Clinical</Visit Type></Visit Info><Costs><Laboratory>30.91</Laboratory><Encounter>15.00</Encounter></Costs> 1
<Index Events><Date>2022-03-04</Date><Diagnosis>I10</Diagnosis></Index Events> 2
<Visit Info><Visit Date>2022-10-12</Visit Date><Visit Type>Administrative</Visit Type></Visit Info><Costs><Consultation>25.00</Consultation></Costs> 1

CodePudding user response:

You can do a lot with substring, like:

WITH cte AS (
   SELECt 'Visit Info' as category, 'Visit Date' as subcategory  UNION
   SELECt 'Costs' as category, 'Laboratory' as subcategory 
  )
SELECT 
   typeid,
   category,
   subcategory,
   SUBSTRING(ss,y1 len(subcategory) 2,y2-(y1 len(subcategory) 2)) as value
FROM (  
   SELECT 
      typeid,
      category,
      subcategory,
      SUBSTRING(free,x1 len(category) 2,x2-(x1 len(category) 2)) as ss,
      CHARINDEX(CONCAT('<',subcategory,'>'),SUBSTRING(free,x1 (len(category) 2),x2-x1 len(category) 2 1)) as y1,
      CHARINDEX(CONCAT('</',subcategory,'>'),SUBSTRING(free,x1 (len(category) 2),x2-x1 len(category) 2 1)) as y2
   FROM (
      SELECT 
         typeid,
         category,
         free,
         subcategory,
         CHARINDEX(CONCAT('{',category,'}'),free) as x1,
         CHARINDEX(CONCAT('{/',category,'}'),free) as x2
      FROM test
      CROSS APPLY cte )x
   WHERE x1<>0 and x2>x1
) y
WHERE y1<>0 and y2>y1 ;

The other category and subcategory's can be done in the same way

Output of this:

typeid category subcategory value
1 Costs Laboratory 30.91
1 Visit Info Visit Date 2022-01-01
1 Visit Info Visit Date 2022-10-12

see: DBFIDDLE

  • Related