Home > database >  How to extract pattern from string in SQL Server?
How to extract pattern from string in SQL Server?

Time:12-08

I have a host information table stored in a database of SQL Server, and the table has a text column storing a string formatted like Ansible inventory. See the text below for a sample item in the text column.

host-001.servers.company.com desc='Production Web Cache' env='Prod' patch_round='Beta' dc='Main' rhel_v='7.6' primary='[email protected]' secondary='[email protected]'

I need to extract certain attributes from the text column, e.g. extract desc='Production Web Cache', and get its value Production Web Cache. I want to use regular expression in a SQL query and hope to get some pointers.

Or, if you know another way of achieving this purpose, I will also highly appreciate your hints. Let me know if you need more clarification.

CodePudding user response:

A very similar approach to John. I use a JSON splitter firstly to get the data into parts, though this puts the value with the next header. I use CHARINDEX to find the end of the value, and then use that LEFT/STUFF to get the 2 values into their respective parts. Then I use LAG to get the actual header, rather that the next values header. Finally, I remove the surrounding quotes.

This follows on the assumptions from my comment:

  1. A domain is present at the start and followed by a space.
  2. Values cannot contain =.
  3. All values are enclosed in single quotes (') and all names are not

Note I don't include the domain in the results, but the SQL should give you more than enough to work out how to add it:

DECLARE @YourString nvarchar(4000) = N'host-001.servers.company.com desc=''Production Web Cache'' env=''Prod'' patch_round=''Beta'' dc=''Main'' rhel_v=''7.6'' primary=''[email protected]'' secondary=''[email protected]''';

WITH CTE AS(
    SELECT *,
           LAG(ContentHeader) OVER (ORDER BY [Key]) AS ActualHeader
    FROM (VALUES(@YourString))V(YourString)
         CROSS APPLY(VALUES(STUFF(@YourString, 1, CHARINDEX(N' ',@YourString),N'')))S(NewString)
         CROSS APPLY OPENJSON('["'   REPLACE(NewString,'=','","')   '"]')OJ
         CROSS APPLY(VALUES(NULLIF(CHARINDEX('''',OJ.[value],2),0)))CI(I)
         CROSS APPLY(VALUES(LEFT(OJ.[Value],CI.I),STUFF(OJ.[Value],1,ISNULL(CI.I 1,0),'')))P(ContentValue,ContentHeader))
SELECT ActualHeader AS Header,
       REPLACE([ContentValue],'''','') AS [Value]
FROM CTE
WHERE ActualHeader IS NOT NULL;

db<>fiddle

CodePudding user response:

A little ugly, but using a bit of JSON (to GTD the sequence) and the window function lead() over()

Example

Declare @YourTable table (ID int,SomeCol varchar(max))
 Insert Into @YourTable values
 (1,'host-001.servers.company.com desc=''Production Web Cache'' env=''Prod'' patch_round=''Beta'' dc=''Main'' rhel_v=''7.6'' primary=''[email protected]'' secondary=''[email protected]''')

  Select A.ID
       ,Host = left(SomeCol,charindex(' ',SomeCol ' '))
       ,B.*
  From  @YourTable A
  Cross Apply (

                Select Item =  ltrim(rtrim(right(Value,charindex(' ',reverse(Value) ' '))))
                      ,Value = ltrim(rtrim(replace(
                               IsNull(lead( left(Value,nullif(len(Value) 1-charindex(' ',reverse(Value) ' '),0)),1) over (order by [Key])
                                     ,lead(right(Value,charindex(' ',reverse(Value) ' ')),1) over (order by [key])
                                     ),'''','')))
                 From  OpenJSON( '["' replace(string_escape(SomeCol,'json'),'=','","') '"]' )

              ) B
 Where B.Value is not null
 

Results

ID  Host                             Item           Value
1   host-001.servers.company.com     desc           Production Web Cache 
1   host-001.servers.company.com     env            Prod 
1   host-001.servers.company.com     patch_round    Beta 
1   host-001.servers.company.com     dc             Main 
1   host-001.servers.company.com     rhel_v         7.6 
1   host-001.servers.company.com     primary        admin@company.com 
1   host-001.servers.company.com     secondary      manager@company.com

EDIT - Injected "HOST="

Declare @YourTable table (ID int,SomeCol varchar(max))
 Insert Into @YourTable values
 (1,'host-001.servers.company.com desc=''Production Web Cache'' env=''Prod'' patch_round=''Beta'' dc=''Main'' rhel_v=''7.6'' primary=''[email protected]'' secondary=''[email protected]''')

 
 Select A.ID
       ,B.*
  From  @YourTable A
  Cross Apply (
                Select Item =  ltrim(rtrim(right(Value,charindex(' ',reverse(Value) ' '))))
                      ,Value = ltrim(rtrim(replace(
                               IsNull(lead(left(Value,nullif(len(Value) 1-charindex(' ',reverse(Value) ' '),0)),1) over (order by [Key])
                                     ,lead(right(Value,charindex(' ',reverse(Value) ' ')),1) over (order by [key])
                                     ),'''','')))
                 From  OpenJSON( '["' replace(string_escape('host=' SomeCol,'json'),'=','","') '"]' )

              ) B
 Where B.Value is not null

Results

ID  Item        Value
1   host        host-001.servers.company.com
1   desc        Production Web Cache
1   env         Prod
1   patch_round Beta
1   dc          Main
1   rhel_v      7.6
1   primary     admin@company.com
1   secondary   manager@company.com

CodePudding user response:

Ideally, your data should be stored in separate columns. But if you are going to cram it into one column, at least use a recognized format such as XML or JSON.

Given that single quotes are valid XML attribute delimiters, you can transform this into XML and use XQuery.

It's not pretty, because the hostname value is not delimited

SELECT
  v3.n.value('@host','varchar(255)'),
  v3.n.value('@desc','varchar(1000)')
FROM t
CROSS APPLY(VALUES(
    CHARINDEX(' ', t.value)
)) v1(space)
CROSS APPLY(VALUES(
    CAST(
        '<x host='''  
        CASE WHEN v1.space = 0
            THEN t.value
            ELSE LEFT(t.value, v1.space - 1)   ''''   SUBSTRING(t.value, v1.space, LEN(t.value))
            END  
        ' />'
      AS xml)
)) v2(xml)
CROSS APPLY v2.xml.nodes('x') v3(n);

db<>fiddle

  • Related