Home > Blockchain >  SQL: Return all distinct substrings of a specified string in column
SQL: Return all distinct substrings of a specified string in column

Time:05-14

I am querying metadata on Snowflake which contains a column of queries:

select query_name
from infotech.log_analytics.metadata
query_name
SELECT * FROM SYSIBM.SQLCOLUMNS
SELECT * FROM SYSIBM.SYSDUMMY1
SELECT CCID, CCCODE FROM V820.ZCC ZCC JOIN V820.ZCI ZCI ON ZCC.CCID = ZCI.CCID WHERE trim(CCCODE) NOT IN ('89090','89089','89087')
SELECT * FROM V820.IIM

I want to adjust this query to return the substrings which contain distinct table names, i.e.

query_name
SYSIBM.SQLCOLUMNS
SYSIBM.SYSDUMMY1
V820.ZCC V820.ZCI
V820.IIM

I've looked around and I think a function or some combo of SUBSTRING() CHAR_INDEX() inside of a CTE could accomplish this, I tried

Select 
SUBSTRING(query_name,CHARINDEX('from', lower(query_name)),LEN(query_name)-CHARINDEX('from', lower(query_name))) query_name
from infotech.log_analytics.metadata
query_name
From SYSIBM.SQLCOLUMNS
From SYSIBM.SYSDUMMY1
From V820.ZCC ZCC JOIN V820.ZCI ZCI ON ZCC.CCID = ZCI.CCID WHERE trim(CCCODE) NOT IN ('89090','89089','89087')
V820.IIM

how can I adjust this to get what I want?

CodePudding user response:

This wouldnt work with queries involving more than one table, neither if you have tables whose names cointains space character ([Example table], for example). Anyway, following your asumptions, you want to get the words between the 'FROM ' and the first space after that (please notice space left after from). Here is one possible approach. I use a query against a subquery:

Subquery: here we get the text after 'FROM ' from your table:

SELECT 
        SUBSTRING(  
                    query_name, 
                    CHARINDEX('FROM ', UPPER(query_name))   1, 
                    ABS(LEN(query_name) - CHARINDEX('FROM ', UPPER(query_name)))
                ) AS AUXILIARFIELD
     FROM 
        infotech.log_analytics.metadata
    
    

Once we have this (should return the following):

SYSIBM.SQLCOLUMNS

SYSIBM.SYSDUMMY1 V820.ZCC ZCC JOIN V820.ZCI ZCI ON ZCC.CCID = ZCI.CCID WHERE trim(CCCODE) NOT IN ('89090','89089','89087')

V820.IIM

what you need next is to get the text from the start to the first position where there is an space character, this is easier (please notice the AUX subquery is the one explained above):

SELECT  
    SUBSTRING(AUXILIARFIELD, 1, CHARINDEX(' ', AUXILIARFIELD)) AS EXPECTEDRESULT,
    FROM
        (
        SELECT 
                SUBSTRING(  
                            query_name, 
                            CHARINDEX('FROM ', UPPER(query_name))   1, 
                            ABS(LEN(query_name) - CHARINDEX('FROM ', UPPER(query_name)))
                        ) AS AUXILIARFIELD
             FROM 
                infotech.log_analytics.metadata
        ) AS AUXILIARQUERY
    

Please have a look at it and take into account I may have missed index exception cases if no space is after the table name. I just checked it against a different table to try it and later used your table and field names.

CodePudding user response:

Try this query:

Select 
SUBSTRING(query_name,CHARINDEX('from ', lower(query_name))   5,CHARINDEX(' ', lower(query_name   ' '), CHARINDEX('from ', lower(query_name))   5) - CHARINDEX('from ', lower(query_name)) - 5) query_name
from infotech.log_analytics.metadata;

DBFiddle Demo Link

CodePudding user response:

Here is a option that will display the table after keyword join:

    Select case when CHARINDEX('JOIN',query_name)=0 
               then
                   SUBSTRING(query_name,CHARINDEX('from ', lower(query_name))   5,CHARINDEX(' ', lower(query_name   ' '), CHARINDEX('from ', lower(query_name))   5) - CHARINDEX('from ', lower(query_name)) - 5)
               else 
                   SUBSTRING(query_name,CHARINDEX('from ', lower(query_name))   5,CHARINDEX(' ', lower(query_name   ' '), CHARINDEX('from ', lower(query_name))   5) - CHARINDEX('from ', lower(query_name)) - 5) ' ' 
                   SUBSTRING(query_name,CHARINDEX('join ', lower(query_name))   5,CHARINDEX(' ', lower(query_name   ' '), CHARINDEX('join ', lower(query_name))   5) - CHARINDEX('join ', lower(query_name)) - 5)
               end
    from Table1

Result:

SYSIBM.SQLCOLUMNS
SYSIBM.SYSDUMMY1
V820.ZCC V820.ZCI
V820.IIM

dbfiddle

  • Related