Aggregate dynamic columns in SQL Server

I have a narrow table containing a unique key and the source system of each row:

Unique_Key	System
1	IT
1	ACCOUNTS
1	PAYROLL
2	IT
2	PAYROLL
3	IT
4	HR
5	PAYROLL
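
To reproduce the queries in this thread, the sample rows above could be loaded with a script like the following. This is a minimal sketch: it assumes the table is called dbo.data and the columns are named Unique_Key and System_Name, as in the queries below. The column types are not stated anywhere in the thread and are guesses.

```sql
-- Assumed schema: table and column names follow the queries in this thread;
-- the data types are assumptions, not taken from the question.
CREATE TABLE dbo.data
(
    Unique_Key  INT          NOT NULL,
    System_Name NVARCHAR(50) NOT NULL
);

-- The eight sample rows from the table above.
INSERT INTO dbo.data (Unique_Key, System_Name)
VALUES (1, N'IT'), (1, N'ACCOUNTS'), (1, N'PAYROLL'),
       (2, N'IT'), (2, N'PAYROLL'),
       (3, N'IT'),
       (4, N'HR'),
       (5, N'PAYROLL');
```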

I want to pick one system as a base (IT in this case) and then build a dynamic SQL query that returns:

the number of distinct unique keys in the chosen system
the proportion of those keys that also appear in each of the other systems. The list of systems is not fixed, and there are many more than 4

My idea is to use dynamic SQL and PIVOT to first pick out all the system names other than IT, and then, using IT as the base, join to that table to get the information.

select distinct Unique_Key, System_Name
into #staging
from dbo.data
where System_Name <> 'IT'

DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX);

SET @cols = STUFF((SELECT distinct ',' + QUOTENAME(System_Name) 
            FROM #staging
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'SELECT Unique_Key, ' + @cols + ' into dbo.temp from 
            (
                select Unique_Key, System_Name
                from #staging
           ) x
            pivot 
            (
                 count(System_Name)
                for System_Name in (' + @cols + ')
            ) p '

execute(@query)

  select *
  from 
  (
    select distinct Unique_Key
  from dbo.data
  where System_Name = 'IT'
  ) a
  left join dbo.temp b
  on a.Unique_Key = b.Unique_Key

So the resulting table is:

Unique_Key	PAYROLL	ACCOUNTS
1	1	1
2	1	0
3	0	0

What I want is one step further:

Distinct Count IT Key	PAYROLL	ACCOUNTS	HR
3	67%	33%	0%

I can do this with a simple join and a CASE WHEN/SUM expression for each system, but I am wondering whether there is a way to do it dynamically so that I don't need to specify every system name.

Appreciate any tips/hints.

CodePudding user response：

You can use dynamic SQL as shown below. I would use conditional aggregation to compute the pivoted values and restrict the rows with an EXISTS condition inside the dynamic SQL (an OUTER JOIN would also work).

I would use sp_executesql with a parameter instead of exec to reduce the risk of SQL injection.

DECLARE @System_Name NVARCHAR(50) = 'IT';
DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX),
    @parameter AS NVARCHAR(MAX);
    
SET @parameter = '@System_Name NVARCHAR(50)';

-- All system names other than the base system
select DISTINCT System_Name
into #staging
from dbo.data t1
WHERE t1.System_Name <> @System_Name;

-- One percentage expression per system: rows in that system * 100 / rows in the base system.
-- Assumes each (Unique_Key, System_Name) pair appears only once in dbo.data.
SET @cols = STUFF((SELECT distinct ', SUM(IIF(System_Name = ''' + System_Name + ''',1,0))  * 100.0 / SUM(IIF(System_Name = @System_Name,1,0)) ' + QUOTENAME(System_Name) 
            FROM #staging
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'');
        
-- Only rows whose Unique_Key also exists in the base system are aggregated
set @query = 'SELECT SUM(IIF(System_Name = @System_Name,1,0)) [Distinct Count IT Key], ' + @cols + ' from  dbo.data t1
 WHERE EXISTS (
  SELECT 1 
  FROM dbo.data tt 
  WHERE tt.Unique_Key = t1.Unique_Key
  AND tt.System_Name = @System_Name
 ) ';
 
EXECUTE sp_executesql @query, @parameter,  @System_Name;

CodePudding user response：

This solution changes the aggregation function of the PIVOT itself.

First, let's add a column [has_it] to #staging that keeps track of whether each Unique_Key has an IT row:

select Unique_Key, System_Name, case when exists(select 1 from data d2 where d2.Unique_Key=d1.Unique_Key and d2.System_Name='IT') then 1 else 0 end as has_it
into #staging
from data d1
where System_Name <> 'IT'
group by Unique_Key, System_Name

Now the per-system sum of this column, divided by the total number of distinct IT keys (3 in the example), returns the requested numbers. Change the PIVOT to the following and it is ready as-is, without any further queries:

set @query = '  select * 
    from 
        (
            select System_Name,cnt as [Distinct Count IT Key],has_it*1.0/cnt as divcnt
            from #staging
            cross join
            (   
                select count(distinct Unique_Key) as cnt
                from dbo.data
                where System_Name = ''IT''
            )y
        ) x
        pivot 
        (
                sum(divcnt)
            for System_Name in (' + @cols + ')
        ) p'

CodePudding user response：

When writing a dynamic query, start with a non-dynamic query. Make sure its result is correct before you convert it to a dynamic query.

For the result you require, the query is:

with cte as
(
    select it.Unique_Key, ot.System_Name
    from   data it
           left join data ot on it.Unique_Key = ot.Unique_Key
                            and ot.System_Name <> 'IT'
    where  it.System_Name = 'IT'
)
select  [ITKey]    = count(distinct Unique_Key), 
        [ACCOUNTS] = count(case when System_Name = 'ACCOUNTS' then Unique_Key end) * 100.0 
                   / count(distinct Unique_Key),
        [HR]       = count(case when System_Name = 'HR'       then Unique_Key end) * 100.0 
                   / count(distinct Unique_Key),
        [PAYROLL]  = count(case when System_Name = 'PAYROLL'  then Unique_Key end) * 100.0 
                   / count(distinct Unique_Key)
from   cte;

Once the result is correct, converting it to a dynamic query is not that difficult. Use string_agg() (available from SQL Server 2017) or for xml path to generate the repeated lines:

declare @sql nvarchar(max);

; with cte as
(
    select distinct System_Name
    from   data
    where  System_Name <> 'IT'
)
-- cast to nvarchar(max) so that string_agg() is not limited to 4000 characters
select  @sql = string_agg(cast(sql1 + ' / ' + sql2 as nvarchar(max)), ',' + char(13))
from    cte
        cross apply
        (
            select sql1 = char(9) + quotename(System_Name) + ' = '
                        + 'count(case when System_Name = ''' + System_Name + ''' then Unique_Key end) * 100.0 ',
                   sql2 = 'count(distinct Unique_Key)'
        ) a;

select  @sql = 'with cte as' + char(13)
             + '(' + char(13)
             + '    select it.Unique_Key, ot.System_Name' + char(13)
             + '    from   data it' + char(13)
             + '           left join data ot on it.Unique_Key = ot.Unique_Key' + char(13)
             + '                            and ot.System_Name <> ''IT''' + char(13)
             + '    where  it.System_Name = ''IT''' + char(13)
             + ')' + char(13)
             + 'select [ITKey] = count(distinct Unique_Key), ' + char(13)
             + @sql + char(13)
             + 'from   cte;' + char(13);

print @sql;

exec sp_executesql @sql;
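
The answer mentions for xml path as an alternative to string_agg() but does not show it. On versions where string_agg() is not available, the list of column expressions could be built roughly as sketched below. It assumes the same data table as above. The derived-table alias s and the replace() call that doubles any single quote inside a system name are additions for illustration, not part of the original answer.

```sql
declare @sql nvarchar(max);

-- Build one "[name] = count(...) * 100.0 / count(distinct Unique_Key)"
-- expression per system other than IT, each prefixed with ',' + CR.
select @sql = stuff(
    (
        select ',' + char(13) + char(9) + quotename(System_Name) + ' = '
             + 'count(case when System_Name = '''
             + replace(System_Name, '''', '''''')   -- assumption: escape embedded quotes
             + ''' then Unique_Key end) * 100.0 '
             + '/ count(distinct Unique_Key)'
        from   (select distinct System_Name
                from   data
                where  System_Name <> 'IT') s
        order by System_Name
        for xml path(''), type
    ).value('.', 'nvarchar(max)'),
    1, 2, '');   -- remove the leading ',' + CR

print @sql;
```

The resulting @sql holds only the column list, so it still has to be wrapped in the same 'with cte as ...' text as in the string_agg() version before it is passed to sp_executesql.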
