I have a narrow table containing unique key and source data
Unique_Key | System |
---|---|
1 | IT |
1 | ACCOUNTS |
1 | PAYROLL |
2 | IT |
2 | PAYROLL |
3 | IT |
4 | HR |
5 | PAYROLL |
I want to be able to pick a system as a base - in this case IT - then create a dynamic SQL query where it counts:
- distinct unique key in the chosen system
- proportion of shared unique key with other systems. These systems could be dynamic and there are lot more than 4
I'm thinking of using dynamic SQL and PIVOT to first pick out all the system names outside of IT. Then using IT as a base, join to that table to get the information.
select distinct Unique_Key, System_Name
into #staging
from dbo.data
where System_Name <> 'IT'
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX);
SET @cols = STUFF((SELECT distinct ',' QUOTENAME(System_Name)
FROM #staging
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT Unique_Key, ' @cols ' into dbo.temp from
(
select Unique_Key, System_Name
from #staging
) x
pivot
(
count(System_Name)
for System_Name in (' @cols ')
) p '
execute(@query)
select *
from
(
select distinct Unique_Key
from dbo.data
where System_Name = 'IT'
) a
left join dbo.temp b
on a.Unique_Key = b.Unique_Key
So the resulting table is:
Unique_Key | PAYROLL | ACCOUNTS | HR |
---|---|---|---|
1 | 1 | 1 | 0 |
2 | 1 | 0 | 0 |
3 | 0 | 0 | 0 |
What I want is one step further:
Distinct Count IT Key | PAYROLL | ACCOUNTS | HR |
---|---|---|---|
3 | 67% | 33% | 0% |
I can do a simple join with specific case when/sum statement but wondering if there's a way to do it dynamically so I don't need to specify every system name.
Appreciate any tips/hints.
CodePudding user response:
You can try to use dynamic SQL as below, I would use condition aggregate function get pivot value then we might add OUTER JOIN
or EXISTS condition in dynamic SQL.
I would use sp_executesql instead of exec
to avoid sql-injection.
DECLARE @System_Name NVARCHAR(50) = 'IT'
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX),
@parameter AS NVARCHAR(MAX);
SET @parameter = '@System_Name NVARCHAR(50)'
select DISTINCT System_Name
into #staging
from dbo.data t1
WHERE t1.System_Name <> @System_Name
SET @cols = STUFF((SELECT distinct ', SUM(IIF(System_Name = ''' System_Name ''',1,0)) * 100.0 / SUM(IIF(System_Name = @System_Name,0,1)) ' QUOTENAME(System_Name)
FROM #staging
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT SUM(IIF(System_Name = @System_Name,0,1)) [Distinct Count IT Key], ' @cols ' from dbo.data t1
WHERE EXISTS (
SELECT 1
FROM dbo.data tt
WHERE tt.Unique_Key = t1.Unique_Key
AND tt.System_Name = @System_Name
) '
EXECUTE sp_executesql @query, @parameter, @System_Name
CodePudding user response:
This solution changes the aggregation function of the PIVOT itself.
First, let's add a column [has_it] to #staging that keeps track of whether each Unique_Key has an IT row:
select Unique_Key, System_Name, case when exists(select 1 from data d2 where d2.Unique_Key=d1.Unique_Key and d2.System_Name='IT') then 1 else 0 end as has_it
into #staging
from data d1
where System_Name <> 'IT'
group by Unique_Key, System_Name
Now, the per-System aggregation (sum) of this column divided by the final total unique keys needed (example case=3) returns the requests numbers. Change the PIVOT to the following and it's ready as-is, without further queries:
set @query = ' select *
from
(
select System_Name,cnt as [Distinct Count IT Key],has_it*1.0/cnt as divcnt
from #staging
cross join
(
select count(distinct Unique_Key) as cnt
from dbo.data
where System_Name = ''IT''
)y
) x
pivot
(
sum(divcnt)
for System_Name in (' @cols ')
) p'
CodePudding user response:
When writing Dynamic Query, you start off with a non-dynamic query. Make sure you gets the result of the query is correct before you convert to dynamic query.
For the result that you required, the query will be
with cte as
(
select it.Unique_Key, ot.System_Name
from data it
left join data ot on it.Unique_Key = ot.Unique_Key
and ot.System_Name <> 'IT'
where it.System_Name = 'IT'
)
select [ITKey] = count(distinct Unique_Key),
[ACCOUNTS] = count(case when System_Name = 'ACCOUNTS' then Unique_Key end) * 100.0
/ count(distinct Unique_Key),
[HR] = count(case when System_Name = 'HR' then Unique_Key end) * 100.0
/ count(distinct Unique_Key),
[PAYROLL] = count(case when System_Name = 'PAYROLL' then Unique_Key end) * 100.0
/ count(distinct Unique_Key)
from cte;
Once you get the result correct, it is not that difficult to convert to dynamic query. Use string_agg() or for xml path for those repeated rows
declare @sql nvarchar(max);
; with cte as
(
select distinct System_Name
from data
where System_Name <> 'IT'
)
select @sql = string_agg(sql1 ' / ' sql2, ',' char(13))
from cte
cross apply
(
select sql1 = char(9) quotename(System_Name) ' = '
'count(case when System_Name = ''' System_Name ''' then Unique_Key end) * 100.0 ',
sql2 = 'count(distinct Unique_Key)'
) a
select @sql = 'with cte as' char(13)
'(' char(13)
' select it.Unique_Key, ot.System_Name' char(13)
' from data it' char(13)
' left join data ot on it.Unique_Key = ot.Unique_Key' char(13)
' and ot.System_Name <> ''IT''' char(13)
' where it.System_Name = ''IT''' char(13)
')' char(13)
'select [ITKey] = count(distinct Unique_Key), ' char(13)
@sql char(13)
'from cte;' char(13)
print @sql;
exec sp_executesql @sql;