How to convert vertical data to horizontal dynamically without using Cursor or Pivot-CodePudding

Currently I am trying to convert vertical data structure to horizontal data dynamically by implementing a cursor in the stored procedure. However, the execution time takes too long. I am trying to find a way to optimise this approach. What is the better way to use other than Cursor? Note: PIVOT cant be used because of scattered data mapping. This is my current code with sample data and expected output:

What I have done so far: There are three temporary tables #tablestructure, #raw and #dsid. #Tablestruture contains all the columns under destfieldname which are needed to create a table. Then once it is created, we need to take the contents from #raw and insert it into the new table which has been created based on the row and column number

#tablestructure

desttablename	destfieldname	datatype
*	subjectnumber	int
sample	personame	nvarchar(20)
sample	personlocation	nvarchar(20)

#raw table

rownumber	columnnumber	fieldname	contents
1	1	subjectnumber	132516352
1	2	personname	Alex
2	1	subjectnumber	132516353
2	3	personlocation	Canada
1	3	personlocation	Australia
2	2	personname	John

#DSID Table

projectid	respondentfieldname	debug	fileincludedheaders
123	subjectnumber	NULL	1

Expected output

Subjectnumber	PersonName	Personlocation
132516352	Alex	Australia
132516353	John	Canada

declare createSQL cursor for 
select distinct 'drop table TestTmp.dbo.'   t.destname  as droptableSQL,
         'CREATE TABLE TestTmp.dbo.'   t.destname      ' ('   stuff((
            select ',['   destfieldname   '] '   f.datatype from #tableStructure f 
            where (f.desttablename in (t.desttablename, '*')) 
            group by destfieldname, f.datatype) 
    for xml path ('')),1,1,'')   ', [sequenceID] int, [subsequenceID] int)'  as createTableSQL,
'insert into VectorNormalizerTmp.dbo.'   t.desttablename   @tablesuffix   '(['   replace(replace(respondentidfieldname,']',''),'[','')   '],sequenceid, subsequenceid) 
SELECT a.contents, b.sequenceid, b.subsequenceid from (select contents from #raw where fieldname = '''   respondentidfieldname   ''') a 
CROSS JOIN (select sequenceid, subsequenceid from #rowUniques where desttablename = '''   desttablename  ''') b ' as appendEmptyRowsSQL,
'create index idx'   t.desttablename   @tablesuffix   ' on VectorNormalizerTmp.dbo.'   desttablename   @tablesuffix   '(['   respondentidfieldname   '],sequenceid, subsequenceid)' as idxSQL

from (select distinct desttablename from #tablestructure where desttablename = 'Screener') t
cross join (select respondentidfieldname, debug from #dsid) d


declare @sqltorun nvarchar(max);
declare @sqltorun2 nvarchar(max);
declare @sqltorun3 nvarchar(max);
declare @sqltorun4 nvarchar(max);

open createSQL
fetch next from createSQL into @sqltorun, @sqltorun2, @sqltorun3, @sqltorun4
while @@FETCH_STATUS = 0 
begin
    begin try
    --print @sqltorun
    EXECUTE sp_executesql @sqltorun
    end try
    begin catch
    --print '!!ERROR: '   ERROR_MESSAGE()   
    end catch
    print @sqltorun2
    EXECUTE sp_executesql @sqltorun2
    print @sqltorun3
    EXECUTE sp_executesql @sqltorun3
    --print @sqltorun4
    begin try
    EXECUTE sp_executesql @sqltorun4
    end try
    begin catch
    --print '!!ERROR: '   ERROR_MESSAGE()   
    end catch
    fetch next from createSQL into @sqltorun, @sqltorun2, @sqltorun3, @sqltorun4
end
close createSQL
deallocate createSQL

CodePudding user response：

I dont quite understand the general idea of this cursor, but you can't create and fill multiple talbes without some sort of circle, so cursor is ok. But you could accelerate it by optimizing your appendEmptyRowsSQL query. It seems like #raw table is huge, so it will be better to have some sort of fieldname dimension, which ID you could assign to #raw table and then creating clustered index on it. Also if #raw table is more than 2 millions rows on one fieldname, you may be need to insert data in a circle with batchsize about 1 million.

CodePudding user response：

Honestly, this question is unclear, I've guessed what you are after, as it was never truly explained. This creates your 2 tables, * and sample (on a hard coded schema dbo) and INSERTs the data from table raw into the respective tables. It also assumes you are on a fully supported version of SQL Server.

This is, however, a complete mess:

SELECT *
INTO dbo.tablestructure
FROM (VALUES(N'*',N'subjectnumber',N'int'),
            (N'sample',N'personname',N'nvarchar(20)'),
            (N'sample',N'personlocation',N'nvarchar(20)'))V(desttablename,destfieldname,datatype);

SELECT *
INTO dbo.raw
FROM (VALUES(1,1,N'subjectnumber',N'132516352'),
            (1,2,N'personname',N'Alex'),
            (2,1,N'subjectnumber',N'132516353'),
            (2,3,N'personlocation',N'Canada'),
            (1,3,N'personlocation',N'Australia'),
            (2,2,N'personname',N'John'))V(rownumber,columnnumber,fieldname,contents);
GO

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13)   NCHAR(10);

WITH Tables AS(
    SELECT ts.desttablename,
           N'CREATE TABLE dbo.'   QUOTENAME(ts.desttablename)   N' ('   STRING_AGG(QUOTENAME(ts.destfieldname)   N' '   QUOTENAME(V.datatype)   V.precisionscale,',') WITHIN GROUP (ORDER BY ts.destfieldname)   N');' AS CreateStatement
    FROM dbo.tablestructure ts
         --I don't trust data in a table like this, so I'm going to check it's "good"
         --This won't cause an error it's it's bad, but it'll always avoid injection
         --If you want an error, you would need to checl the validity first
         CROSS APPLY (VALUES(LEFT(ts.datatype,CHARINDEX('(',datatype '(')-1),STUFF(ts.datatype,1,CHARINDEX('(',datatype '(')-1,'')))V(datatype,precisionscale)
         JOIN sys.types t ON V.datatype = t.name --Make sure that the datatype is valid
    WHERE V.precisionscale NOT LIKE '%[^(),0-9 ]%' --Only contains parenthesis, numbers, commas and spaces
       OR V.precisionscale = '(MAX)' --Or the scale is MAX
    GROUP BY ts.desttablename)
SELECT @SQL = STRING_AGG(CreateStatement,@CRLF) 
FROM Tables T;

EXEC sys.sp_executesql @SQL; 
GO

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13)   NCHAR(10);

WITH InsertClause AS (
    SELECT ts.desttablename,
           N'INSERT INTO dbo.'   QUOTENAME(desttablename)   N' ('   STRING_AGG(QUOTENAME(destfieldname)   N'',',') WITHIN GROUP (ORDER BY destfieldname)   N')' AS InsertClause
    FROM dbo.tablestructure ts
    WHERE EXISTS (SELECT 1
                  FROM dbo.raw r
                  WHERE ts.destfieldname = r.fieldname)
    GROUP BY ts.desttablename),
RowValues AS(
    SELECT IC.InsertClause,
           N'('   STRING_AGG(N'N'''   REPLACE(r.contents,'''','''''')   N'''',',') WITHIN GROUP (ORDER BY r.fieldname)   N')' AS RowInsert
    FROM InsertClause IC
         JOIN dbo.tablestructure ts ON IC.desttablename = ts.desttablename
         JOIN dbo.raw r ON ts.destfieldname = r.fieldname
    GROUP BY IC.InsertClause,
             r.rownumber),
InsertStatements AS(
    SELECT InsertClause   @CRLF  
           N'VALUES '   STRING_AGG(RowInsert,N',')   N';' AS FullInsertStatement
    FROM RowValues
    GROUP BY InsertClause)
SELECT @SQL = STRING_AGG(FullInsertStatement,@CRLF)
FROM InsertStatements;

PRINT @SQL;

EXEC sys.sp_executesql @SQL;
GO

SELECT *
FROM dbo.[*];

SELECT *
FROm dbo.[sample];

GO

DROP TABLE dbo.tablestructure;
DROP TABLe dbo.raw;
GO

DROP TABLE IF EXISTS dbo.[*]
DROP TABLE IF EXISTS dbo.sample;

db<>fiddle