SSIS Use Derived Column to Dynamically Add Column(s) that do not Exist in Source

Time:10-23

I am trying to pull tables over from two source systems into a consolidated destination server. We have multiple business units that reside on separate source systems and, for the most part, have the same table structures. However, one of the source systems was upgraded to a newer version and has some columns that the other sources do not have.

I have the data flow tasks set up to run inside a Foreach Loop, which finds and loops through each of my sources. I need the query to change based on the source system so that it includes the missing columns. I'm hoping I can do this dynamically by using a Derived Column to add the columns when they don't exist in the source; when they do exist, I want the package to ignore the Derived Column and move on.

I've also tried writing my queries to variables, but I can't quite figure out how to have the DFT choose a specific query variable based on the source connection.

EDIT: Table comparison example below:

--Updated Source
SELECT [ProductID]
  ,[ProductNumber]
  ,[ReorderPoint]
  ,[ListPrice]
  ,[SizeUnitMeasureCode]
  ,[WeightUnitMeasureCode]
  ,[Class]
  ,[Style]  
  ,[ProductSubcategoryID]
  ,[ProductModelID]
  ,[ModifiedDate]
FROM [Sales].[Product];

--Outdated Source
SELECT [ProductID]
  ,[ProductNumber]
  ,[ReorderPoint]
  ,[ListPrice]
  ,[SizeUnitMeasureCode]
  ,[WeightUnitMeasureCode]
  ,[Class]
  ,[Style]  
  ,NULL AS [ProductSubcategoryID]
  ,NULL AS [ProductModelID]
  ,[ModifiedDate]
FROM [Sales].[Product];

I want to be able to pull all columns including those that are missing from the outdated source. The columns noted with NULL AS are the missing columns in question.

CodePudding user response:

I don't like this solution, but it might work. As I mentioned, you might be able to construct a dynamic statement and then build the query from that. This is ugly but, like I said, SSIS expects consistent definitions, so if you can't give it that, there are hoops you must jump through.

This is untested as well, but hopefully will give you the idea.

DECLARE @ColumnList table (OrdinalPosition int IDENTITY(1,1),
                           ColumnName sysname,
                           ColumnDatatype sysname);


--The following datatypes are completely guessed
INSERT INTO @ColumnList
VALUES(N'ProductID',N'int'),
      (N'ProductNumber',N'int'),
      (N'ReorderPoint',N'int'),
      (N'ListPrice',N'decimal'),
      (N'SizeUnitMeasureCode',N'decimal'); --You get the idea

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);

DECLARE @Delimiter nvarchar(20) = N',' + @CRLF + N'       ';

--Use the real column when it exists; otherwise emit a typed NULL under the same alias
SELECT @SQL = N'SELECT ' +
              STRING_AGG(ISNULL(QUOTENAME(c.[name]),N'CONVERT(' + QUOTENAME(CL.ColumnDatatype) + N',NULL)') + N' AS ' + QUOTENAME(CL.ColumnName),@Delimiter) WITHIN GROUP (ORDER BY CL.OrdinalPosition) + @CRLF +
              N'FROM Sales.Product;'
FROM @ColumnList CL
     LEFT JOIN sys.columns c
          JOIN sys.tables t ON c.object_id = t.object_id
                           AND t.[name] = N'Product'
          JOIN sys.schemas s ON t.schema_id = s.schema_id
                            AND s.[name] = N'Sales' --Must match the schema of the table being queried
                             ON CL.ColumnName = c.[name];

--PRINT @SQL; --Your best friend

EXEC sys.sp_executesql @SQL;

You should be able to use this as the definition for SSIS's source, and it should create a dataset with all the columns you want, even if the table doesn't have them.
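
One caveat worth checking: when an SSIS source (2012 and later) is given dynamic SQL, it usually cannot work out the column metadata on its own and reports that the metadata could not be determined because the statement contains dynamic SQL. Describing the result explicitly with WITH RESULT SETS is the usual way around that. The sketch below is self-contained so it can be run on its own; the hard-coded @SQL only stands in for the statement built above, and the int datatypes are guesses, as before.

```sql
-- Stand-in for the dynamically built statement; in practice @SQL comes from the code above.
DECLARE @SQL nvarchar(MAX) =
    N'SELECT CONVERT(int, NULL) AS [ProductSubcategoryID], CONVERT(int, NULL) AS [ProductModelID];';

-- WITH RESULT SETS declares the shape of the output up front, so the caller
-- does not have to infer it from the dynamic statement.
EXEC sys.sp_executesql @SQL
WITH RESULT SETS
(
    (
        ProductSubcategoryID int NULL, -- guessed datatype
        ProductModelID       int NULL  -- guessed datatype
    )
);
```

The column list in WITH RESULT SETS has to match the dynamic SELECT in number and order, and the types must be convertible, so it should be kept in step with @ColumnList.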

CodePudding user response:

The only way to do this semi-dynamically in SSIS is to use an Expression in the data flow source queries that replaces the column name with a literal NULL when connecting to systems that lack some of the columns. Alternatively, use a Script Component as the source in your data flow and fill in the missing columns in code.
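
For the Expression route, the package first needs to know whether the current source has the newer columns. A minimal sketch, assuming an Execute SQL Task inside the loop that runs against the current connection and maps the single-row result to package variables (the variable names User::HasProductSubcategoryID and User::HasProductModelID are hypothetical):

```sql
-- COL_LENGTH returns NULL when the column does not exist
-- (or when the caller has no permission to see it).
SELECT CAST(CASE WHEN COL_LENGTH(N'Sales.Product', N'ProductSubcategoryID') IS NOT NULL
                 THEN 1 ELSE 0 END AS bit) AS HasProductSubcategoryID,
       CAST(CASE WHEN COL_LENGTH(N'Sales.Product', N'ProductModelID') IS NOT NULL
                 THEN 1 ELSE 0 END AS bit) AS HasProductModelID;
```

The expression on the query variable can then use the SSIS conditional operator (condition ? value_if_true : value_if_false) to emit either [ProductSubcategoryID] or NULL AS [ProductSubcategoryID], and the source reads it through the "SQL command from variable" data access mode. The literal NULL probably still needs a cast to the destination type so that the column metadata is identical for every source.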
