I need to be able to PIVOT / map a text string (Table1.TextString column) in the following table:
CardType | CardCode | TextString | ColumnName | MaxLengthOfString |
---|---|---|---|---|
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Type | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Class | 4 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | PackSize | 8 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | InvPack | 3 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Fill | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | CalcAmount | 8 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | CostCode | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | UomCode | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Uom | 5 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Fill1 | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | FoodStamp | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Fill2 | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Tax | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Fill3 | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Disc | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Fill4 | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | NoStock | 2 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Fill5 | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | CalcPack2 | 7 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | InvFlag | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Fill6 | 1 |
A | 2 | I101G2.2 OZ 001 0002200 L Y 0010000 | Fill7 | 99 |
Editor note: The actual TextString value above is as follows with length = 70 (with reference rule):
I101G2.2 OZ 001 0002200 L Y 0010000
---- ----1---- ----2---- ----3---- ----4---- ----5---- ----6---- ----7
to create an output like this using Table1.ColumnName and
Table1.MaxLengthOfString columns:
Type | Class | PackSize | InvPack | CalcAmount | CostCode | UomCode | Uom | FoodStamp | Tax | Disc | NoStock | CalcPack2 | InvFlag |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
I | 101G | 2.2 OZ | 001 | 0002200 | L | Y | 0010000 |
So, the 'Type' column in Table2 is extracted from Table1.TextString column starting from the 1st character and is 1 character in length (Table1.MaxLengthOfString). Table2.Class column starts at the 2nd position (1st position 1 character in length) and is 4 characters in length. The Table2.PackSize starts at 6th position (prior position of 2 prior length of 4) and is 8 character is length, Table2.InvPack starts at 14th position (prior position of 6 prior length of 8), and so on....
The output should replace the 'NULLS' in Table2 with space character equal to the length of the MaxLengthOfString, so that all rows in Table2 contain a string equal in length to the MaxLengthOfString no matter the actual number of characters. For example, if the MaxLengthOfString is 8 and the are 4 characters, the string will be padded with 4 'space' characters to the end so that the length of the string totals 8.
I hope this makes sense. My project involves converting a string of characters generated by a COBOL program into a readable table.
CodePudding user response:
Below is a SQL script that will construct and execute dynamic SQL to extract the fields. It defines several SQL template snippets up from and then progressively builds up the final SQL to be executed using a combination of REPLACE() and STRING_AGG() functions
-- Define SQL templates to be used later
DECLARE @SelectItemTemplate VARCHAR(MAX) =
' SUBSTRING(PaddedTextString, <Start>, <Length>) AS <ColumnName>'
DECLARE @SelectItemSeparator VARCHAR(20) = ',
'
DECLARE @SqlTemplate VARCHAR(MAX) = '
SELECT T.CardType, T.CardCode,
<Selectlist>
FROM Table1 T
CROSS APPLY (
SELECT ISNULL(TextString, '''') SPACE(<PadLength>) AS PaddedTextString
) PTS
'
-- Construct select list to map each substring to a column.
-- Exckude "fill" columns.
DECLARE @SelectList VARCHAR(MAX) = (
SELECT STRING_AGG(SI.SelectItem, @SelectItemSeparator)
WITHIN GROUP(ORDER BY C.Seq)
FROM (
SELECT *,
SUM(C.MaxLengthOfString) OVER(ORDER BY C.Seq) - MaxLengthOfString 1 AS StartPos
FROM ColumnConfig C
) C
CROSS APPLY (
SELECT REPLACE(REPLACE(REPLACE(
@SelectItemTemplate
, '<Start>', CAST(C.StartPos AS VARCHAR))
, '<Length>', CAST(C.MaxLengthOfString AS VARCHAR))
, '<ColumnName>', QUOTENAME(C.ColumnName))
AS SelectItem
) SI
WHERE C.ColumnName NOT LIKE 'Fill%'
)
-- Calculate length needed for padding to ensure all SUBSTRINGs are covered.
DECLARE @PadLength INT = (SELECT SUM(MaxLengthOfString) FROM ColumnConfig)
-- Build final SQL
DECLARE @Sql NVARCHAR(MAX) = REPLACE(REPLACE(
@SqlTemplate
, '<SelectList>', @SelectList)
, '<PadLength>', @PadLength)
-- Print and execute
PRINT @Sql
EXEC (@Sql)
Generated dynamic SQL:
SELECT T.CardType, T.CardCode,
SUBSTRING(PaddedTextString, 1, 1) AS [Type],
SUBSTRING(PaddedTextString, 2, 4) AS [Class],
SUBSTRING(PaddedTextString, 6, 8) AS [PackSize],
SUBSTRING(PaddedTextString, 14, 3) AS [InvPack],
SUBSTRING(PaddedTextString, 18, 8) AS [CalcAmount],
SUBSTRING(PaddedTextString, 26, 1) AS [CostCode],
SUBSTRING(PaddedTextString, 27, 1) AS [UomCode],
SUBSTRING(PaddedTextString, 28, 5) AS [Uom],
SUBSTRING(PaddedTextString, 34, 1) AS [FoodStamp],
SUBSTRING(PaddedTextString, 36, 1) AS [Tax],
SUBSTRING(PaddedTextString, 38, 1) AS [Disc],
SUBSTRING(PaddedTextString, 40, 2) AS [NoStock],
SUBSTRING(PaddedTextString, 43, 7) AS [CalcPack2],
SUBSTRING(PaddedTextString, 50, 1) AS [InvFlag]
FROM Table1 T
CROSS APPLY (
SELECT ISNULL(TextString, '') SPACE(150) AS PaddedTextString
) PTS
Results:
CardType | CardCode | Type | Class | PackSize | InvPack | CalcAmount | CostCode | UomCode | Uom | FoodStamp | Tax | Disc | NoStock | CalcPack2 | InvFlag |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
A | 2 | I | 101G | 2.2 OZ | 001 | 0002200 | L | Y | 0010000 |
For the purposes of this exercise, I split your original posted table into two tables - one for the column configuration and one for the data. I also added a sequence number to the column config to ensure a well-defined order.
A running SUM() of the lengths is used to calculate the start of each field. A total length is calculated and used to pad the input TextString to ensure that all generated SUBSTRING() have sufficient data.
Note that the results will have trailing spaces. typically you would want to trim these spaces before storing them in the database. If you chose to do so, you can add an RTRIM() to the select item template. If you need to reconstruct padded values to transfer back to your COBOL application, you can use something like SELECT LEFT(TrimmedValue SPACE(length), length) as PaddedValue
.
I expect that you will still need to modify the final SQL to perform an INSERT into your target table.
See this db<>fiddle for a working select.