Home > front end >  How to generate a PIVOT output based on the position of a substring in a string as well as its lengt
How to generate a PIVOT output based on the position of a substring in a string as well as its lengt

Time:01-06

I need to be able to PIVOT / map a text string (Table1.TextString column) in the following table:

CardType CardCode TextString ColumnName MaxLengthOfString
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Type 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Class 4
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 PackSize 8
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 InvPack 3
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Fill 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 CalcAmount 8
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 CostCode 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 UomCode 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Uom 5
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Fill1 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 FoodStamp 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Fill2 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Tax 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Fill3 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Disc 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Fill4 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 NoStock 2
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Fill5 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 CalcPack2 7
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 InvFlag 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Fill6 1
A 2 I101G2.2 OZ 001 0002200 L Y 0010000 Fill7 99

Editor note: The actual TextString value above is as follows with length = 70 (with reference rule):

I101G2.2 OZ  001 0002200 L       Y        0010000                     
---- ----1---- ----2---- ----3---- ----4---- ----5---- ----6---- ----7

to create an output like this using Table1.ColumnName and

Table1.MaxLengthOfString columns:

Type Class PackSize InvPack CalcAmount CostCode UomCode Uom FoodStamp Tax Disc NoStock CalcPack2 InvFlag
I 101G 2.2 OZ 001 0002200 L Y 0010000

So, the 'Type' column in Table2 is extracted from Table1.TextString column starting from the 1st character and is 1 character in length (Table1.MaxLengthOfString). Table2.Class column starts at the 2nd position (1st position 1 character in length) and is 4 characters in length. The Table2.PackSize starts at 6th position (prior position of 2 prior length of 4) and is 8 character is length, Table2.InvPack starts at 14th position (prior position of 6 prior length of 8), and so on....

The output should replace the 'NULLS' in Table2 with space character equal to the length of the MaxLengthOfString, so that all rows in Table2 contain a string equal in length to the MaxLengthOfString no matter the actual number of characters. For example, if the MaxLengthOfString is 8 and the are 4 characters, the string will be padded with 4 'space' characters to the end so that the length of the string totals 8.

I hope this makes sense. My project involves converting a string of characters generated by a COBOL program into a readable table.

CodePudding user response:

Below is a SQL script that will construct and execute dynamic SQL to extract the fields. It defines several SQL template snippets up from and then progressively builds up the final SQL to be executed using a combination of REPLACE() and STRING_AGG() functions

-- Define SQL templates to be used later
DECLARE @SelectItemTemplate VARCHAR(MAX) =
    '    SUBSTRING(PaddedTextString, <Start>, <Length>) AS <ColumnName>'
DECLARE @SelectItemSeparator VARCHAR(20) = ',
'
DECLARE @SqlTemplate VARCHAR(MAX) = '
SELECT T.CardType, T.CardCode,
<Selectlist>
FROM Table1 T
CROSS APPLY (
    SELECT ISNULL(TextString, '''')   SPACE(<PadLength>) AS PaddedTextString
) PTS
'

-- Construct select list to map each substring to a column.
-- Exckude "fill" columns.
DECLARE @SelectList VARCHAR(MAX) = (
    SELECT STRING_AGG(SI.SelectItem, @SelectItemSeparator)
               WITHIN GROUP(ORDER BY C.Seq)
    FROM (
        SELECT *,
            SUM(C.MaxLengthOfString) OVER(ORDER BY C.Seq) - MaxLengthOfString   1 AS StartPos
        FROM ColumnConfig C
    ) C
    CROSS APPLY (
        SELECT REPLACE(REPLACE(REPLACE(
            @SelectItemTemplate
            , '<Start>', CAST(C.StartPos AS VARCHAR))
            , '<Length>', CAST(C.MaxLengthOfString AS VARCHAR))
            , '<ColumnName>', QUOTENAME(C.ColumnName))
          AS SelectItem
    ) SI
    WHERE C.ColumnName NOT LIKE 'Fill%'
)

-- Calculate length needed for padding to ensure all SUBSTRINGs are covered.
DECLARE @PadLength INT = (SELECT SUM(MaxLengthOfString) FROM ColumnConfig)

-- Build final SQL
DECLARE @Sql NVARCHAR(MAX) = REPLACE(REPLACE(
    @SqlTemplate
    , '<SelectList>', @SelectList)
    , '<PadLength>', @PadLength)


-- Print and execute
PRINT @Sql
EXEC (@Sql)

Generated dynamic SQL:

SELECT T.CardType, T.CardCode,
    SUBSTRING(PaddedTextString, 1, 1) AS [Type],
    SUBSTRING(PaddedTextString, 2, 4) AS [Class],
    SUBSTRING(PaddedTextString, 6, 8) AS [PackSize],
    SUBSTRING(PaddedTextString, 14, 3) AS [InvPack],
    SUBSTRING(PaddedTextString, 18, 8) AS [CalcAmount],
    SUBSTRING(PaddedTextString, 26, 1) AS [CostCode],
    SUBSTRING(PaddedTextString, 27, 1) AS [UomCode],
    SUBSTRING(PaddedTextString, 28, 5) AS [Uom],
    SUBSTRING(PaddedTextString, 34, 1) AS [FoodStamp],
    SUBSTRING(PaddedTextString, 36, 1) AS [Tax],
    SUBSTRING(PaddedTextString, 38, 1) AS [Disc],
    SUBSTRING(PaddedTextString, 40, 2) AS [NoStock],
    SUBSTRING(PaddedTextString, 43, 7) AS [CalcPack2],
    SUBSTRING(PaddedTextString, 50, 1) AS [InvFlag]
FROM Table1 T
CROSS APPLY (
    SELECT ISNULL(TextString, '')   SPACE(150) AS PaddedTextString
) PTS

Results:

CardType CardCode Type Class PackSize InvPack CalcAmount CostCode UomCode Uom FoodStamp Tax Disc NoStock CalcPack2 InvFlag
A 2 I 101G 2.2 OZ 001 0002200 L       Y    0010000

For the purposes of this exercise, I split your original posted table into two tables - one for the column configuration and one for the data. I also added a sequence number to the column config to ensure a well-defined order.

A running SUM() of the lengths is used to calculate the start of each field. A total length is calculated and used to pad the input TextString to ensure that all generated SUBSTRING() have sufficient data.

Note that the results will have trailing spaces. typically you would want to trim these spaces before storing them in the database. If you chose to do so, you can add an RTRIM() to the select item template. If you need to reconstruct padded values to transfer back to your COBOL application, you can use something like SELECT LEFT(TrimmedValue SPACE(length), length) as PaddedValue.

I expect that you will still need to modify the final SQL to perform an INSERT into your target table.

See this db<>fiddle for a working select.

  • Related