Home > other >  SQL - applying multiple pivots in one query, and finding the sum of resulting rows
SQL - applying multiple pivots in one query, and finding the sum of resulting rows

Time:12-03

I am attempting to create an SQL query which will compress multidimentional data using PIVOT, but need to do some operations on the pivoted data.

I have coding experience but very little experience with SQL and am using SQLPathfinder3 to create and execute queries, so what I am looking for is top level approach - what order of operations should I be going for here to achieve the desired result?

An example of my initial result table, gotten with SELECT, is as follows:

Item Property Subproperty Thing
item1 propertyA subproperty1 value1_A_1
item1 propertyA subproperty2 value1_A_2
item1 propertyB subproperty1 value1_B_1
item1 propertyB subproperty2 value1_B_2
item2 propertyA subproperty1 value2_A_1
item2 propertyA subproperty2 value2_A_2
item2 propertyB subproperty1 value2_B_1
item2 propertyB subproperty2 value2_B_2

By executing one pivot, assigning Subproperty as the CTHEADER, and Thing as the CTVALUE, I can create this:

Item Property Subproperty1 Subproperty2
item1 propertyA value1_A_1 value1_A_2
item1 propertyB value1_B_1 value1_B_2
item2 propertyA value2_A_1 value2_A_2
item2 propertyB value2_B_1 value2_B_2

Similarly, I can pivot around "Subproperty" to get something analogous:

Item Subproperty Property A Property B
item1 subproperty1 value1_A_1 value1_A_2
item1 subproperty2 value1_B_1 value1_B_2
item2 subproperty1 value2_A_1 value2_A_2
item2 subproperty2 value2_B_1 value2_B_2

The final result that I need is this - a combination of the above values:

Item Property A Property B
item1 value1_A_1 value1_A_2 value1_B_1 value1_B_2
item2 value2_A_1 value2_A_2 value2_B_1 value2_B_2

I don't know how to go about doing this in a single query, but I would very much like to, because this is something that needs to be automated.

I can't simply SELECT the sums that I want, because those values are on four separate lines. I could workaround in excel or python, especially if I can effect the result below, but I'm not sure how to do this.

Item PropA Sub1 PropA Sub2 PropB Sub1 PropB Sub2
item1 value1_A_1 value1_A_2 value1_B_1 value1_B_2
item2 value2_A_1 value2_A_2 value2_B_1 value2_B_2

Any help, or pointers towards educational material would be appreciated, thanks.

CodePudding user response:

Try:

SELECT
    P.Item,
    STRING_AGG(P.propertyA, '   ') WITHIN GROUP (ORDER BY Subproperty) AS propertyA,
    STRING_AGG(P.propertyB, '   ') WITHIN GROUP (ORDER BY Subproperty) AS propertyB
FROM @Data
PIVOT (
    MAX(Thing)
    FOR Property IN (propertyA, propertyB)
) P
GROUP BY P.Item
ORDER BY P.Item

See this db<>fiddle.

The initial PIVOT will yield:

Item Subproperty propertyA propertyB
item1 subproperty1 value1_A_1 value1_B_1
item2 subproperty1 value2_A_1 value2_B_1
item1 subproperty2 value1_A_2 value1_B_2
item2 subproperty2 value2_A_2 value2_B_2

and the GROUP BY and STRING_AGG() will yield the desired result:

Item propertyA propertyB
item1 value1_A_1 value1_A_2 value1_B_1 value1_B_2
item2 value2_A_1 value2_A_2 value2_B_1 value2_B_2

The WITHIN GROUP(...) clause is not actually required, but guarantees consistent results.

ADDENDUM: For the case where a sum of numbers is need, the STRING_AGG() WITHIN GROUP() can be replaced with a simple SUM(). Using SUM() instead of MAX() in the pivot is also recommended.

-- Pivot plus group by aggregation (Sum of numeric values)
SELECT
    P.Item,
    SUM(P.propertyA) AS propertyA,
    SUM(P.propertyB) AS propertyB
FROM @Data
PIVOT (
    SUM(Thing)
    FOR Property IN (propertyA, propertyB)
) P
GROUP BY P.Item
ORDER BY P.Item

The Query can be further simplified to a plain old PIVOT by eliminating unwanted columns from the initial select that feeds the PIVOT.

-- Using just a regular PIVOT after excluding unwanted columns
SELECT P.Item, P.propertyA, P.propertyB
FROM (SELECT Item, Property, Thing FROM @Data) D
PIVOT (
    SUM(Thing)
    FOR Property IN (propertyA, propertyB)
) P

Lastly (I hope), a technique called "Conditional Aggregation" can be used in place of the PIVOT. Sometimes this is actually easier to read, understand, and maintain.

-- Using conditional aggregation instead of PIVOT
SELECT
    D.Item,
    SUM(CASE WHEN D.Property = 'propertyA' THEN D.Thing END) AS propertyA,
    SUM(CASE WHEN D.Property = 'propertyB' THEN D.Thing END) AS propertyB
FROM @Data D
GROUP BY D.Item
ORDER BY D.Item

See this db<>fiddle for the above techniques.

CodePudding user response:

I would choose a dynamic approach, as property will have morwe than 2 item i guess.

I choose to use integers for thig as you wante4d sums

CREATE TABLE tab1
    ([Item] varchar(5), [Property] varchar(9), [Subproperty] varchar(12), [Thing] int)
;
    
INSERT INTO tab1
    ([Item], [Property], [Subproperty], [Thing])
VALUES
    ('item1', 'propertyA', 'subproperty1', 1),
    ('item1', 'propertyA', 'subproperty2', 2),
    ('item1', 'propertyB', 'subproperty1', 3),
    ('item1', 'propertyB', 'subproperty2', 4),
    ('item2', 'propertyA', 'subproperty1', 5),
    ('item2', 'propertyA', 'subproperty2', 6),
    ('item2', 'propertyB', 'subproperty1', 7),
    ('item2', 'propertyB', 'subproperty2', 8)
;

8 rows affected
DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX);

SET @cols = (SELECT STRING_AGG([Property],',') FROM (SELECT DISTINCT [Property] FROM tab1 WHERE [Property] IS NOT NULL)t);

set @query = '
  
  SELECT [Item], '   @cols   ' from 
            (
                SELECT
                   [Item], [Property], SUM([Thing]) as SumThing
                 FROM tab1
                 GROUP BY [Item], [Property]
           ) x
            pivot 
            (
                 SUM(SumThing)
                for [Property] in ('   @cols   ')
            ) p ';

execute(@query);


Item propertyA propertyB
item1 3 7
item2 11 15

fiddle

  • Related