Let's say we have data as below
EmpId Stages Amount
101 Stg1 10.2
101 Stg2 10.22
101 Stg1 10.11
101 Stg3 6.21
101 Stg2 3.22
102 Stg1 3.23
102 Stg2 2.22
102 Stg3 1.22
102 Stg3 3.22
And the result required: Each employee wise stages count and stage wise sum amount as below
EmpId Stg1-Count Amount-stg1 Stg2-Count Amount-stg2 Stg3-Count Amount-stg3
101 2 20.31 2 13.44 1 6.21
102 1 3.23 1 2.22 2 4.44
CodePudding user response:
Here is an approach that uses the straightforward syntax of conditional aggregations.
/* Set up the sample data */
DROP TABLE IF EXISTS #RawData;
CREATE TABLE #RawData
(
EmpId INT,
Stages NVARCHAR(5),
Amount DECIMAL(9, 4)
);
INSERT INTO #RawData
VALUES (101, N'Stg1', 10.2),
(101, N'Stg2', 10.22),
(101, N'Stg1', 10.11),
(101, N'Stg3', 6.21),
(101, N'Stg2', 3.22),
(102, N'Stg1', 3.23),
(102, N'Stg2', 2.22),
(102, N'Stg3', 1.22),
(102, N'Stg3', 3.22),
(103, N'Stg1', 1.55);
I added a row to ensure we're handling incomplete and in-process stage progressions.
This assumes that there are a fixed number of "stages" or categories that will be counted and summed. If there are a large or changing number of values, this can become tedious to set up, but the code is pretty straightforward:
SELECT d.EmpId,
SUM(CASE WHEN d.Stages = 'Stg1' THEN 1 ELSE 0 END) AS [Stg1-Count],
SUM(CASE WHEN d.Stages = 'Stg1' THEN ISNULL(d.Amount, 0) ELSE 0 END) AS [Amount-Stg1],
SUM(CASE WHEN d.Stages = 'Stg2' THEN 1 ELSE 0 END) AS [Stg2-Count],
SUM(CASE WHEN d.Stages = 'Stg2' THEN ISNULL(d.Amount, 0) ELSE 0 END) AS [Amount-Stg2],
SUM(CASE WHEN d.Stages = 'Stg3' THEN 1 ELSE 0 END) AS [Stg3-Count],
SUM(CASE WHEN d.Stages = 'Stg3' THEN ISNULL(d.Amount, 0) ELSE 0 END) AS [Amount-Stg3]
FROM #RawData AS d
GROUP BY d.EmpId;
EmpId | Stg1-Count | Amount-Stg1 | Stg2-Count | Amount-Stg2 | Stg3-Count | Amount-Stg3 |
---|---|---|---|---|---|---|
101 | 2 | 20.3100 | 2 | 13.4400 | 1 | 6.2100 |
102 | 1 | 3.2300 | 1 | 2.2200 | 2 | 4.4400 |
103 | 1 | 1.5500 | 0 | 0.0000 | 0 | 0.0000 |
It uses a GROUP BY
on the EmpId to aggregate everything per employee, and CASE
s are evaluated to construct the additional columns. Using a SUM
rather than a COUNT
provides for a conditional counting, otherwise we'd just get the count of all rows for that employee regardless of the 'Stage' value.
CodePudding user response:
One option for this is to use dynamic SQL and PIVOT
to get a more elastic result set.
Using an adapted version of your result set that includes some new 'Stages' values
/* Set up the sample data */
DROP TABLE IF EXISTS #RawData;
CREATE TABLE #RawData
(
EmpId INT,
Stages NVARCHAR(5),
Amount DECIMAL(9, 4)
);
DECLARE @SqlStatement NVARCHAR(MAX);
INSERT INTO #RawData
VALUES (101, N'Stg1', 10.2),
(101, N'Stg2', 10.22),
(101, N'Stg1', 10.11),
(101, N'Stg3', 6.21),
(101, N'Stg2', 3.22),
(102, N'Stg1', 3.23),
(102, N'Stg2', 2.22),
(102, N'Stg3', 1.22),
(102, N'Stg3', 3.22),
(103, N'Stg0', 4.00),
(103, N'Stg1', 1.55),
(103, N'Stg4', 14.00);
Next, we'll need some constructs for working with those data in dynamic SQL:
/* Set up some variables to store comma-delimited values and column names */
DECLARE @stages NVARCHAR(255),
@amountAlias NVARCHAR(255),
@countAlias NVARCHAR(255),
@resultCols NVARCHAR(255);
/* Store comma-delimited column names and selection aliases in variables for use in the dynamic blocks */
SELECT @stages = STRING_AGG(d.Stages, ','),
@amountAlias = STRING_AGG(CONCAT(N'ISNULL(a.', d.Stages, N', 0) AS [Amount-', d.Stages, N']'), ', '),
@countAlias = STRING_AGG(CONCAT(N'ISNULL(c.', d.Stages, N', 0) AS [', d.Stages, N'-Count]'), ', '),
@resultCols = STRING_AGG(CONCAT(N'c.[', d.Stages, N'-Count], a.[Amount-', d.Stages, N']'), ', ')
FROM (SELECT DISTINCT Stages FROM #RawData) AS d;
Then, it's a matter of constructing the dynamic code using those variables:
SELECT @SqlStatement = N';WITH Amounts AS (SELECT a.EmpId,' @amountAlias
N'
FROM (SELECT EmpId, Stages, ISNULL(Amount, 0) AS Amount FROM #RawData) AS src
PIVOT (SUM(src.Amount)
FOR src.Stages IN (' @stages N') ) AS a),
Counts AS (SELECT c.EmpId,' @countAlias
N'
FROM (SELECT EmpId, Stages, ISNULL(Amount, 0) AS Amount FROM #RawData) AS src
PIVOT (COUNT(src.Amount)
FOR src.Stages IN (' @stages N') ) AS c)
SELECT a.EmpId, ' @resultCols N'
FROM Amounts AS a
JOIN Counts AS c
ON a.EmpId = c.EmpId;';
Finally, you can execute the dynamic SQL using EXEC (@SqlStatement);
In this example, there are two CTEs defined using PIVOT
operations: one for the amount aggregate, and the other for the counts. The column-name strings are constructed using the distinct 'Stages' values, and are aliased with the associated aggregation-type names.
This solution has the advantage of adjusting to and returning new 'Stages' values, but is more complicated and generates a more complex execution plan, more table scans, more logical reads, and at a larger scale will run into cardinality estimation limitations.
| EmpId | Stg0-Count | Amount-Stg0 | Stg1-Count | Amount-Stg1 | Stg2-Count | Amount-Stg2 | Stg3-Count | Amount-Stg3 | Stg4-Count | Amount-Stg4 |
|-------|------------|-------------|------------|-------------|------------|-------------|------------|-------------|------------|-------------|
| 101 | 0 | 0.0000 | 2 | 20.3100 | 2 | 13.4400 | 1 | 6.2100 | 0 | 0.0000 |
| 102 | 0 | 0.0000 | 1 | 3.2300 | 1 | 2.2200 | 2 | 4.4400 | 0 | 0.0000 |
| 103 | 1 | 4.0000 | 1 | 1.5500 | 0 | 0.0000 | 0 | 0.0000 | 1 | 14.0000 |