SUM Values by Sequence Number and Group By Flag-CodePudding

We have a list with a sequence number. The sequence will break, then begin again.

As you can see below, the SalesOrderLine is missing the number 4.

SalesOrder      SalesOrderLine  MStockCode  MPrice      MBomFlag
000000000182181 1               901337      0.00000     P
000000000182181 2               901335      2476.90000  C
000000000182181 3               340151      0.00000     C
000000000182181 5               900894      0.00000     P
000000000182181 6               400379      0.00000     C
000000000182181 7               900570      600.90000   C

What I'm looking to do is summarize the MPrice field by a consecutive number sequence, then use the MBomFlag field to pick the "parent".

This would be the expected result of the above data. Any grouping will be done on the MBomFlag field if the value = P

SalesOrder      SalesOrderLine  MStockCode  MPrice      MBomFlag
000000000182181 1               901337      2476.90000  P
000000000182181 5               900894      600.90000   P

What would be the best way to attack this? I'm trying to figure out something using RANK(), ROW_NUMBER(), LEAD, and LAG but not having much success

Here is the source data:

CREATE TABLE #SalesOrder (
SalesOrder NVARCHAR(20),
SalesOrderLine INT,
MStockCode INT,
MPrice DECIMAL(18,2),
MBomFlag VARCHAR(1))

INSERT INTO #SalesOrder (SalesOrder, SalesOrderLine, MStockCode, MPrice, MBomFlag)
SELECT '000000000182181',   1,  '901337',   0.00000, 'P'
UNION
SELECT '000000000182181',   2,  '901335', 2476.90000, 'C'
UNION
SELECT '000000000182181',   3,  '340151',   0.00000,  'C'
UNION
SELECT '000000000182181',   5,  '900894',   0.00000,  'P'
UNION
SELECT '000000000182181',   6,  '400379',   0.00000,  'C'
UNION
SELECT '000000000182181',   7,  '900570',   2600.90000, 'C'

SELECT *
  FROM #SalesOrder

DROP TABLE #SalesOrder

CodePudding user response：

You can use lag() to detect when there is a jump in the sequence of SalesOrderLine, and keep a running total of the number of times there is a jump. Rows with the same number of running total jumps belong to the same group.

with u as
(select *, lag(SalesOrderLine) over (order by SalesOrderLine) as 
previousSOL
from #SalesOrder),
v as 
(select *, sum(case when SalesOrderLine = PreviousSOL  1 then 0 else 1 
end)
over (order by SalesOrderLine rows unbounded preceding) as jumps
from u
)
select min(case when MBomFlag = 'P' then SalesOrder end) as SalesOrder,
min(case when MBomFlag = 'P' then SalesOrderLine end) as SalesOrderLine,
min(case when MBomFlag = 'P' then MStockCode end) as MStockCode,
sum(MPrice) as Mprice,
'P' as MBomFlag
from v
group by jumps;

Fiddle

CodePudding user response：

Please try the following solution.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE  (ID INT IDENTITY(1,1) PRIMARY KEY,
SalesOrder NVARCHAR(20),
SalesOrderLine INT,
MStockCode INT,
MPrice DECIMAL(18,2),
MBomFlag VARCHAR(1))

INSERT INTO @tbl (SalesOrder, SalesOrderLine, MStockCode, MPrice, MBomFlag) VALUES
('000000000182181', 1, '901337',   0.00000, 'P'),
('000000000182181', 2, '901335', 2476.90000, 'C'),
('000000000182181', 3, '340151',   0.00000,  'C'),
('000000000182181', 5, '900894',   0.00000,  'P'),
('000000000182181', 6, '400379',   0.00000,  'C'),
('000000000182181', 7, '900570',   600.90000, 'C');
-- DDL and sample data population, end

WITH rs AS
(
    SELECT series.*,
           ns = SalesOrderLine - id
    FROM @tbl AS series
), cte AS
(
    SELECT * 
        , FIRST_VALUE(MStockCode) OVER(PARTITION BY ns ORDER BY id) AS _MStockCode
        , FIRST_VALUE(MBomFlag) OVER(PARTITION BY ns ORDER BY id) AS _MBomFlag
    FROM rs
)
SELECT MIN(cte.SalesOrder) AS SalesOrder
    , MIN(cte.SalesOrderLine) AS SalesOrderLine
    , MIN(cte._MStockCode) AS MStockCode
    , SUM(cte.MPrice) AS MPrice
    , MIN(cte._MBomFlag) AS MBomFlag
FROM cte
GROUP BY ns;

Output

 ----------------- ---------------- ------------ --------- ---------- 
|   SalesOrder    | SalesOrderLine | MStockCode | MPrice  | MBomFlag |
 ----------------- ---------------- ------------ --------- ---------- 
| 000000000182181 |              1 |     901337 | 2476.90 | P        |
| 000000000182181 |              5 |     900894 |  600.90 | P        |
 ----------------- ---------------- ------------ --------- ----------

CodePudding user response：

I assume this happens on a per SalesOrder basis. This is standard gaps and islands. There's no need for recursive queries or temp tables:

with data as (
    select *, SalesOrderLine - row_number()
        over (partition by SalesOrder order by SalesOrderLine) as grp
    from #SalesOrder
)
select
    SalesOrderLine,
    min(SalesOrderLine) as SalesOrderLine,
    min(case when MBomFlag = 'P' then MStockCode end) as MStockCode,
    min(case when MBomFlag = 'P' then MPrice end) as MPrice,
    /* doesn't assume there will be a parent -- also retains column datatype */
    min(case when MBomFlag = 'P' then MBomFlag end) as MBomFlag
from data
group by SalesOrder, grp

CodePudding user response：

This is a classic gaps-and-islands problem.

However, in this case the start of each island is clearly delineated by a P (or a row that is not C). So we don't need LAG for that.

We just need to assign a grouping ID for each island, which we can do using a windowed conditional COUNT. Then we simply group by that ID.

SELECT
  pv.SalesOrder,
  SalesOrderLine = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.SalesOrderLine END),
  MStockCode = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.MStockCode END),
  MPrice = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.MPrice END),
  MBomFlag = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.MBomFlag END)
FROM (
    SELECT *,
      GroupingId = COUNT(NULLIF(t.MBomFlag, 'C')) OVER (PARTITION BY t.SalesOrder ORDER BY t.SalesOrderLine ROWS UNBOUNDED PRECEDING)
    FROM @tbl t
) pv
GROUP BY
  pv.SalesOrder,
  pv.GroupingId;

Note that NULLIF(t.MBomFlag, 'C') returns null if the flag is C, so COUNT will only count the other rows. You could also write that explicitly using COUNT(CASE WHEN t.MBomFlag = 'P' THEN 1 END)

db<>fiddle