We have a list with a sequence number. The sequence will break, then begin again.
As you can see below, the SalesOrderLine
is missing the number 4.
SalesOrder SalesOrderLine MStockCode MPrice MBomFlag
000000000182181 1 901337 0.00000 P
000000000182181 2 901335 2476.90000 C
000000000182181 3 340151 0.00000 C
000000000182181 5 900894 0.00000 P
000000000182181 6 400379 0.00000 C
000000000182181 7 900570 600.90000 C
What I'm looking to do is summarize the MPrice
field by a consecutive number sequence, then use the MBomFlag
field to pick the "parent".
This would be the expected result of the above data. Any grouping will be done on the MBomFlag
field if the value = P
SalesOrder SalesOrderLine MStockCode MPrice MBomFlag
000000000182181 1 901337 2476.90000 P
000000000182181 5 900894 600.90000 P
What would be the best way to attack this? I'm trying to figure out something using RANK()
, ROW_NUMBER()
, LEAD
, and LAG
but not having much success
Here is the source data:
CREATE TABLE #SalesOrder (
SalesOrder NVARCHAR(20),
SalesOrderLine INT,
MStockCode INT,
MPrice DECIMAL(18,2),
MBomFlag VARCHAR(1))
INSERT INTO #SalesOrder (SalesOrder, SalesOrderLine, MStockCode, MPrice, MBomFlag)
SELECT '000000000182181', 1, '901337', 0.00000, 'P'
UNION
SELECT '000000000182181', 2, '901335', 2476.90000, 'C'
UNION
SELECT '000000000182181', 3, '340151', 0.00000, 'C'
UNION
SELECT '000000000182181', 5, '900894', 0.00000, 'P'
UNION
SELECT '000000000182181', 6, '400379', 0.00000, 'C'
UNION
SELECT '000000000182181', 7, '900570', 2600.90000, 'C'
SELECT *
FROM #SalesOrder
DROP TABLE #SalesOrder
CodePudding user response:
You can use lag()
to detect when there is a jump in the sequence of SalesOrderLine, and keep a running total of the number of times there is a jump. Rows with the same number of running total jumps belong to the same group.
with u as
(select *, lag(SalesOrderLine) over (order by SalesOrderLine) as
previousSOL
from #SalesOrder),
v as
(select *, sum(case when SalesOrderLine = PreviousSOL 1 then 0 else 1
end)
over (order by SalesOrderLine rows unbounded preceding) as jumps
from u
)
select min(case when MBomFlag = 'P' then SalesOrder end) as SalesOrder,
min(case when MBomFlag = 'P' then SalesOrderLine end) as SalesOrderLine,
min(case when MBomFlag = 'P' then MStockCode end) as MStockCode,
sum(MPrice) as Mprice,
'P' as MBomFlag
from v
group by jumps;
CodePudding user response:
Please try the following solution.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY(1,1) PRIMARY KEY,
SalesOrder NVARCHAR(20),
SalesOrderLine INT,
MStockCode INT,
MPrice DECIMAL(18,2),
MBomFlag VARCHAR(1))
INSERT INTO @tbl (SalesOrder, SalesOrderLine, MStockCode, MPrice, MBomFlag) VALUES
('000000000182181', 1, '901337', 0.00000, 'P'),
('000000000182181', 2, '901335', 2476.90000, 'C'),
('000000000182181', 3, '340151', 0.00000, 'C'),
('000000000182181', 5, '900894', 0.00000, 'P'),
('000000000182181', 6, '400379', 0.00000, 'C'),
('000000000182181', 7, '900570', 600.90000, 'C');
-- DDL and sample data population, end
WITH rs AS
(
SELECT series.*,
ns = SalesOrderLine - id
FROM @tbl AS series
), cte AS
(
SELECT *
, FIRST_VALUE(MStockCode) OVER(PARTITION BY ns ORDER BY id) AS _MStockCode
, FIRST_VALUE(MBomFlag) OVER(PARTITION BY ns ORDER BY id) AS _MBomFlag
FROM rs
)
SELECT MIN(cte.SalesOrder) AS SalesOrder
, MIN(cte.SalesOrderLine) AS SalesOrderLine
, MIN(cte._MStockCode) AS MStockCode
, SUM(cte.MPrice) AS MPrice
, MIN(cte._MBomFlag) AS MBomFlag
FROM cte
GROUP BY ns;
Output
----------------- ---------------- ------------ --------- ----------
| SalesOrder | SalesOrderLine | MStockCode | MPrice | MBomFlag |
----------------- ---------------- ------------ --------- ----------
| 000000000182181 | 1 | 901337 | 2476.90 | P |
| 000000000182181 | 5 | 900894 | 600.90 | P |
----------------- ---------------- ------------ --------- ----------
CodePudding user response:
I assume this happens on a per SalesOrder
basis. This is standard gaps and islands. There's no need for recursive queries or temp tables:
with data as (
select *, SalesOrderLine - row_number()
over (partition by SalesOrder order by SalesOrderLine) as grp
from #SalesOrder
)
select
SalesOrderLine,
min(SalesOrderLine) as SalesOrderLine,
min(case when MBomFlag = 'P' then MStockCode end) as MStockCode,
min(case when MBomFlag = 'P' then MPrice end) as MPrice,
/* doesn't assume there will be a parent -- also retains column datatype */
min(case when MBomFlag = 'P' then MBomFlag end) as MBomFlag
from data
group by SalesOrder, grp
CodePudding user response:
This is a classic gaps-and-islands problem.
However, in this case the start of each island is clearly delineated by a P
(or a row that is not C
). So we don't need LAG
for that.
We just need to assign a grouping ID for each island, which we can do using a windowed conditional COUNT
. Then we simply group by that ID.
SELECT
pv.SalesOrder,
SalesOrderLine = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.SalesOrderLine END),
MStockCode = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.MStockCode END),
MPrice = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.MPrice END),
MBomFlag = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.MBomFlag END)
FROM (
SELECT *,
GroupingId = COUNT(NULLIF(t.MBomFlag, 'C')) OVER (PARTITION BY t.SalesOrder ORDER BY t.SalesOrderLine ROWS UNBOUNDED PRECEDING)
FROM @tbl t
) pv
GROUP BY
pv.SalesOrder,
pv.GroupingId;
Note that NULLIF(t.MBomFlag, 'C')
returns null if the flag is C
, so COUNT
will only count the other rows. You could also write that explicitly using COUNT(CASE WHEN t.MBomFlag = 'P' THEN 1 END)