Home > Back-end >  Need to remove rows that exist before flag value
Need to remove rows that exist before flag value

Time:07-06

I am looking to remove the rows (highlighted in yellow) that exist prior to the FirstOrder flag (partitioned by CustomerID).

enter image description here

The business case is that I need to remove the rows for each customer that has a row before their First Order date (ASOFDATE where FirstOrder=1).

I have attempted the creation of another column that coalesces the minimum date per customer and the minimum date per FirstOrder, to create a kind of range, but where I was stymied was how to account for the NULL's in between those dates. The one idea I had was to make those in between dates the same as the first date, then I could simply remove rows where this proposed "reference date" < (asofdate where firstorder = 1), but I'm not having luck getting that to work:

enter image description here

I'm sure the solution is pretty simple and I'm just not seeing it. Here's the sample data:

CREATE TABLE #DATA (ASOFDATE DATE, CUSTOMERID INT, ACTIVE BIT, FIRSTORDER BIT)
INSERT INTO #DATA VALUES
('2018-01-31',  206424, NULL,   NULL),
('2018-02-28',  206424, NULL,   NULL),
('2018-03-31',  206424, 1,      1),
('2022-06-30',  206424, NULL,   NULL),
('2022-07-31',  206424, NULL,   NULL),
('2018-06-30',  247034, NULL,   NULL),
('2018-07-31',  247034, NULL,   NULL),
('2018-08-31',  247034, 1,  1),
('2022-05-31',  247034, NULL,   NULL)

CodePudding user response:

Using a conditional sum() over() may help

Example

;with cte as (
Select * 
      ,Flag = sum( case when FirstOrder=1 then 1 else 0 end ) over (partition by CustomerID order by asofdate)
 from #Data
)
Delete from cte where Flag=0

Updated Table

ASOFDATE    CUSTOMERID  ACTIVE  FIRSTORDER
2018-03-31  206424      1       1
2022-06-30  206424      NULL    NULL
2022-07-31  206424      NULL    NULL
2018-08-31  247034      1       1
2022-05-31  247034      NULL    NULL

CodePudding user response:

Yet another working option is matching your table with a selection of your table where "FIRSTORDER" is true. Then remove all "CUSTOMERID" matching rows which have a lower "ASOFDATE" than the corresponding "FIRSTORDER".

DELETE #DATA
FROM       #DATA 
INNER JOIN (SELECT ASOFDATE, CUSTOMERID FROM #DATA WHERE FIRSTORDER = 'True') firstOrders
        ON #DATA.CUSTOMERID = firstOrders.CUSTOMERID
       AND #DATA.ASOFDATE   < firstOrders.ASOFDATE;

Check the demo here.

CodePudding user response:

Try:

WITH StartDates AS (
    SELECT CUSTOMERID, MIN(ASOFDATE) AS StartDate
    FROM #DATA
    WHERE FIRSTORDER = 1
    GROUP BY CUSTOMERID
)
DELETE D
FROM StartDates S
JOIN #DATA D
    ON D.CUSTOMERID = S.CUSTOMERID
    AND D.ASOFDATE < S.StartDate

See This db<>fiddle

The CTE (common table expression) first picks the FIRSTORDER reference date for each customer, and the main query then uses that information to select the records to be deleted.

  • Related