I have a table like this.
Date | Enddate |
---|---|
20012022 | 21012022 |
21012022 | 23012022 |
23012022 | 24012022 |
20012022 | 26012022 |
26012022 | 27012022 |
27012022 | 27012022 |
The next date entry is equal to the last one enddate. How do I find lines that don't follow this rule? In the example, line 4 (previus enddate 24012022 - next date 20012022).
I tried use
lag()
I can't understand how it works... Thanks for helping..
CodePudding user response:
Here's one option.
Sample data:
SQL> with test (datum, enddatum) as
2 (select date '2022-01-20', date '2022-01-21' from dual union all
3 select date '2022-01-21', date '2022-01-23' from dual union all
4 select date '2022-01-23', date '2022-01-24' from dual union all
5 select date '2022-01-20', date '2022-12-26' from dual union all
6 select date '2022-12-26', date '2022-12-27' from dual union all
7 select date '2022-12-27', date '2022-12-27' from dual
8 ),
Query begins here: find previous enddatum
so that you could compare it to datum
(line #17):
9 temp as
10 (select datum,
11 enddatum,
12 lag(enddatum) over (order by enddatum) previous_enddatum
13 from test
14 )
15 select datum, enddatum
16 from temp
17 where datum <> previous_enddatum;
DATUM ENDDATUM
---------- ----------
20.01.2022 26.12.2022
SQL>
CodePudding user response:
The LAG() function's result depends on query partition clause and order by clause. Here are two codes giving different results if ordered by Start or End date:
Your sample data:
WITH
tbl (START_DATE, END_DATE) as
( Select DATE '2022-01-20', DATE '2022-01-21' From dual Union All
Select DATE '2022-01-21', DATE '2022-01-23' From dual Union All
Select DATE '2022-01-23', DATE '2022-01-24' From dual Union All
Select DATE '2022-01-20', DATE '2022-12-26' From dual Union All
Select DATE '2022-12-26', DATE '2022-12-27' From dual Union All
Select DATE '2022-12-27', DATE '2022-12-27' From dual
)
- Using Order By END_DATE:
Select START_DATE, END_DATE,
CASE
WHEN START_DATE != LAG(END_DATE) OVER(ORDER BY END_DATE)
THEN 'Should be ' || LAG(END_DATE) OVER(ORDER BY END_DATE)
END "END_DATE_CHECK"
From tbl
START_DATE END_DATE END_DATE_CHECK
---------- --------- -------------------
20-JAN-22 21-JAN-22
21-JAN-22 23-JAN-22
23-JAN-22 24-JAN-22
20-JAN-22 26-DEC-22 Should be 24-JAN-22
26-DEC-22 27-DEC-22
27-DEC-22 27-DEC-22
- Using Order By START_DATE
Select START_DATE, END_DATE,
CASE
WHEN START_DATE != LAG(END_DATE) OVER(ORDER BY START_DATE)
THEN 'Should be ' || LAG(END_DATE) OVER(ORDER BY START_DATE)
END "END_DATE_CHECK"
From tbl
START_DATE END_DATE END_DATE_CHECK
---------- --------- -------------------
20-JAN-22 21-JAN-22
20-JAN-22 26-DEC-22 Should be 21-JAN-22
21-JAN-22 23-JAN-22 Should be 26-DEC-22
23-JAN-22 24-JAN-22
26-DEC-22 27-DEC-22 Should be 24-JAN-22
27-DEC-22 27-DEC-22
It looks like there is something missing in your sample data (some ID column maybe). Let's say that there is some column the dates belong to and that we could partition the dates by that column like below. There is no checking problems at all:
3. Using Partition By
WITH
tbl (ID, START_DATE, END_DATE) as
( Select 1, DATE '2022-01-20', DATE '2022-01-21' From dual Union All
Select 1, DATE '2022-01-21', DATE '2022-01-23' From dual Union All
Select 1, DATE '2022-01-23', DATE '2022-01-24' From dual Union All
Select 2, DATE '2022-01-20', DATE '2022-12-26' From dual Union All
Select 2, DATE '2022-12-26', DATE '2022-12-27' From dual Union All
Select 2, DATE '2022-12-27', DATE '2022-12-27' From dual
)
Select ID, START_DATE, END_DATE,
CASE
WHEN START_DATE != LAG(END_DATE) OVER(Partition By ID ORDER BY START_DATE)
THEN 'Should be ' || LAG(END_DATE) OVER(Partition By ID ORDER BY START_DATE)
END "END_DATE_CHECK"
From tbl
ID START_DATE END_DATE END_DATE_CHECK
---------- ---------- --------- -------------------
1 20-JAN-22 21-JAN-22
1 21-JAN-22 23-JAN-22
1 23-JAN-22 24-JAN-22
2 20-JAN-22 26-DEC-22
2 26-DEC-22 27-DEC-22
2 27-DEC-22 27-DEC-22
In this case there is no difference using Start or End date ordering... More about LAG() OVER() here.