I got a table myTable
.
select ID, Timestamp, Value from myTable order by Timestamp
returns
ID | Timestamp | Value |
---|---|---|
42 | 2002-03-03T00:34:42 | A |
53 | 2002-03-03T00:35:46 | B |
21 | 2002-03-03T00:37:02 | A |
33 | 2002-03-03T03:12:56 | C |
71 | 2002-03-03T10:31:14 | B |
16 | 2002-03-03T17:19:21 | A |
I need to divide the rows into series starting with value A. Can anyone write an SQL for a column Series_ID
returning the ID of the first rows in each series.
ID | Series_ID |
---|---|
42 | 42 |
53 | 42 |
21 | 21 |
33 | 21 |
71 | 21 |
16 | 16 |
CodePudding user response:
You can identify the rows that mark the beginning of each group and then just join the table with itself.
For example:
with
g as (
select
t.*,
case when val = 'A' then 1 else 0 end as mk,
sum(case when val = 'A' then 1 else 0 end) over(order by ts) as grp
from t
)
select a.*, b.id as series_id
from g a
join g b on b.grp = a.grp and b.mk = 1;
Result:
ID TS VAL MK GRP SERIES_ID
--- ---------- ---- --- ---- ---------
42 03-MAR-02 A 1 1 42
53 03-MAR-02 B 0 1 42
21 03-MAR-02 A 1 2 21
33 03-MAR-02 C 0 2 21
71 03-MAR-02 B 0 2 21
16 03-MAR-02 A 1 3 16
See running example at db<>fiddle.
CodePudding user response:
You may use ignore nulls
feature of lag
function to retrieve ID of previous 'A'
only:
with a(ID, ts, value) as ( select 42, timestamp '2002-03-03 00:34:42', 'A' from dual union all select 53, timestamp '2002-03-03 00:35:46', 'B' from dual union all select 21, timestamp '2002-03-03 00:37:02', 'A' from dual union all select 33, timestamp '2002-03-03 03:12:56', 'C' from dual union all select 71, timestamp '2002-03-03 10:31:14', 'B' from dual union all select 16, timestamp '2002-03-03 17:19:21', 'A' from dual ) select a.* , decode( /*Get ID of A*/ value, 'A', id, /*else ID of previous A*/ lag(decode(value, 'A', id, null)) ignore nulls over(order by ts asc) ) as series from a
ID | TS | VALUE | SERIES -: | :--------------------------- | :---- | -----: 42 | 03-MAR-02 00.34.42.000000000 | A | 42 53 | 03-MAR-02 00.35.46.000000000 | B | 42 21 | 03-MAR-02 00.37.02.000000000 | A | 21 33 | 03-MAR-02 03.12.56.000000000 | C | 21 71 | 03-MAR-02 10.31.14.000000000 | B | 21 16 | 03-MAR-02 17.19.21.000000000 | A | 16
db<>fiddle here
Or similarly, but a bit more readable with match_recognize
:
select * from a match_recognize ( order by ts measures /*ID of the row with value = A for all the rows within a match group */ max(a.id) as series all rows per match pattern (a b*) define a as value = 'A', b as lnnvl(value = 'A') )
TS | SERIES | ID | VALUE :--------------------------- | -----: | -: | :---- 03-MAR-02 00.34.42.000000000 | 42 | 42 | A 03-MAR-02 00.35.46.000000000 | 42 | 53 | B 03-MAR-02 00.37.02.000000000 | 21 | 21 | A 03-MAR-02 03.12.56.000000000 | 21 | 33 | C 03-MAR-02 10.31.14.000000000 | 21 | 71 | B 03-MAR-02 17.19.21.000000000 | 16 | 16 | A
db<>fiddle here