Oracle SQL to get series id-CodePudding

I got a table myTable.

select ID, Timestamp, Value from myTable order by Timestamp

returns

ID	Timestamp	Value
42	2002-03-03T00:34:42	A
53	2002-03-03T00:35:46	B
21	2002-03-03T00:37:02	A
33	2002-03-03T03:12:56	C
71	2002-03-03T10:31:14	B
16	2002-03-03T17:19:21	A

I need to divide the rows into series starting with value A. Can anyone write an SQL for a column Series_ID returning the ID of the first rows in each series.

ID	Series_ID
42	42
53	42
21	21
33	21
71	21
16	16

CodePudding user response：

You can identify the rows that mark the beginning of each group and then just join the table with itself.

For example:

with
g as (
  select
    t.*, 
    case when val = 'A' then 1 else 0 end as mk,
    sum(case when val = 'A' then 1 else 0 end) over(order by ts) as grp
  from t
)
select a.*, b.id as series_id
from g a
join g b on b.grp = a.grp and b.mk = 1;

Result:

 ID  TS         VAL  MK  GRP  SERIES_ID 
 --- ---------- ---- --- ---- --------- 
 42  03-MAR-02  A    1   1    42        
 53  03-MAR-02  B    0   1    42        
 21  03-MAR-02  A    1   2    21        
 33  03-MAR-02  C    0   2    21        
 71  03-MAR-02  B    0   2    21        
 16  03-MAR-02  A    1   3    16

See running example at db<>fiddle.

CodePudding user response：

You may use ignore nulls feature of lag function to retrieve ID of previous 'A' only:

with a(ID, ts, value) as (
select 42, timestamp '2002-03-03 00:34:42', 'A' from dual union all
select 53, timestamp '2002-03-03 00:35:46', 'B' from dual union all
select 21, timestamp '2002-03-03 00:37:02', 'A' from dual union all
select 33, timestamp '2002-03-03 03:12:56', 'C' from dual union all
select 71, timestamp '2002-03-03 10:31:14', 'B' from dual union all
select 16, timestamp '2002-03-03 17:19:21', 'A' from dual
)
select
  a.* 
  , decode(
    /*Get ID of A*/
    value, 'A', id,
    /*else ID of previous A*/
    lag(decode(value, 'A', id, null))
      ignore nulls
      over(order by ts asc)
  ) as series
from a

ID | TS                           | VALUE | SERIES
-: | :--------------------------- | :---- | -----:
42 | 03-MAR-02 00.34.42.000000000 | A     |     42
53 | 03-MAR-02 00.35.46.000000000 | B     |     42
21 | 03-MAR-02 00.37.02.000000000 | A     |     21
33 | 03-MAR-02 03.12.56.000000000 | C     |     21
71 | 03-MAR-02 10.31.14.000000000 | B     |     21
16 | 03-MAR-02 17.19.21.000000000 | A     |     16

db<>fiddle here

Or similarly, but a bit more readable with match_recognize:

select *
from a
match_recognize (
  order by ts
  measures
    /*ID of the row with value = A
     for all the rows within a match group
    */
    max(a.id) as series
  all rows per match
  pattern (a b*)
  define
    a as value = 'A',
    b as lnnvl(value = 'A')
)

TS                           | SERIES | ID | VALUE
:--------------------------- | -----: | -: | :----
03-MAR-02 00.34.42.000000000 |     42 | 42 | A    
03-MAR-02 00.35.46.000000000 |     42 | 53 | B    
03-MAR-02 00.37.02.000000000 |     21 | 21 | A    
03-MAR-02 03.12.56.000000000 |     21 | 33 | C    
03-MAR-02 10.31.14.000000000 |     21 | 71 | B    
03-MAR-02 17.19.21.000000000 |     16 | 16 | A

db<>fiddle here