Consider I have a table like this
PASSENGER CITY DATE
43 NEW YORK 1-Jan-21
44 CHICAGO 4-Jan-21
43 NEW YORK 2-Jan-21
43 NEW YORK 3-Jan-21
44 ROME 5-Jan-21
43 LONDON 4-Jan-21
44 CHICAGO 6-Jan-21
44 CHICAGO 7-Jan-21
How would I group Passenger and City column in sequence to get a result like below?
PASSENGER CITY COUNT
43 NEW YORK 3
44 CHICAGO 1
44 ROME 1
43 LONDON 1
44 CHICAGO 2
CodePudding user response:
From Oracle 12, you can use MATCH_RECOGNIZE
:
SELECT *
FROM table_name
MATCH_RECOGNIZE (
PARTITION BY passenger
ORDER BY "DATE"
MEASURES
FIRST(city) AS city,
COUNT(*) AS count
PATTERN (same_city )
DEFINE
same_city AS FIRST(city) = city
);
Which, for the sample data:
CREATE TABLE table_name (PASSENGER, CITY, "DATE") AS
SELECT 43, 'NEW YORK', DATE '2021-01-01' FROM DUAL UNION ALL
SELECT 44, 'CHICAGO', DATE '2021-01-04' FROM DUAL UNION ALL
SELECT 43, 'NEW YORK', DATE '2021-01-02' FROM DUAL UNION ALL
SELECT 43, 'NEW YORK', DATE '2021-01-03' FROM DUAL UNION ALL
SELECT 44, 'ROME', DATE '2021-01-05' FROM DUAL UNION ALL
SELECT 43, 'LONDON', DATE '2021-01-04' FROM DUAL UNION ALL
SELECT 44, 'CHICAGO', DATE '2021-01-06' FROM DUAL UNION ALL
SELECT 44, 'CHICAGO', DATE '2021-01-07' FROM DUAL
Outputs:
PASSENGER CITY COUNT 43 NEW YORK 3 43 LONDON 1 44 CHICAGO 1 44 ROME 1 44 CHICAGO 2
If you have ordered the input result set (note: tables should be considered to be unordered) and want to maintain the order then:
SELECT *
FROM (SELECT t.*, ROWNUM AS rn FROM table_name t)
MATCH_RECOGNIZE (
PARTITION BY passenger
ORDER BY RN
MEASURES
FIRST(rn) AS rn,
FIRST("DATE") AS "DATE",
FIRST(city) AS city,
COUNT(*) AS count
PATTERN (same_city )
DEFINE
same_city AS FIRST(city) = city
)
ORDER BY rn
Outputs:
PASSENGER RN DATE CITY COUNT 43 1 01-JAN-21 NEW YORK 3 44 2 04-JAN-21 CHICAGO 1 44 5 05-JAN-21 ROME 1 43 6 04-JAN-21 LONDON 1 44 7 06-JAN-21 CHICAGO 2
db<>fiddle here
CodePudding user response:
One way to deal with such a gaps-and-islands problem is to calculate a ranking for the gaps.
Then group also on that ranking.
SELECT PASSENGER, CITY
, COUNT(*) AS "Count"
-- , MIN("DATE") AS StartDate
-- , MAX("DATE") AS EndDate
FROM (
SELECT q1.*
, SUM(gap) OVER (PARTITION BY PASSENGER ORDER BY "DATE") as Rnk
FROM (
SELECT PASSENGER, CITY, "DATE"
, CASE
WHEN 1 = TRUNC("DATE")
- TRUNC(LAG("DATE")
OVER (PARTITION BY PASSENGER, CITY ORDER BY "DATE"))
THEN 0 ELSE 1 END as gap
FROM table_name t
) q1
) q2
GROUP BY PASSENGER, CITY, Rnk
ORDER BY MIN("DATE"), PASSENGER
PASSENGER | CITY | Count |
---|---|---|
43 | NEW YORK | 3 |
43 | LONDON | 1 |
44 | CHICAGO | 1 |
44 | ROME | 1 |
44 | CHICAGO | 2 |
db<>fiddle here