I have this table:
Site_ID | Volume | RPT_Date | RPT_Hour |
---|---|---|---|
1 | 10 | 01/01/2021 | 1 |
1 | 7 | 01/01/2021 | 2 |
1 | 13 | 01/01/2021 | 3 |
1 | 11 | 01/16/2021 | 1 |
1 | 3 | 01/16/2021 | 2 |
1 | 5 | 01/16/2021 | 3 |
2 | 9 | 01/01/2021 | 1 |
2 | 24 | 01/01/2021 | 2 |
2 | 16 | 01/01/2021 | 3 |
2 | 18 | 01/16/2021 | 1 |
2 | 7 | 01/16/2021 | 2 |
2 | 1 | 01/16/2021 | 3 |
I need to select the RPT_Hour with the highest Volume for each set of dates
Needed Output:
Site_ID | Volume | RPT_Date | RPT_Hour |
---|---|---|---|
1 | 13 | 01/01/2021 | 1 |
1 | 11 | 01/16/2021 | 1 |
2 | 24 | 01/01/2021 | 2 |
2 | 18 | 01/16/2021 | 1 |
SELECT site_id, volume, rpt_date, rpt_hour
FROM (SELECT t.*,
ROW_NUMBER()
OVER (PARTITION BY site_id, rpt_date ORDER BY volume DESC) AS rn
FROM MyTable) t
WHERE rn = 1;
I cannot figure out how to group the table into like date groups. If I could do that, I think the rn = 1 will return the highest volume row for each date.
CodePudding user response:
The way I see it, your query is OK (but rpt_hour
in desired output is not).
SQL> with test (site_id, volume, rpt_date, rpt_hour) as
2 (select 1, 10, date '2021-01-01', 1 from dual union all
3 select 1, 7, date '2021-01-01', 2 from dual union all
4 select 1, 13, date '2021-01-01', 3 from dual union all
5 select 1, 11, date '2021-01-16', 1 from dual union all
6 select 1, 3, date '2021-01-16', 2 from dual union all
7 select 1, 5, date '2021-01-16', 3 from dual union all
8 --
9 select 2, 9, date '2021-01-01', 1 from dual union all
10 select 2, 24, date '2021-01-01', 3 from dual union all
11 select 2, 16, date '2021-01-01', 3 from dual union all
12 select 2, 18, date '2021-01-16', 1 from dual union all
13 select 2, 7, date '2021-01-16', 2 from dual union all
14 select 2, 1, date '2021-01-16', 3 from dual
15 ),
16 temp as
17 (select t.*,
18 row_number() over (partition by site_id, rpt_date order by volume desc) rn
19 from test t
20 )
21 select site_id, volume, rpt_date, rpt_hour
22 from temp
23 where rn = 1
24 /
SITE_ID VOLUME RPT_DATE RPT_HOUR
---------- ---------- ---------- ----------
1 13 01/01/2021 3
1 11 01/16/2021 1
2 24 01/01/2021 3
2 18 01/16/2021 1
SQL>
CodePudding user response:
One option would be using MAX(..) KEEP (DENSE_RANK ..) OVER (PARTITION BY ..)
analytic function without need of any subquery such as :
SELECT DISTINCT
site_id,
MAX(volume) KEEP (DENSE_RANK FIRST ORDER BY volume DESC) OVER
(PARTITION BY site_id, rpt_date) AS volume,
rpt_date,
MAX(rpt_hour) KEEP (DENSE_RANK FIRST ORDER BY volume DESC) OVER
(PARTITION BY site_id, rpt_date) AS rpt_hour
FROM t
GROUP BY site_id, rpt_date, volume, rpt_hour
ORDER BY site_id, rpt_date