Home > Back-end >  In SQL Developer, how can I simulate nested for loops to extract a large volume of counts?
In SQL Developer, how can I simulate nested for loops to extract a large volume of counts?

Time:10-30

I am using Oracle SQL Developer and I need to pull counts from a very large database. I need to count both the number of visits and the number of unique visitors at each of 5 sites (and in total) for each of 10 quarters (and in total) - resulting in 120 counts. Each row in this database represents one visit, and the visitors each have a unique visitor_ID.

Currently, I have one line for each count, but I need to make modifications and I don't want to do it in such an inefficient way this time.

select
sum(case when visit_date between '01-JAN-2019 00:00:00' and '31-MAR-2019 00:00:00' and site=site1 then 1 else 0 end) as 19Q1_visits_site1,
count(distinct case when visit_date between '01-JAN-2019 00:00:00' and '31-MAR-2019 00:00:00' and site=site1 then visitor_id) as 19Q1_unique_site1,
[...]
from visitdata
where [additional qualifiers];

If possible, I would like to create something along the lines of:

19Q1 = visit_date between '01-JAN-2019 00:00:00' and '31-MAR-2019 00:00:00'
19Q2 = visit_date between '01-APR-2019 00:00:00' and '30-JUN-2019 00:00:00'
[...]
allQ = visit_date between '01-JAN-2019 00:00:00' and '30-SEP-2021 00:00:00'

S1 = site in (site1)
[...]
allS = site in (site1, site2, site3, site4, site5)

sites = [S1, S2, S3, S4, S5, allS]
quarters = [19Q1, 19Q2, ..., allQ]

for s in sites:
    for q in quarters:
        select
        sum(case when q and s then 1 else 0 end) as (str(q)   'visits'   str(s)),
        count(distinct case when q and s then visitor_id) as (str(q)   'unique'   str(s))
        from visitdata
        where [additional qualifiers];

I know SQL doesn't do for loops. Any advice would be fantastic so I don't have to create another embarrassing script with almost 200 lines. Thanks!

CodePudding user response:

Very briefly, use a built-in or user-defined function that takes a date and returns the quarter to which that date belongs. (SQL Server supports "quarter" as a datepart, but you can probably write this yourself if Oracle doesn't support it. You could also add visit_quarter to your table as a materialized computed column and even index it if you use this a lot.) Then you can write a single grouping query along the lines of

SELECT
  site,
  quarter(visit_date) as Q,
  COUNT(visitor_id) as numvisits,
  COUNT(DISTINCT visitor_id) AS numDistinctVisitors
FROM T
WHERE <additional conditions>
GROUP BY site, quarter(visit_date)
ORDER BY site, Q

CodePudding user response:

For finding quarters you can use TO_CHAR() on dates, and for changing the start date and end date of a quarter, you can use ADD_MONTHS() eg

Table

create table randomdates( date_ )
as
select sysdate - dbms_random.value( 1, 600 )
from dual 
connect by level <= 50 ;

Query

select date_
, to_char( date_, 'YYYY-Q' ) quarter
from randomdates 
;

-- result
DATE_       QUARTER
09/23/2021  2021-3
09/24/2020  2020-3
03/23/2020  2020-1
03/29/2021  2021-1
11/29/2020  2020-4
03/05/2021  2021-1
04/08/2021  2021-2
...

GROUP BY should also be possible - as @Steve Kass suggested.

select to_char( date_, 'YYYY-Q' ), count(*)
from randomdates 
group by to_char( date_, 'YYYY-Q' )
order by 1 desc
;


TO_CHAR(DATE_,'YYYY-Q') COUNT(*)
2021-4  6
2021-3  13
2021-2  22
2021-1  13
2020-4  20
2020-3  12
2020-2  10
2020-1  4
...

From your comment:

I'm using a modified fiscal year anyway. Do you know how I can define quarters when the visit_date is stored as yyyymmdd?

If you need different start/end days for the, ADD_MONTHS() could help eg the modified_quarters in the following query start a month later than the "standard" quarters. Regarding the dates: in Oracle, they are always stored as YYYYMMDDHHMISS, you can just use TO_CHAR() and pick up whichever component (of the date) you need via the "Format Model" eg 'Q' in the example below.

-- Query executed in APEX.  
-- Column date_ : no formatting (compare the output to the same query in the dbfiddle).
select date_ 
, to_char( date_, 'YYYY-Q' ) quarter
, to_char( date_, 'YY' ) || 'Q' || to_char( date_, 'Q' ) quarter_
, to_char( add_months( date_, 1 ), 'YYYY-Q' ) modified_quarter
from randomdates 
;

DATE_       QUARTER QUARTER_ MODIFIED_QUARTER
09/23/2021  2021-3  21Q3     2021-4
09/24/2020  2020-3  20Q3     2020-4
03/23/2020  2020-1  20Q1     2020-2
03/29/2021  2021-1  21Q1     2021-2
11/29/2020  2020-4  20Q4     2020-4
03/05/2021  2021-1  21Q1     2021-2

For calculating subtotals and total (counts) per site, you could use GROUP BY ROLLUP() eg

Table & data

-- Caution: dates in this table are not the same as in the randomdates table.
create table sitesanddates( site_, date_ )
as
select 
  trunc( dbms_random.value( 1, 6 ) )
, sysdate - dbms_random.value( 1, 600 )
from dual 
connect by level <= 50 ;

-- group by site and quarter
select site_, to_char( date_, 'YYYY-Q' ), count(*)
from sitesanddates 
group by site_, to_char( date_, 'YYYY-Q' )
order by 1, 2
;

SITE_   TO_CHAR(DATE_,'YYYY-Q') COUNT(*)
1       2020-1                  1
1       2020-4                  3
1       2021-1                  1
1       2021-2                  1
1       2021-3                  2
2       2020-1                  1
2       2020-2                  1

GROUP BY ROLLUP

select site_, to_char( date_, 'YYYY-Q' ) q_, count(*)
from sitesanddates 
group by rollup( site_, to_char( date_, 'YYYY-Q' ) )
order by 1, 2
;


SITE_   Q_  COUNT(*)
1   2020-1  1
1   2020-4  3
1   2021-1  1
1   2021-2  1
1   2021-3  2
1   -       8  -- <- subtotal for site 1
2   2020-1  1
2   2020-2  1
...
5   2020-4  2
5   2021-1  2
5   2021-2  1
5   2021-4  1 
5   -      10  -- <- subtotal for site 5
-   -      50  -- <- grand total

link to dbfiddle

  • Related