Home > OS >  Oracle SQL - selecting only records that appear in each month of a year
Oracle SQL - selecting only records that appear in each month of a year

Time:02-04

I have a data set of monthly billing records, but I want to filter out newer accounts that don't have a 12 month history and I'm not really sure how to approach this.

I was thinking about something using row number = 12 but not sure how to do that at the level of detail for each account and not the entire dataset. Also saw a post here about having count(date) = 12, but I don't think I'm executing that correctly.

CodePudding user response:

You use the windowing capability of aggregate functions, specifying a PARTITION BY.

SELECT *
  FROM (SELECT x.*,
               COUNT(*) OVER (PARTITION BY account_no) cnt
          FROM monthly_billing_records x)
 WHERE cnt >= 12

What Oracle will do is generate that query block results, then make a second internal pass over those results and do a GROUP BY whatever your PARTITION BY column list is, compute the COUNT(*), and join that back to the results on the PARTITION BY columns. That way you can summarize multiple rows and report the result on each individual row. It can do a lot more complicated stuff than that, so read up on PARTITION BY, ROWS BETWEEN , etc..

CodePudding user response:

If you want the id for accounts with 12 months of data (the current month and the preceding 11 months) then you can GROUP BY the account's id and COUNT the DISTINCT dates within each month (WHERE the dates are within the expected time period) and then to only get those with a full 12 month history you can use a HAVING clause:

SELECT id,
       COUNT(DISTINCT TRUNC(dt, 'MM')) AS num_months
FROM   table_name
WHERE  dt <  ADD_MONTHS(TRUNC(SYSDATE, 'MM'),   1)
AND    dt >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -11)
GROUP BY id
HAVING COUNT(DISTINCT TRUNC(dt, 'MM')) = 12;

Which, for the sample data:

CREATE TABLE table_name (id, dt) AS
  -- Every month for the current month and past 11 months
  SELECT 1, ADD_MONTHS(TRUNC(SYSDATE, 'MM'), 1 - LEVEL) FROM DUAL CONNECT BY LEVEL <= 12
UNION ALL
  -- 12 months of historical data but not for the current month
  SELECT 2, ADD_MONTHS(TRUNC(SYSDATE, 'MM'), 0 - LEVEL) FROM DUAL CONNECT BY LEVEL <= 12
UNION ALL
  -- Only 4 months of data
  SELECT 3, ADD_MONTHS(TRUNC(SYSDATE, 'MM'), 1 - LEVEL) FROM DUAL CONNECT BY LEVEL <= 4;

Outputs:

ID NUM_MONTHS
1 12

fiddle

  • Related