Home > Software engineering >  Find days in months and calculate the Active days in a Month - Oracle
Find days in months and calculate the Active days in a Month - Oracle

Time:04-19

I have a table Activity_Calc.

  CREATE TABLE Activity_Calc ( 
  person_id   INTEGER NOT NULL PRIMARY KEY, 
  Activity_start_date DATE, 
  Activity_End_Date DATE
);
INSERT INTO Activity_Calc 
VALUES (123,TO_DATE( '05-23-2020', 'MM-DD-YYYY' ), TO_DATE( '11-19-2020', 'MM-DD-YYYY' ) );

INSERT INTO Activity_Calc 
VALUES (890,TO_DATE( '01-06-2021', 'MM-DD-YYYY' ), TO_DATE( '07-05-2021', 'MM-DD-YYYY' ) );

INSERT INTO Activity_Calc 
VALUES (231,TO_DATE( '02-14-2020', 'MM-DD-YYYY' ), TO_DATE( '08-12-2020', 'MM-DD-YYYY' ) );

The difference between the Activity_start_date and the Activity_End_Date is always 180 days. I need to create a table like below:

enter image description here

Here T1 means the most recent month. For example, for person_id 123, T1 is November, and he was active for 19 days in November. So for person_Id 123, T1 takes the value 19. He was active for whole of October which is my T2 here for person_id 123. So T2 takes 31(Total number of days in October). So on and so forth. And the last month, going backward, starting from November, is May in which he was active for 8 days. Hence T7 takes 8. It might happen for few customers, T7 won't even appear. Only T1 to T6 will suffice. In that case T7 takes 0.

https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=f128f82f5ffbe53a76f93c23e6f71876

CodePudding user response:

DBFiddle: https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=571b019196a91dc556b880302c7a2576

with gen as (
    select level-1 as n 
    from dual 
    connect by
        level<=1 (select max(Activity_End_Date-Activity_start_date) from Activity_Calc)
    )
select 
   person_id
  ,'T'||
    dense_rank()
      over(partition by person_id order by trunc(ac.Activity_start_date v.n,'mm') desc) x
  ,count(*) no_of_Active_days
from Activity_Calc ac
    join gen v
        on ac.Activity_start_date v.n<=ac.Activity_End_Date
group by person_id, trunc(ac.Activity_start_date v.n,'mm')
order by 1,2;

NB: Looks like your T7 is not correct or you haven't explained in details why you don't count all days of the first month.

CodePudding user response:

You can generate 12 numbers using CONNECT BY Clause and then treat them as months. Then join your table back and generate the desired rows -

WITH MONTHS AS(SELECT level months
                 FROM DUAL
              CONNECT BY LEVEL <= 12)
SELECT A.person_id,
       'T' || ROW_NUMBER() OVER(PARTITION BY person_id ORDER BY M.months) Period,
        CASE WHEN M.months = EXTRACT(MONTH FROM A.Activity_start_date)
                  THEN LAST_DAY(Activity_start_date) - Activity_start_date
             WHEN M.months = EXTRACT(MONTH FROM A.Activity_End_Date)
                  THEN EXTRACT(DAY FROM Activity_End_Date)
             ELSE EXTRACT(DAY FROM LAST_DAY(TO_DATE(M.months, 'MM'))) 
        END no_of_active_days, Activity_start_date
  FROM MONTHS M
  JOIN Activity_Calc A ON M.months BETWEEN EXTRACT(MONTH FROM A.Activity_start_date) 
                                       AND EXTRACT(MONTH FROM A.Activity_End_Date)
 ORDER BY person_id, M.months DESC;

Demo.

CodePudding user response:

You can use a recursive query to iterate over the period and find the start and end for each month and then calculate the days:

WITH months (person_id, month_id, activity_start_date, start_date, end_date) AS (
  SELECT person_id,
         1,
         CAST(activity_start_date   1 AS DATE),
         GREATEST(activity_start_date   1, TRUNC(activity_end_date, 'MM')),
         activity_end_date
  FROM   activity_calc
UNION ALL
  SELECT person_id,
         month_id   1,
         activity_start_date,
         GREATEST(activity_start_date, TRUNC(start_date - 1, 'MM')),
         start_date - 1
  FROM   months
  WHERE  start_date > activity_start_date
)
SEARCH DEPTH FIRST BY person_id SET order_id
SELECT person_id,
       'T' || month_id AS period,
       TO_CHAR(start_date, 'YYYY-MM') AS month,
       end_date - start_date   1 AS no_of_active_days
FROM   months

Which, for the sample data:

CREATE TABLE Activity_Calc ( 
  person_id           NOT NULL PRIMARY KEY, 
  Activity_start_date , 
  Activity_End_Date
) AS
SELECT 890, DATE '2021-01-06', DATE '2021-07-05' FROM DUAL UNION ALL
SELECT 123, DATE '2020-05-23', DATE '2020-11-19' FROM DUAL UNION ALL
SELECT 231, DATE '2020-02-14', DATE '2020-08-12' FROM DUAL UNION ALL
SELECT 321, DATE '2021-12-10', DATE '2021-12-10'   INTERVAL '180' DAY(3) FROM DUAL;

Outputs:

PERSON_ID PERIOD MONTH NO_OF_ACTIVE_DAYS
123 T1 2020-11 19
123 T2 2020-10 31
123 T3 2020-09 30
123 T4 2020-08 31
123 T5 2020-07 31
123 T6 2020-06 30
123 T7 2020-05 8
231 T1 2020-08 12
231 T2 2020-07 31
231 T3 2020-06 30
231 T4 2020-05 31
231 T5 2020-04 30
231 T6 2020-03 31
231 T7 2020-02 15
321 T1 2022-06 8
321 T2 2022-05 31
321 T3 2022-04 30
321 T4 2022-03 31
321 T5 2022-02 28
321 T6 2022-01 31
321 T7 2021-12 21
890 T1 2021-07 5
890 T2 2021-06 30
890 T3 2021-05 31
890 T4 2021-04 30
890 T5 2021-03 31
890 T6 2021-02 28
890 T7 2021-01 25

db<>fiddle here

CodePudding user response:

You can solve this problem in closed form - a little bit of date arithmetic (pencil on paper) can save a lot of computations in code.

Note my interpretation of the problem (detailed in Comments under your question): The periods you show in your sample data are 181 days if you include both the start and the end date. The much more common convention is that the last date is excluded: A "date" of 19-Nov-2020 means midnight at the beginning of the day, so the person was NOT active on that day. The count of active days in November should be 18, not 19. Also with this understanding, if the end date is say April 1, then the person becomes inactive at midnight at the end of March 31 - therefore the "last month" of activity for that person must be March, not April.

If "180 days" is correct but you must include the end date, then you must exclude the start date - which doesn't make sense. Or your periods are 181 days, not 180, if you must include both the start and the end date in the count. All these different conventions will lead to different results, but the query can be adapted for any of them.

So, anyway - here is the query that will work for my interpretation. To make it easier to follow the output, I included the months corresponding to your "periods" too; you can remove that column from select if needed.

with   g (lvl) as (select level from dual connect by level <= 7)
select person_id, 'T' || lvl as period,
       to_char(add_months(trunc(activity_end_date - 1, 'mm'), 1 - lvl),
                  'Mon-yyyy') as mth,          --  added for debugging
       greatest(0, 
           least   (activity_end_date, 
                    add_months(trunc(activity_end_date - 1, 'mm'), 2 - lvl)
                   )
         - greatest(add_months(trunc(activity_end_date - 1, 'mm'), 1 - lvl),
                    activity_start_date
                   )
       )   as no_of_active_days
from   activity_calc cross join g
order  by person_id, lvl
;

If your data is of meaningful size, you may want to try the different solutions and compare speed of execution.

  • Related