Home > Software design >  Find occurrences of a record over the last 10 days in mysql
Find occurrences of a record over the last 10 days in mysql

Time:12-07

Consider the mysql table,

user_event| created_at
1          2021-12-03
2          2021-12-03
3          2021-12-03
3          2021-12-01
1          2021-11-28
2          2021-11-10
2          2021-11-01
1          2021-11-01

I am trying to find count of users grouped by created_at & count of repeat users if they had an event in the last 10 days

The expected result would be:

created_at | total_users | repeat_users
2021-12-03    3            2
2021-12-02    0            0
2021-12-01    1            0
.
.
2021-11-28    1            0
.
.
2021-11-10    1            1
2021-11-01    2            0

Basically, user_event 1,2,3 had activities on 2021-12-03, so total_users are 3. 2 of them had activity in the last 10 days so, repeat_users should be 2. Similarly, user_event 3 was present on 2021-12-01 but he has no activity in last 10 days, hence repeat_users would be 0 and so on.

CodePudding user response:

Given your table of data

CREATE TABLE MyData
(
    user_event INT,
    created_at DATE
);

INSERT INTO MyData
VALUES
(1,'2021-12-03'),
(2,'2021-12-03'),
(3,'2021-12-03'),
(3,'2021-12-01'),
(1,'2021-11-28'),
(2,'2021-11-10'),
(2,'2021-11-01'),
(1,'2021-11-01')

You can solve this using a calendar table which can be created as follows (you probably should populate with dates well into the future)

delimiter //

CREATE PROCEDURE CreateDateTable ()
BEGIN
    SET @StartDate = CAST('2021-01-01' AS DATE);
    SET @EndDate = CAST('2021-12-31' AS DATE);

    CREATE TABLE Calendar
    (
        DateValue DATE
    );

    WHILE @StartDate <= @EndDate DO
        INSERT INTO Calendar 
        (
            DateValue
        ) 
        VALUES 
        (
        @StartDate
        );
        
        SET @StartDate = DATE_ADD(@StartDate, INTERVAL 1 DAY);
    END WHILE;
END//

CALL CreateDateTable

DROP PROCEDURE CreateDateTable

Once you have the calendar table, you can select the dates you are interested in and use a combination of COUNT / GROUP BY to count the number of user events for the day and then join to a subquery which uses a self join to find the number of repeat events:

SELECT  c.DateValue,
        COUNT(d.user_event) AS total_users,
        IFNULL(r.NumRepeats,0) AS repeat_users
FROM    Calendar c
        LEFT JOIN MyData d
            ON c.DateValue = d.created_at
        LEFT JOIN ( 
                        /* repeats */
                        SELECT  a.created_at,
                                COUNT(a.user_event) AS NumRepeats
                        FROM    MyData a
                                LEFT JOIN MyData b
                                    ON b.created_at > DATE_ADD(a.created_at, INTERVAL -10 DAY) AND
                                        b.created_at < a.created_at AND
                                        a.user_event = b.user_event
                        WHERE   b.user_event IS NOT NULL
                        GROUP BY a.created_at
                    ) r
            ON r.created_at = c.DateValue
WHERE   c.DateValue >= '2021-11-01' AND
        c.DateValue <= '2021-12-03'
GROUP BY c.DateValue,
        r.NumRepeats
ORDER BY c.DateValue DESC
  • Related