I would like to count how many new unique users the database gets each day for all days recorded. There will not be any duplicate ids per day, but there will be duplicates over multiple days.
If my table looks like this :
ID | DATE
---------
1 | 2022-05-21
1 | 2022-05-22
2 | 2022-05-22
1 | 2022-05-23
2 | 2022-05-23
1 | 2022-05-24
2 | 2022-05-24
3 | 2022-05-24
I would like the results to look like this :
DATE | NEW UNIQUE IDs
---------------------------
2022-05-21 | 1
2022-05-22 | 1
2022-05-23 | 0
2022-05-24 | 1
A query such as :
SELECT `date` , COUNT( DISTINCT id)
FROM tbl
GROUP BY DATE( `date` )
Will return the count per day and will not take into account previous days.
Any assistance would be appreciated.
Edit : Using MySQL 8
CodePudding user response:
A user is new on a given date when that date is the earliest date recorded for that user.
So you need something like
SELECT date, COUNT(new_users.id)
FROM calendar
LEFT JOIN ( SELECT id, MIN(date) date
FROM test
GROUP BY id ) new_users USING (date)
GROUP BY date
calendar is either a static or a dynamically generated table that lists the needed dates. It can even be a SELECT DISTINCT date FROM test subquery.
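To check the idea against the sample rows from the question, here is a minimal sketch in Python. It rests on several assumptions: it uses the standard-library sqlite3 module as a stand-in for MySQL 8 so that it runs without a server, it takes the SELECT DISTINCT date FROM test subquery as the calendar, it writes the join with ON instead of USING to keep the column references explicit, and the function name new_ids_per_day is made up for illustration.

```python
import sqlite3

# Sample rows from the question; the table name `test` follows this answer.
ROWS = [
    (1, "2022-05-21"),
    (1, "2022-05-22"), (2, "2022-05-22"),
    (1, "2022-05-23"), (2, "2022-05-23"),
    (1, "2022-05-24"), (2, "2022-05-24"), (3, "2022-05-24"),
]

# calendar: every date present in the table.
# new_users: one row per id, carrying the earliest date for that id.
# COUNT(new_users.id) ignores the NULLs produced by unmatched calendar dates.
QUERY = """
SELECT calendar.date, COUNT(new_users.id) AS new_ids
FROM (SELECT DISTINCT date FROM test) AS calendar
LEFT JOIN (SELECT id, MIN(date) AS date
           FROM test
           GROUP BY id) AS new_users
       ON new_users.date = calendar.date
GROUP BY calendar.date
ORDER BY calendar.date
"""


def new_ids_per_day(rows):
    """Load rows into an in-memory table and return (date, new_ids) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER, date TEXT)")
    conn.executemany("INSERT INTO test VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for day, count in new_ids_per_day(ROWS):
        print(day, count)
```

With the sample rows this yields 1, 1, 0, 1 for 2022-05-21 through 2022-05-24. A date with no rows at all in the table would still be missing; covering such gaps needs a real calendar table.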
CodePudding user response:
Start with a subquery showing the earliest date where each id appears.
SELECT MIN(`date`) `firstdate`, id
FROM tbl
GROUP BY id
Then do your count on that subquery.
SELECT firstdate, COUNT(*)
FROM (
SELECT MIN(`date`) `firstdate`, id
FROM tbl
GROUP BY id
) m
GROUP BY firstdate
That gives you the number of new ids for each date on which at least one id first appears.
But it doesn't return rows for the dates where no new ids first appeared, such as 2022-05-23 in your example.
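The missing row can be seen by running the query on the sample data. The sketch below is in Python with the standard-library sqlite3 module, used here as an assumed stand-in for MySQL 8. It also shows one possible way to keep the zero rows, which is not part of this answer: join the per-id first dates back to tbl and sum the matches per date. The names FIRST_SEEN, WITH_ZERO_DAYS and run are hypothetical.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    (1, "2022-05-21"),
    (1, "2022-05-22"), (2, "2022-05-22"),
    (1, "2022-05-23"), (2, "2022-05-23"),
    (1, "2022-05-24"), (2, "2022-05-24"), (3, "2022-05-24"),
]

# The query from this answer: one row per date on which some id first appears.
FIRST_SEEN = """
SELECT firstdate, COUNT(*)
FROM (SELECT MIN(date) AS firstdate, id
      FROM tbl
      GROUP BY id) AS m
GROUP BY firstdate
ORDER BY firstdate
"""

# Assumed variant: every row of tbl is paired with the first date of its id,
# and a row contributes 1 only when it is that id's first appearance.
WITH_ZERO_DAYS = """
SELECT t.date,
       SUM(CASE WHEN t.date = m.firstdate THEN 1 ELSE 0 END) AS new_ids
FROM tbl AS t
JOIN (SELECT MIN(date) AS firstdate, id
      FROM tbl
      GROUP BY id) AS m
  ON m.id = t.id
GROUP BY t.date
ORDER BY t.date
"""


def run(query, rows):
    """Load rows into an in-memory tbl and return the query result as a list."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INTEGER, date TEXT)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(run(FIRST_SEEN, ROWS))      # three rows, 2022-05-23 is absent
    print(run(WITH_ZERO_DAYS, ROWS))  # four rows, 2022-05-23 has count 0
```

The variant only reports dates that occur somewhere in tbl, so a day with no rows at all is still left out.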
CodePudding user response:
Count only the rows where the left join finds no match, that is, where the id has no row on an earlier date:
SELECT
m1.`DATE` ,
sum(CASE WHEN m2.id is null THEN 1 ELSE 0 END) as C
FROM mytable m1
LEFT JOIN mytable m2 ON m2.`DATE`<m1.`DATE` AND m2.ID=m1.ID
GROUP BY m1.`DATE`
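One point worth spelling out: an id with several earlier dates is repeated once per earlier row by the join, but each of those copies has a non-NULL m2.id and adds 0, so the repetition does not inflate the sum. The sketch below illustrates this in Python, assuming the standard-library sqlite3 module as a stand-in for MySQL 8 and dates stored as ISO-formatted text so that < compares them correctly. The function names are hypothetical.

```python
import sqlite3

# Sample rows from the question; the table name `mytable` follows this answer.
ROWS = [
    (1, "2022-05-21"),
    (1, "2022-05-22"), (2, "2022-05-22"),
    (1, "2022-05-23"), (2, "2022-05-23"),
    (1, "2022-05-24"), (2, "2022-05-24"), (3, "2022-05-24"),
]

# The anti-join from this answer: a row counts as new when no earlier row
# exists for the same id, which shows up as m2.id being NULL.
ANTI_JOIN = """
SELECT m1.date,
       SUM(CASE WHEN m2.id IS NULL THEN 1 ELSE 0 END) AS c
FROM mytable AS m1
LEFT JOIN mytable AS m2
       ON m2.date < m1.date AND m2.id = m1.id
GROUP BY m1.date
ORDER BY m1.date
"""

# The same join without aggregation, restricted to one day, to expose
# which earlier dates each id was matched with.
PAIRS = """
SELECT m1.id, m2.date
FROM mytable AS m1
LEFT JOIN mytable AS m2
       ON m2.date < m1.date AND m2.id = m1.id
WHERE m1.date = ?
ORDER BY m1.id, m2.date
"""


def _connect(rows):
    """Create an in-memory database holding the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER, date TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    return conn


def new_ids_per_day(rows):
    """Return (date, count of ids first seen on that date) tuples."""
    conn = _connect(rows)
    result = conn.execute(ANTI_JOIN).fetchall()
    conn.close()
    return result


def earlier_matches(rows, day):
    """Return (id, earlier date or None) pairs the join produces for one day."""
    conn = _connect(rows)
    result = conn.execute(PAIRS, (day,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(new_ids_per_day(ROWS))
    print(earlier_matches(ROWS, "2022-05-24"))
```

For 2022-05-24 the unaggregated join returns six rows, and only the row for id 3 has no earlier date, so the count for that day is 1.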
CodePudding user response:
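-- select the same expression that is grouped on, as ONLY_FULL_GROUP_BY requires
SELECT DATE( a.`date` ) AS `date` , COUNT( DISTINCT a.id)
FROM tbl a
-- keep only the rows where the id appears for the first time
WHERE a.`date` = (SELECT MIN(b.`date`) FROM tbl b WHERE a.`id`=b.`id`)
GROUP BY DATE( a.`date` )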