I have already tried Get records with max value for each group of grouped SQL results answers from here. But it takes only group by one column. I need several. So I am posting the question here.
I have an dataset(usertable) as follows,
ID Username Status
1 John Active
2 Mike Active
3 Ann Deactive
4 Leta Active
5 Lena Active
6 Lara Active
7 Mitch Active
Further I have a revenuetb table as follows which has substrings of each user.
subuser hour Revenue
John_01 2/26/2022 5:00 5
Mike_01 2/26/2022 7:00 8
Mike_01 2/26/2022 7:00 22
Leta_03 2/26/2022 7:00 67
Leta_07 2/26/2022 9:00 56
Mitch_07 2/26/2022 11:00 34
I need to match each subuser with main User(ex:John_01==John) in usertable and get the sum of revenue of each user as well. But if the Hour is repeated for each subuser I need to get the maximum Revenue of that each sub user.
As an example I need to get the revenue with 22 and should neglect 8.
Ex:
Mike_01 2/26/2022 7:00 8
Mike_01 2/26/2022 7:00 22
So final table should looks like below.
User Total_Usage
John 5
Mike 22
Leta 123
Lena 0
Lara 0
Mitch 34
Here is my try.
SELECT u.Username
, COALESCE(SUM(Revenue), 0) AS TOTAL_USAGE
FROM usertable u
LEFT JOIN revenuetb e
ON SUBSTRING_INDEX(e.subuser, '_', 1) = u.Username
AND e.Hour BETWEEN 'XXX' and 'XXX'
where u.STATUS='Active'
GROUP BY u.Username
order by u.ID
;
But this is not neglecting repeated Hours. Can someone explain how should I do this?
Limitation :cannot use window functions in my remote server.
CodePudding user response:
Just use a derived table to obtain the MAX(revenue)
per (user, hour)
groups.
Then SUM the results per user:
SELECT username, SUM(MAX_USAGE) AS TOTAL_USAGE
FROM (
SELECT u.Username, e.hour
, COALESCE(MAX(Revenue), 0) AS MAX_USAGE
FROM usertable u
LEFT JOIN revenuetb e
ON SUBSTRING_INDEX(e.subuser, '_', 1) = u.Username
AND e.Hour BETWEEN xxx and yyy
WHERE u.STATUS='Active'
GROUP BY u.Username, e.hour
) AS derived
GROUP BY Username
;
Result:
username | TOTAL_USAGE |
---|---|
John | 5 |
Lara | 0 |
Lena | 0 |
Leta | 123 |
Mike | 22 |
Mitch | 34 |
Slight cheat in the fiddle with regard to the date/times. But the basic logic is the same.
For your initial SQL, I used MariaDB to avoid your GROUP BY error report. MariaDB still doesn't recognize the functional dependence mistake, much like the older versions of MySQL (prior to 5.7).
This is allowed with MySQL prior to 5.7 as well.
The final solution works fine in all versions as it doesn't contain the functional dependence mistake.