Home > Blockchain >  How to use aggregate functions with joins?
How to use aggregate functions with joins?

Time:02-27

I have an main dataset(users) as follows.

ID  Username    Status
1   John        Active
2   Mike        Active
3   Ann         Deactive
4   Leta        Active
5   Lena        Active
6   Lara        Active
7   Mitch       Active

Further I have revenue table as follows.

subuser           hour          Revenue
John_01         2/26/2022 5:00     5
Mike_01         2/26/2022 7:00     8
Mike_02         2/26/2022 7:00     22
Leta_03         2/26/2022 7:00     67
Leta_07         2/26/2022 9:00     56
Mitch_07        2/26/2022 11:00    34

Now I need to get a table as follows.

User    Total Usage
John         5
Mike         22
Leta         123
Lena         0
Lara         0
Mitch        0

Here I need to get the sum of all hours of each user substring and match with main user table.Further if same hour is for same substring I need to get the maximum revenue value and other values should be neglect for that particular hour.

Ex:

Mike_01         2/26/2022 7:00     8
Mike_02         2/26/2022 7:00     22

Here Mike_01 2/26/2022 7:00 8 should neglect.

So I tried as below.

SELECT
u.Username,
COALESCE(SUM(Revenue), 0) AS TOTAL USAGE
FROM users u
LEFT JOIN revenuetable e
ON SUBSTRING_INDEX(e.subuser, '_', 1) = u.Username AND
   e.Hour BETWEEN 'XXX' and 'XXX'
where u.Status='Active'    
GROUP BY
u.Username
order by u.ID.

But this didn't get the maximum value if same hour repeats. Can someone show me where I messed this?

update: Do we have any method other tan using window functions?

CodePudding user response:

If using MySQL that supports row_number() then join to a derived table that removes the unwanted rows.

SELECT
u.Username,
COALESCE(SUM(Revenue), 0) AS TOTAL USAGE
FROM users u
LEFT JOIN (
Select *
, row_number() OVER(partition by SUBSTRING_INDEX(e.subuser, '_', 1), hour order by revenue DESC) rn
From revenuetable ) e
ON SUBSTRING_INDEX(e.subuser, '_', 1) = u.Username AND rn = 1
   e.Hour BETWEEN 'XXX' and 'XXX'
where u.Status='Active'    
GROUP BY
u.Username
order by u.ID

Introducing this function and the over clause will give precedence to the highest revenue in each hour per user as the 'rn' column will be 1 for each such row.

  • Related