Home > OS >  How to effective get the max data in sub group and get the min data in among big groups?
How to effective get the max data in sub group and get the min data in among big groups?

Time:07-18

Firstly, I hope to get the max date in each sub-groups.

Group A = action 1 & 2

Group B = action 3 & 4

actionName action actionBy actiontime
999 1 Tom 2022-07-15 09:18:00
999 1 Tom 2022-07-15 15:21:00
999 2 Peter 2022-07-15 14:06:00
999 2 Peter 2022-07-15 14:08:00
999 3 Sally 2022-07-15 14:20:00
999 3 Mary 2022-07-15 14:22:00
999 4 Mary 2022-07-15 14:25:00

In this example:

The max time of group A is "1 | Tom | 2022-07-15 15:21:00 "

The max time of group B is " 4 | Mary | 2022-07-15 14:25:00 "

The final answer is "1 | Tom | 2022-07-15 14:25:00 ", which is the minimum data among groups.

I have a method how to get the max date in each group like the following code.

with cte1
as (select actionName,
           actiontime,
           actionBy,
           row_number() over (partition by actionName order by actiontime desc) as rn
    from actionDetails
    where action in ( '1', '2' )
    UNION
    select actionName,
           actiontime,
           actionBy,
           row_number() over (partition by actionName order by actiontime desc) as rn
    from actionDetails
    where action in ( '3', '4' )
   )
select *
from cte1
where rn = 1

ActionName is not PK. It would get the max data in each group.

Then, I don't know how to use an effective way to get the minimum data between group A and group B. Would you give me some ideas?

I know one of the methods is self join again. However, I think that is not the best solution.

CodePudding user response:

First of all, you can simplify your query by putting the action groups into the partition clause. Use a case expression to get one group for actions 1 and 2 and another for actions 3 and 4.

Then after getting the maximum dates per actionname and action group you want to get the minimum dates of these per actionname. This means you want a second CTE building up on the first one:

with max_per_group as
(
  select top(1) with ties
    actionname,
    actiontime,
    actionby
  from actiondetails
  where action in (1, 2, 3, 4)
  order by row_number() 
             over (partition by actionname, case when action <= 2 then 1 else 2 end
                   order by actiontime desc)
)
, min_of_max as
(
  select top(1) with ties
    actionname,
    actiontime,
    actionby
  from max_per_group
  order by row_number() over (partition by actionname order by actiontime)
)
select actionname, actiontime, actionby
from min_of_max
order by actionname;

As you see, instead of computing a row number and then have to limit rows based on that in the next query, I limit the rows right away by putting the row numbering into the ORDER BY clause and applying TOP(1) WITH TIES to get all rows numbered 1. I like this a tad better, because the CTE already produces the rows that I want to work with rather than only marking them in a bigger data set. But that's personal preference I guess.

Discaimer:

In my query I assume that the column action is numeric. If the column is a string instead, because it can hold values that are not numbers, then work with strings:

where action in ('1', '2', '3', '4')

partition by actionname, case when action in ('1', '2') then 1 else 2 end

If on the other hand the column is a string, but there are only numbers in that column, fix your table instead.

  • Related