Home > Net >  Can I get the full rows when using group by multiple columns?
Can I get the full rows when using group by multiple columns?

Time:04-25

If the date, item, and category are the same in the table, I'd like to treat it as the same row and return n rows out of them(ex: if n is 3, then limit 0, 3).

------------------------------------------
id   |  date   | item   | category   | ...
------------------------------------------
101  | 20220201| pencil | stationery | ...    <---
------------------------------------------        |  treat as same result
105  | 20220201| pencil | stationery | ...    <---  
------------------------------------------
120  | 20220214| desk   | furniture  | ...
------------------------------------------
125  | 20220219| tongs  | utensil    | ...    <---
------------------------------------------        |  treat as same
129  | 20220219| tongs  | utensil    | ...    <--- 
------------------------------------------
130  | 20220222| tongs  | utensil    | ...

expected results (if n is 3)

-----------------------------------------------
id   |  date   | item   | category   | ... rank
-----------------------------------------------
101  | 20220201| pencil | stationery | ...  1  
-----------------------------------------------       
105  | 20220201| pencil | stationery | ...  1  
-----------------------------------------------
120  | 20220214| desk   | furniture  | ...  2
-----------------------------------------------
125  | 20220219| tongs  | utensil    | ...  3
-----------------------------------------------
129  | 20220219| tongs  | utensil    | ...  3

The problem is that I have to bring the values of each group as well. If I have only one column to group by, I can compare id value with origin table, but I don't know what to do with multiple columns.

Is there any way to solve this problem?

For reference, I used a user variable to compare it with previous values, I couldn't use it because the duration was slow.

          SELECT 
            *,
            IF(@prev_date=date and @prev_item=item and @prev_category=category,@ranking, @ranking:=@ranking 1) AS sameRow,
            @prev_item:=item,
            @prev_date:= date,
            @prev_category:=category,
            @ranking
          FROM ( SELECT ...

I'm using Mysql 8.0 version and id value is not a continuous number because I have to order by before group by.

CodePudding user response:

if I understand correctly, you can try to use dense_rank window function and set order by with your expected columns

if date column can represent the order number I would put it first.

SELECT *
FROM (
    SELECT *,dense_rank() OVER(ORDER BY date, item, category) rnk
    FROM T 
) t1

SQLFIDDLE

CodePudding user response:

Window functions come in very handy in this situation. But for those of us still using MySQL 5.7, where functions such as row_number don't exist, we have to either resort to using a user variable and resetting the value every time before the main statement, or defining the user variable directly in the statement.
method 1

set @row_id=0; -- remember to reset the row_id to 0 every time before the main query below
select id,date,item,category,rank from testtb join 
    (
    select date,item,category, (@row_id:=@row_id 1) as rank
        from 
            (select date,item,category  from testtb group by date,item,category) t1
    ) t2 
using(date,item,category);

method 2

select id,date,item,category,rank from testtb join 
    (
    select date,item,category, (@row_id:=@row_id 1) as rank
        from 
            (select date,item,category  from testtb group by date,item,category) t1, (select @row_id := 0) as n
    ) t2 
using(date,item,category);
  • Related