Home > OS >  How to join results of different with clause
How to join results of different with clause

Time:08-25

I have the following data structure:

table user
user_id | year_month | fruit  
------------------------------
1       | 2021-01    | orange
1       | 2021-01    | apple
1       | 2021-01    | melon
1       | 2021-01    | orange
1       | 2021-02    | kiwi
1       | 2021-02    | orange
1       | 2021-02    | banana
1       | 2021-02    | melon
1       | 2021-02    | orange
1       | 2021-03    | melon
1       | 2021-03    | orange
1       | 2021-03    | melon
1       | 2021-03    | banana
1       | 2021-03    | apple

I want to generate this output

 user_id | year_month |            fruits            |  most_frequent
---------------------------------------------------------------------
1        | 2021-01    | orange, apple, melon         | orange
1        | 2021-02    | kiwi, orange, banana, melon  | orange
1        | 2021-03    | melon, orange, banana, apple | orange

It is basically a grouping of the fruits consumed by year_month. Notice that the last column it is the most consumed fruit of all the time and not only the month.

This is my code so far:

WITH user AS (

 -- logic to retrieve the results from 2 other tables and aggregate into 1 temporary table
 -- contains a function inside

), ranking AS (
    SELECT
        user_id,
        fruit,
        COUNT(*) AS most_frequent, 
        FROM
            user
        GROUP BY 
            1, 2
        ORDER BY 
             3 desc
        -- LIMIT 1
)

SELECT
  user.user_id,
  user.year_month,
  STRING_AGG(distinct user.fruit, ', ') as fruits,
FROM user
LEFT JOIN ranking 
ON user.user_id = ranking.user_id
WHERE user.user_id = 1
GROUP BY  1
        , 2
ORDER BY 2

and this is my output:

 user_id | year_month |            fruits            
-----------------------------------------------------
1        | 2021-01    | orange, apple, melon         
1        | 2021-02    | kiwi, orange, banana, melon  
1        | 2021-03    | melon, orange, banana, apple

I'm having problems with the most frequent column. My initial solution was to uncomment that limit 1 line in the WITH ranking AS and include ranking.fruit into the main select, but this way I got no data returned.

To be more confusing for me, if I run this ranking code outside the with clause - including with the limit line, I got the desired grouping I mentioned in the 'initial solution'. So the problem seems to be related to some join question.

CodePudding user response:

Consider below approach

select * from (
  select user_id, year_month, 
    string_agg(distinct fruit) as fruits
  from user
  group by  user_id, year_month
) join (
  select user_id, fruit
  from user
  group by user_id, fruit
  qualify 1 = row_number() over(partition by user_id order by count(*) desc)
)
using (user_id)    

if applied to sample data in your question - output is

enter image description here

  • Related