I have this toy example which gives me sparse table of values separated in their different categories. I would want to have dense matrix, where all columns are individually ordered.
drop table if exists temp_table;
create temp table temp_table(
rowid int
, category text
, score int
);
insert into temp_table values (0, 'cat1', 10);
insert into temp_table values (1, 'cat2', 21);
insert into temp_table values (2, 'cat3', 32);
insert into temp_table values (3, 'cat2', 23);
insert into temp_table values (4, 'cat2', 24);
insert into temp_table values (5, 'cat3', 35);
insert into temp_table values (6, 'cat1', 16);
insert into temp_table values (7, 'cat1', 17);
insert into temp_table values (8, 'cat2', 28);
insert into temp_table values (9, 'cat2', 29);
Which gives this temporary table:
rowid | category | score |
---|---|---|
0 | cat1 | 10 |
1 | cat2 | 21 |
2 | cat3 | 32 |
3 | cat2 | 23 |
4 | cat2 | 24 |
5 | cat3 | 35 |
6 | cat1 | 16 |
7 | cat1 | 17 |
8 | cat2 | 28 |
9 | cat2 | 29 |
Then ordering score values to different columns based on their category:
select "cat1", "cat2", "cat3"
from crosstab(
$$ select rowid, category, score from temp_table $$ -- as source_sql
, $$ select distinct category from temp_table order by category $$ -- as category_sql
) as (rowid int, "cat1" int, "cat2" int, "cat3" int)
That outputs:
cat1 | cat2 | cat3 |
---|---|---|
10 | ||
21 | ||
32 | ||
23 | ||
24 | ||
35 | ||
16 | ||
17 | ||
28 | ||
29 |
But I would want the result of the query to be dense, like:
cat1 | cat2 | cat3 |
---|---|---|
10 | 21 | 32 |
16 | 23 | 35 |
17 | 24 | |
28 | ||
29 |
Maybe PostgreSQL's crosstab is not even right tool to do this, but that comes to my mind first as it produces that sparse table close to the result I would need.
CodePudding user response:
This should work for the exact given example data and expected output.
select max(cat1), max(cat2), max(cat3)
from crosstab(
$$ select rank() over(partition by category order by rowid) as ranking,
rowid,
category,
score
from temp_table
order by rowid, category asc$$ -- as source_sql
, $$ select distinct category
from temp_table
order by category $$ -- as category_sql
) as (ranking int, rowid int, "cat1" int, "cat2" int, "cat3" int)
group by ranking
order by ranking asc
You can test the solution here -
- In the external query, I am effectively picking the
max()
values of each category, for each of the rankings as obtained in the source SQL query.
CodePudding user response:
with cte as (
select category, score, row_number() over (
partition by category order by score
) as r
from temp_table
)
select
sum(score) filter (where category = 'cat1') as cat1,
sum(score) filter (where category = 'cat2') as cat2,
sum(score) filter (where category = 'cat3') as cat3
from cte
group by r
order by r
;
If the number of columns is known and it is reasonably small, FILTER
might be a better option than CROSSTAB
, which requires an extension.