I am trying to select all the columns which will be the same based on the grouping
test_table
------ ------- --------- ---------- ----------- --------------
| age | name | score | col1 | col2 | col3...col50 |
------ ------- --------- ---------- ----------- --------------
| 20 | joe | 10 | DING | DONG | col3...col50 |
------ ------- --------- ---------- ----------- --------------
| 20 | joe | 20 | DING | DONG | col3...col50 |
------ ------- --------- ---------- ----------- --------------
| 22 | sue | 25 | SING | SONG | col3...col50 |
------ ------- --------- ---------- ----------- --------------
| 22 | sue | 10 | SING | SONG | col3...col50 |
------ ------- --------- ---------- ----------- --------------
| 50 | bob | 25 | RING | WRONG | col3...col50 |
------ ------- --------- ---------- ----------- --------------
| 44 | joe | 15 | THING | THONG | col3...col50 |
------ ------- --------- ---------- ----------- --------------
The output that I am looking for would be:
------ ------- --------- ---------- ----------- --------------
| age | name |sum(score| col1 | col2 | col3...col50 |
------ ------- --------- ---------- ----------- --------------
| 20 | joe | 30 | DING | DONG | col3...col50 |
------ ------- --------- ---------- ----------- --------------
| 22 | sue | 35 | SING | SONG | col3...col50 |
------ ------- --------- ---------- ----------- --------------
| 50 | bob | 25 | RING | WRONG | col3...col50 |
------ ------- --------- ---------- ----------- --------------
| 44 | joe | 15 | THING | THONG | col3...col50 |
------ ------- --------- ---------- ----------- --------------
I know this isn't right, but my the general thought process is:
select
min(*),
sum(score)
from test_table
group by age, name
I want to avoid doing something like:
select
min(col1),
min(col2),
... cont ...,
min(col50),
sum(score)
from ...
CodePudding user response:
You can’t avoid listing all the columns individually. Also, if all the columns where you are using min have the same values for each combination of group by columns, then using min will be very inefficient - just list them in your select and group by clauses
CodePudding user response:
You can use DISTINCT ON
to get one row per group and join that with total scores calculated by a GROUP BY
query. With this approach there will be score
column containing value from some row in a group and a separate column for total score.
WITH total_scores AS (
SELECT age, name, SUM(score) AS total_score
FROM test_table
GROUP BY age, name
)
SELECT DISTINCT ON (tt.age, tt.name)
tt.*, ts.total_score
FROM test_table tt
JOIN total_scores ts ON tt.age = ts.age AND tt.name = ts.name
That said, it seems that you could normalize your data into two tables, one containing rows that have duplicate values (i.e. everything else except score
) and another table containing score
and a foreign key pointing to the first table.