Suppose I have a table which stores data for students and the respected grades per classes. In most cases, there exists multiple rows per student in the table.
student_id math chemistry science
100 A B C <--------
100 B A D
100 D F C
200 B A C <--------
300 C D F <--------
300 A A A
400 F C B <--------
400 B A C
500 A B A <--------
I want to retrieve the first row as per the student_id as explained below.
Requested in postgreSQL:
student_id math chemistry science
100 A B C
200 B A C
300 C D F
400 F C B
500 A B A
CodePudding user response:
This can be achieved with the following query:
SELECT DISTINCT ON (student_id) student_id, math, chemistry, science
FROM
student
ORDER BY
student_id
This query will return just a single row per student_id. You should determine how you want to actually order it (such as an index or timestamp), otherwise you can't guarantee that you get the same output each time.
But as a basic solution, you can use this to just get a single row if you don't care about the actual order and only care about removing duplicates.
CodePudding user response:
At the point you decide on an ordering strategy (such as adding a column that records the date they took the exams and defining "first" as "most recent") you can get the "first" row with something like
SELECT *
FROM(
SELECT *, ROW_NUMBER() OVER(PARTITION BY student_id ORDER BY date_taken DESC) rn
FROM yourtable
) x
WHERE rn = 1
The ROW_NUMBER creates an incrementing counter 1,2,3,4.. that restarts from 1 for every different student id, and the rows are numbered in order of descending date (so most recent gets 1). By then requiring rn to be 1 we get the most recent row
You might decide to give the student their best marks, and maybe we could use the ascii value of the score (A is 65, B is 66 etc). If we add the scores up then the lowest total (ie order by ascending) is the best set of marks (BBB is better than AAF)
OVER(PARTITION BY student_id ORDER BY ASCII(math) ASCII(chemistry) ASCII(science))
CodePudding user response:
The basic solution is - using MIN/MAX function to aggregate values:
SELECT student_id, MAX(math) math, MAX(chemistry) chemistry, MAX(science) science
FROM
student
GROUP BY
student_id