duplication of rows in table-CodePudding

I have a table which has many rows which are same, except for the id column. How can I show only one row for other duplicate row?

id    name     roll_number
1      a         1         
2      b         2         
3      a         1         
4      b         2       
5      c         3         
6      d         4         
7      d         4

show output like this

id    name     roll_number
1      a         1         
2      b         2                   
5      c         3         
6      d         4

CodePudding user response：

We can use DISTINCT ON here:

SELECT DISTINCT ON (name) id, name, roll_number
FROM yourTable
ORDER BY name, id;

This query is selecting one record with the lowest id from each group of records having the same name.

CodePudding user response：

Simple aggregation using min

select Min(id), name,roll_number
from t
group by name, roll_number

CodePudding user response：

You could use the numpy.unique(filt, trim='fb') function:

>>> import numpy as np
>>> np.unique(array)

CodePudding user response：

This problem requires to "filter out" tuples during the projection based on groups. The solution is to use distinct on.

SELECT DISTINCT ON (name, roll_number) id, name, roll_number
FROM table
ORDER BY name, id;

it basically creates groups by the attributes within the "DISTINCT_ON" and non-deterministically chooses one tuple, which it outputs.