Home > OS >  Skip rows with a duplicate column value?
Skip rows with a duplicate column value?

Time:06-24

Say I'm returning a collection using findAll(), but a number of the rows have identical values in one column.

-----------------------------
| ID     name    placeId    |
-----------------------------
| 1      abc     123        |
| 2      def     123        | <- skip this (duplicate placeId)
| 3      ghi     456        |
| 4      jkl     789        |
| 5      mno     576        |
| 6      pqr     576        | <- skip this (duplicate placeId)
----------------------------

Is there a means in Sequelize to skip any rows (except the first) with duplicate column values?

Query's:

const items = await models.Item.findAll({
where: {
    [Op.and]: [
        sequelize.where(distance, { [Op.lte]: radius }),
        sequelize.where(descriptionFilter, { [Op.gte]: 2 }),
        {
            name: {
                [Op.notLike]: `%Example%`
            }
        }, {
            userId: {
                [Op.not]: me.id
            }
        },
    ]
},
order: distance,
limit: 25,
})

CodePudding user response:

This is a gaps and islands problem. It would be easiest to handle this using a raw Postgres query, so I suggest the following:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY ID) rn1,
              ROW_NUMBER() OVER (PARTITION BY placeId ORDER BY ID) rn2
    FROM yourTable
),
cte2 AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY placeId, rn1 - rn2
                                 ORDER BY ID) rn
    FROM cte
)

SELECT ID, name, placeId
FROM cte2
WHERE rn = 1
ORDER BY ID;

Demo

CodePudding user response:

sequelize supports aggregation function min(), so you can apply this on our code.

const items = await models.Item.findAll({
    attributes: [
        [sequelize.fn('MIN', sequelize.col('ID')), 'ID'],
    ],
    where: {
    [Op.and]: [
        sequelize.where(distance, { [Op.lte]: radius }),
        sequelize.where(descriptionFilter, { [Op.gte]: 2 }),
        {
            name: {
                [Op.notLike]: `%Example%`
            }
        }, {
            userId: {
                [Op.not]: me.id
            }
        },
        group: ['placeId'],
    ]
},
order: distance,
limit: 25,
})

CodePudding user response:

Postgresql can do a job

select distinct on (placeid)
 * from example_table order by placeid,id 

DB Fiddle

CodePudding user response:

Maybe not the best, but it solves the problem:

  SELECT T2.ID, T2.NAME, T2.PLACEID
  FROM
  (
    SELECT T1.*, RANK() OVER(PARTITION BY placeId ORDER BY ID) RNK
    FROM test_tbl T1
  ) T2
  WHERE T2.RNK = 1
  ORDER BY T2.ID

Fiddle

  • Related