Home > OS >  Use DISTINCT for 2 columns
Use DISTINCT for 2 columns

Time:10-11

I have this Postgres SQL query in which I would like to add DISTINCT:

SELECT pairs, a.change_id, user_size, user_mile, b.change_short_name
FROM order_data a
FULL OUTER JOIN changes b
ON a.change_id = b.change_id
ORDER BY a.created_at ASC;

I tried this:

SELECT DISTINCT pairs, a.change_id, user_size, user_mile, b.change_short_name
FROM order_data a
FULL OUTER JOIN changes b
ON a.change_id = b.change_id
ORDER BY a.created_at ASC;

I get error: [42P10] ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list Position: 280

I need to use DISTINCT for pairs, a.change_id and biggest a.created_at.

What is the proper way to implement this?

CodePudding user response:

Do a GROUP BY instead of SELECT DISTINCT. Use MAX() in the ORDER BY clause:

SELECT pairs, a.change_id, user_size, user_mile, b.change_short_name
FROM order_data a
FULL OUTER JOIN changes b
  ON a.change_id = b.change_id
GROUP BY pairs, a.change_id, user_size, user_mile, b.change_short_name
ORDER BY MAX(a.created_at) ASC;

CodePudding user response:

Use DISTINCT ON. Pls. note the ORDER BY clause. Some good examples/discussion here.

SELECT DISTINCT ON (pairs, a.change_id)
    pairs, a.change_id, user_size, user_mile, b.change_short_name
FROM order_data a
FULL OUTER JOIN changes b ON a.change_id = b.change_id
ORDER BY pairs, a.change_id, a.created_at DESC;

CodePudding user response:

Maybe this:

SELECT pairs, change_id, user_size, user_mile, change_short_name
FROM  (
   SELECT DISINCT ON (pairs, change_id)
          pairs, change_id, user_size, user_mile, b.change_short_name, a.created_at
   FROM   order_data a
   FULL   JOIN changes b USING (change_id)
   ORDER  BY pairs, change_id, a.created_at
ORDER  BY created_at;

Note the USING clause which merges a.change_id and b.change_id to effectively COALESCE(a.change_id, b.change_id).

The outer SELECT is to eliminate created_at from the result while still sorting by it. It's not entirely clear from the question whether you want that. Your original query does it.

Details depend on undisclosed information: exact table definition, cardinalities, exact requirements. There may be much faster solutions for many duplicates.

See:

  • Related