Is it possible to consolidate records from a SELECT statement that have duplicate column values with-CodePudding

Consider that I have the following tables/entities:

Posts

id	user_id	title	content
1	1	Article 1	Lorem ipsum
2	1	Article 1	Lorem ipsum
3	1	Article 2	Lorem ipsum 2
4	2	Article 3	Lorem ipsum

Users

id	name
1	John Doe
2	Timothy Fisher

Notice that there are two posts from the user with an ID of 1 that have the same title and content. There was an error at the application level that allowed a user to submit a post twice in the past, leading to "duplicate" records.

I'm looking to query all of the posts, but consolidate the posts for each user that have duplicate titles and content.

The ideal result set would look like this:

post_id	author_name	title	content
1	John Doe	Article 1	Lorem ipsum
3	John Doe	Article 2	Lorem ipsum 2
4	Timothy Fisher	Article 3	Lorem ipsum

SELECT
  posts.id as post_id,
  users.name as author_name
  posts.title,
  posts.content
FROM
  posts
INNER JOIN
  users
ON
  posts.user_id = users.id;

Whether or not the query pulled post 1 or 2 for John Doe wouldn't matter. In the actual databases, I have timestamps so I'd likely just pull the latest one.

Is this possible with SQL?

CodePudding user response：

you could use a fake aggregation function and group by

SELECT 
  min(posts.id) as post_id,
  users.name as author_name
  posts.title,
  posts.content
FROM  posts
INNER JOIN  users  ON   posts.user_id = users.id
GROUP BY   users.name, posts.title, posts.content

CodePudding user response：

In MySQL you can also do it without aggregation. Yet I'm not sure if it's a better way (more info here).

SELECT
  posts.id as post_id,
  users.name as author_name,
  posts.title,
  posts.content
FROM
  posts
INNER JOIN
  users
ON
  posts.user_id = users.id
GROUP BY
  users.name, posts.title, posts.content;