First of all, this is a database I was given, and it is a pain to work with.
I have this table:
content_id | version | content_version_id | alias | publish_date | contentattribute_name | contentattribute_value |
---|---|---|---|---|---|---|
4507 | 4 | 10369 | /aktuelt/arkiv/ | 04/04/2014 | category_names | CAT1, CAT2 |
4348 | 3 | 10373 | /publikasjoner/ | 23/05/2011 | category_names | CAT1, CAT2 |
4348 | 4 | 10374 | /publikasjoner/ | 23/05/2011 | category_names | CAT1, CAT2 |
4439 | 3 | 10701 | /publikasjoner/ | 08/04/2015 | category_names | CAT1, CAT2 |
4439 | 4 | 10702 | /publikasjoner/ | 08/04/2015 | category_names | CAT1, CAT2 |
I am trying to get rid of duplicates from the column "content_id", and I've searched for answers on stackoverflow, but I couldn't get it to work with my SQL query.
Can anyone help me with the current query?
SELECT db_name.contentversion.ContentId as 'content_id',
db_name.contentversion.Version as 'version',
db_name.contentversion.ContentVersionId as 'content_version_id',
db_name.content.alias as 'alias',
db_name.content.PublishDate as 'publish_date',
db_name.contentattributes.Name as 'contentattribute_name',
db_name.contentattributes.Value as 'contentattribute_value'
FROM db_name.contentversion
JOIN db_name.content
ON db_name.content.ContentId = db_name.contentversion.ContentId
JOIN db_name.contentattributes
ON db_name.contentattributes.ContentVersionId = db_name.contentversion.ContentVersionId
EDIT
I think I managed to solve it by using this query:
SELECT DISTINCT a.ContentId, a.Version, a.ContentVersionId,
db_name.content.alias as 'alias',
db_name.content.PublishDate as 'publish_date',
db_name.contentattributes.Name as 'contentattribute_name',
db_name.contentattributes.Value as 'contentattribute_value'
FROM contentversion a
INNER JOIN (
SELECT ContentId, MAX(Version) Version
FROM contentversion
GROUP BY ContentId
) b ON a.ContentId = b.ContentId AND a.Version = b.Version
JOIN db_name.content
ON db_name.content.ContentId = a.ContentId
JOIN db_name.contentattributes
ON db_name.contentattributes.ContentVersionId = a.ContentVersionId
CodePudding user response:
It seems like content_id is not a primary key and it can hold in two different versions ie 3 or 4. You may want to get the version you are interested in so that you filter it using WHERE version= ?
or just take distinct content_id
if the version type doesn't matter
CodePudding user response:
You have to omit version and content_version_id columns from the query as it has different values for the same content_id. Then, remove duplicate using group by clause as following.
SELECT db_name.contentversion.ContentId as 'content_id',
db_name.content.alias as 'alias',
db_name.content.PublishDate as 'publish_date',
db_name.contentattributes.Name as 'contentattribute_name',
db_name.contentattributes.Value as 'contentattribute_value'
FROM db_name.contentversion
JOIN db_name.content
ON db_name.content.ContentId = db_name.contentversion.ContentId
GROUP BY
db_name.contentversion.ContentId,
db_name.content.alias,
db_name.content.PublishDate,
db_name.contentattributes.Name,
db_name.contentattributes.Value;
CodePudding user response:
You want the maximum version per content ID. Probably the best approach for this is a lateral join, available since MySQL 8.0.14.
SELECT
cv.contentid,
cv.version,
cv.contentversionid,
c.alias,
c.publishdate AS publish_date,
ca.name AS contentattribute_name,
ca.value AS contentattribute_value
FROM db_name.content c
CROSS JOIN LATERAL
(
SELECT *
FROM contentversion
WHERE contentversion.contentid = c.contentid
ORDER BY contentversion.version DESC
LIMIT 1
) cv
INNER JOIN db_name.contentattributes ca ON ca.contentversionid = cv.contentversionid
ORDER BY cv.contentid, ca.name;