I have 100000 products and 500 categories. every product is in 10 categories. products and categories are in many to many relation. i want to find 10 most related products by category.
for example product
P1 is in 1,2,3,4,5 categories.
P2 is in 4,5,6,7,8 categories.
P3 is in 2,3,4,5,6 categories.
the most related product to P1 is P3.because they are common in 4 categories. i need a query to get 10 most related product to a specific product.
is this query costly in performance?
CodePudding user response:
Query would be:
SELECT pother.id, count(*) as common
FROM product
JOIN category ON product.category = category.id
JOIN product pother ON pother.category = category.id
WHERE product.id = X
GROUP BY pother.id
ORDER BY common DEC LIMIT 10
Provided the product.category
is indexed and you've limited a product to 10 categories it might not be too bad. There should be a optimizer path for that.
Alternate subquery form
SELECT id, count(*) as common
FROM product
JOIN category
ON product.category = category.id AND category.id IN (select category FROM product WHERE product.id = X)
GROUP BY id
ORDER BY comm DESC LIMIT 10
Can you edit the query plan into this answer?