I have a SQL Server database that stores recipes and in this database there is a view that returns
ingredient_name, ingredient_plural_name, recipe_id, recipe_name, recipe_description, tag_name
(among other information not pertinent to this question).
The way the database is set up, this view will return a new row for each combination of ingredient and tag in the recipe (my test entry has 7 ingredients and 2 tags, so it returns 14 rows for this recipe) like this:
ingredient_name | ingredient_plural_name | tag_name | etc. |
---|---|---|---|
all purpose flour | NULL | dessert | The rest of the data is the same in all records |
all purpose flour | NULL | simple | |
baking powder | NULL | dessert | |
baking powder | NULL | simple | |
egg | eggs | dessert | |
egg | eggs | simple | |
milk | NULL | dessert | |
milk | NULL | simple | |
unsalted butter | NULL | dessert | |
unsalted butter | NULL | simple | |
vanilla extract | NULL | dessert | |
vanilla extract | NULL | simple | |
white sugar | NULL | dessert | |
white sugar | NULL | simple |
I want to write a stored procedure that returns the recipe_name
, recipe_description
, and recipe_id
only if the tags and/or name/plural name contain all values provided as a parameter input.
The recipe can have more tags and ingredients than the ones specified, but it must contain all of the ones specified. For example, it should return this sample recipe if I do
EXEC search_include @includeTags= 'milk, eggs'
but NOT if I do
EXEC search_include @includeTags= 'milk, eggs, cheese'
Am I missing something, or is this going to be a bit convoluted?
I currently have this code:
PROCEDURE [dbo].[search_include]
(@includeTags varchar(max))
AS
BEGIN
SELECT DISTINCT
recipe_name, recipe_description, recipe_id
FROM
recipe_view
WHERE
tag_name IN (SELECT VALUE FROM dbo.split(@includeTags))
OR ingredient_name IN (SELECT VALUE FROM dbo.split(@includeTags))
OR ingredient_plural_name IN (SELECT VALUE FROM dbo.split(@includeTags))
END
And it works to only show results where tags and ingredients are included in the list, but I need it to only return results of recipes that match ALL of the inputs.
I have also tried this approach:
BEGIN
CREATE TABLE #tags (tag varchar(100))
INSERT INTO #tags
SELECT VALUE FROM dbo.split(@includeTags)
SELECT recipe_id
FROM recipe_view
WHERE tag_name IN (SELECT tag FROM #tags)
OR ingredient_name IN (SELECT tag FROM #tags)
OR ingredient_plural_name IN (SELECT tag FROM #tags)
GROUP BY recipe_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM #tags)
DROP TABLE #tags
END
From some similar (but not quite the same) questions from this site. This one gives me no results whatsoever, no matter what I type for the parameters.
The dbo.split
function just takes a comma separated list as a varchar
string and returns a pseudo table of each item in a new row in a "VALUE" column.
I'm not great with SQL by any means, and I'm having some difficulty wrapping my head around getting this right.
CodePudding user response:
I think the issue is with the view that creates multiple rows for the same ingredient. For example, if you had the two tags and just did a SELECT with the above e.g.,
SELECT *
FROM #recipe_view
WHERE tag_name IN (SELECT tag FROM #tags)
OR ingredient_name IN (SELECT tag FROM #tags)
OR ingredient_plural_name IN (SELECT tag FROM #tags);
It will show 4 rows
ingredient_name ingredient_plural_name tag_name recipe_id
egg eggs dessert cake
egg eggs simple cake
milk null dessert cake
milk null simple cake
Therefore COUNT(*)
will return 4 - and not match the HAVING clause (which is expecting a count of 2).
I suggest
- Instead of having the 'recipe_view' look like that, instead you combine all relevant ingredients/tags etc into a single column
- Do a matching similar to your answer above.
Here is a db<>fiddle that starts with the recipe_view, converts it to a single column to check (recipe_tag
) then does the check.
I expect you have a better way of doing this (given that the view is already selecting data from elsewhere) but the 'view' I use (actually a temporary table called recipe_tags
) has the following data/structure - with all the tags/etc relevant to the recipe.
recipe_id recipe_tag
cake all purpose flour
cake baking powder
cake dessert
cake egg
cake eggs
cake milk
cake simple
cake unsalted butter
cake vanilla extract
cake white sugar
Then you can do a similar approach as you have above against this table
SELECT recipe_id
FROM #recipe_tags
INNER JOIN #tags ON #recipe_tags.recipe_tag = #tags.tag
GROUP BY recipe_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM #tags);
This has the expected output - the recipe_id (in this case 'cake').
CodePudding user response:
What you want is a query that: Given a distinct list of recipes, select each recipe where all tags are present in the ingredients for that recipe.
The "tags are present in the ingredients" is logically equivalent to "where none of the tags are missing from the ingredients". This can be achieved with a nested WHERE NOT EXISTS()
condition.
Something like:
SELECT R.*
FROM (
SELECT DISTINCT recipe_name, recipe_description, recipe_id
FROM recipe_view R
) R
WHERE NOT EXISTS(
SELECT *
FROM #tags T
WHERE NOT EXISTS(
SELECT *
FROM recipe_view I
WHERE I.recipe_id = R.recipe_id
AND (
T.tag = I.tag_name
OR T.tag = I.ingredient_name
OR T.tag = I.ingredient_plural_name
)
)
)
See this db<>fiddle.
Using the view repeatedly in this query can be inefficient. If you hae access to the underlying tables, it may be wise to rewrite the above to access those tables directly.
CodePudding user response:
One option to achieve this functionality is to union all of the three columns that are required to match with, then use the distinct count to check if the three columns have all of the tag values.
CREATE PROCEDURE search_include
@includeTags varchar(128)
AS
BEGIN
CREATE TABLE #tags (tag varchar(100));
INSERT INTO #tags
SELECT VALUE FROM STRING_SPLIT(REPLACE(@includeTags, ', ', ','), ',');
with rv as
(
SELECT recipe_id, ingredient_name as attribute_value
FROM recipe_view WHERE ingredient_name IN (SELECT tag FROM #tags)
UNION ALL
SELECT recipe_id, ingredient_plural_name
FROM recipe_view WHERE ingredient_plural_name IN (SELECT tag FROM #tags)
UNION ALL
SELECT recipe_id, tag_name
FROM recipe_view WHERE tag_name IN (SELECT tag FROM #tags)
)
SELECT recipe_id
FROM rv
GROUP BY recipe_id
HAVING COUNT(distinct attribute_value) = (SELECT COUNT(*) FROM #tags);
DROP TABLE #tags;
END
The use of REPLACE('milk, dessert', ', ',',')
inside the split function is to remove the space that may exist in the parameter value, i.e.'milk, eggs' -> 'milk,eggs'.