I am trying to build an SQL query on an omekaS database to extract books with a specific title and date.
In this system, the table holding the values is called value
, and the relevant attributes are as so :
value
-id (pk)
-resource_id
-type (int)
-value
The resource_Id
is the identifier through the db
Value is one field; like the book "Moby Dick" written in 1822 (random date) would look like this in db. Type tells us what is it (arbitrarily, 1
is titles and 2
is publishing year)
Id | resource_id | type | value |
---|---|---|---|
1 | 1 | 1 | Moby dick |
2 | 1 | 2 | 1822 |
I need to take all books with a certain title, published in a set of years
Basically
Select all the columns from value
Where value is a title and of type title
And value is in a set of years
And with all these having the same resource_id
I get lost, and I have no idea how to do this.
What would be the best way?
Thanks
CodePudding user response:
If you need to check more types, you can check with AND EXISTS (SELECT 1 FROM ...WHERE type = 3 AND ...)
if the conditions are met
I renemd the table name, it is confusing to use cloumn names and table names with identical names.
CREATE TABLE books ( `Id` INTEGER, `resource_id` INTEGER, `type` INTEGER, `value` VARCHAR(20) ); INSERT INTO books (`Id`, `resource_id`, `type`, `value`) VALUES ('1', '1', '1', 'Moby Dick'), ('2', '1', '2', '1822');
SELECT `resource_id`, `value` FROM books WHERE `type` = 1 AND `resource_id` IN (SELECT `resource_id` FROM books WHERE `type`= 2 AND `value` IN ('1822','1984'))
resource_id | value ----------: | :-------- 1 | Moby Dick
db<>fiddle here
And if you want to have them in a row
SELECT b1.`resource_id`, b1.`value` , b2.`value` FROM books b1 INNER JOIN books b2 ON b1.`resource_id` = b2.`resource_id` and b2.`type`= 2 WHERE b1. `type` = 1 ANd b1.`resource_id` IN (SELECT `resource_id` FROM books WHERE `type`= 2 AND `value` IN ('1822','1984'))
resource_id | value | value ----------: | :-------- | :---- 1 | Moby Dick | 1822
db<>fiddle here
CodePudding user response:
should be a simple self-join. The where clause of one table has common ID to the second instance on the second filtering criteria.
select
v1.resource_id,
v1.value Title,
v2.value YearPublished
from
Value v1
Join Value v2
on v2.type = 2
AND v2.value in ( 1822, 1835, 1912 )
AND v1.resource_id = v2.resource_id
where
v1.type = 1
v1.value = 'Moby Dick'
I would have a composite index on your table on (type, value, resource_id) instead of individual indexes on each column
Now lets look at what this is doing. If you look at the very BASIC part of the query as it only pertains to the "v1" alias, it is looking for the type = 1 (representing the title), AND the value is the name of the book of interest. You probably get that easy enough.
So now all you are doing is adding an additional criteria for a year written/published (or whatever your other "types" of data indicate). This becomes the join to the same table AGAIN, but with a different alias. Now, for the relationship, just put THAT criteria there, but IN ADDITION, you are also making sure the resource_id is the same as in the first v1 resource found.
The benefit here, is your outer primary query will start the data looking FIRST for only those that qualify for the title which would be a somewhat small list. And then, only from that small list does it even care if there is a join condition to the second table by the same resource AND the additional query you are looking for (the list of dates, years, or whatever).
Now if you had some other "value" with a type of 3, but did not care what values were in that, you would just join to the value table again as ex: alias v3, but only do the join on the v3.type = 3 and v1.resource_id = v3.resource_id
By using the alias references, and the value column, just assign the final column alias name to what it is supposed to represent as sampled here.
If doing such a possible 3rd, 4th, or more to the same value table, you probably want an additional index on (resource_id, type, value). Yes, two indexes with same columns, but to optimize the 2nd, 3rd, 4th JOIN condition purposes.