I have two tables 'A' and 'B'. Table A has a field that is comma-separated values as follows : Table A (fields : project_name, model_types)
|project_name | | model_types |
project_animals detection,segmentation,detection,classification
I have table B with some information related to each model type listed above.
Table B (fields : model,labels,image_types)
| model | labels | image_types |
detection | cat,dog | jpg,png
segmentation | rat,dog | jpg,tif
classification| cow,cat | bmp,png
I need to read the labels and image_types for each model type listed in table A with a comma separate string. (no need to find unique)
Using the following SQL script, I could get the model_types string
select model_types from A where project_name = 'project_animals'
This will return model_types = 'detection,segmentation,detection,classification'.
So instead of reading table B with each item separately splited (model_types.split(',') outside mySQL and read again, how could I do it once in mySQL script.
So I need the results as follows from a single mySQL statement :
Model_types | labels | image_types
detection cat,dog jpg,png
segmentation rot,dog jpg,tif
detection cat,dog jpg,png
classification cow,cat bmp,png
Is it even possible?
CodePudding user response:
Yes, it is possible, but you probably won't like the result. It's not possible to optimize searches on comma-separated strings (or any other substring matching, or regular expressions, etc.). So the query is bound to do a table-scan to find matches rows in table B.
SELECT tableB.*
FROM tableA
JOIN tableB ON FIND_IN_SET(tableB.model, tableA.model_types)
WHERE project_name = 'project_animals';
The FIND_IN_SET() expression can't use an index. Also it won't work if your comma-separated list contains spaces.
The proper way to store and query multi-valued attributes is to create another child table, and store multiple rows, with one model value per row.
For example, you could create a table project_models
:
project_name | model |
---|---|
project_animals | detection |
project_animals | segmentation |
project_animals | detection |
project_animals | classification |
Then join this way:
SELECT tableB.*
FROM project_models
JOIN tableB ON tableB.model = project_models.model
WHERE project_models.project_name = 'project_animals';
This can use an index on tableB.model
to optimize the join.
Besides this optimization problem, using a comma-separated list causes numerous other problems. See my answer to Is storing a delimited list in a database column really that bad?