Home > Software design >  MySQL MEMBER OF ultra slow; alternatives for using JSON array values in queries?
MySQL MEMBER OF ultra slow; alternatives for using JSON array values in queries?

Time:11-12

We have a system take takes a JSON array as a parameter to define matching records. To keep things simple, we have a query:

SELECT ai.ID 
FROM association_internal ai 
WHERE ai.source_object_record_id MEMBER OF('[57928,57927]');

Now the above call takes 380ms, which is ridiculous. Why? Because the below version of the exact same call takes 11ms.

SELECT ai.ID 
FROM association_internal ai 
WHERE ai.source_object_record_id IN(57928,57927);

Just by using IN vs. MEMBER OF, it works fine. MEMBER OF is so slow, it's (honestly) unusable in an enterprise setting. The speed degrades the more numbers we have (and we can have 100 in the above example).

Obviously there's some crazy slow processing of MEMBER OF (perhaps it's even avoiding indexing?) but we cannot pass a JSON array as a property of IN. Now I COULD write dynamic SQL and execute that manually, but that's ugly and sloppy.

So the question is this - is there a way to pass a JSON array for use in a hardcoded query that doesn't use MEMBER OF? Something that'll be just as fast as hardcoding the IN values? Or a way to convert a JSON array of values into a useable IN or EXISTS use case?

CodePudding user response:

I can't believe it. Using JSON_TABLE only takes 12ms to convert the array. So I can front-load any array as a CTE and use IN and it's lightning fast.

For clarity, the fix is:

WITH frontload_array AS
(
  SELECT jt.ID 
  FROM JSON_TABLE('[1,2,3,4,5]', '$[*]' COLUMNS(ID int PATH '$')) AS jt
)
SELECT ai.ID 
FROM association_internal ai 
WHERE ai.source_object_record_id IN(SELECT ID FROM frontload_array);

CodePudding user response:

The right way to optimize this query is to use IN() with a discrete value for each element in your array. You can't optimize a comparison against a comma-separated string.

SELECT ai.ID 
FROM association_internal ai 
WHERE ai.source_object_record_id IN(?, ?, ?, ...);

Make as many ? placeholders as you have elements in your array, and bind each element as a scalar to each parameter.

This is not ugly or sloppy, this is the proper way to use SQL. When you write functions in Java or other language, do you pass arguments as a comma-separated string and expect the function to split that string? No — you would pass a separate value per argument.

CodePudding user response:

JSON is a string; it has to be parsed to see what is in it. And, in most cases, the query must look at every row; no shortcuts.

An RDBMS is a structure that is optimized to fetch things. This is especially true when INDEXes are used.

Indexes into JSON are minimal, and you have to work to get them -- such as with "generated" columns.

Use JSON for stuff that you won't need in WHERE, GROUP BY or ORDER BY.

  • Related