When I am running a query on MySQL database, it is taking around 3 sec. When we execute the performance testing for 50 concurrent users, then the same query is taking 120 sec.
The query joins multiple tables with an order by clause and a limit condition.
We are using RDS instance (16 GB memory, 4 vCPU).
Can any one suggest how to improve the performance in this case?
Query:
SELECT
person0_.person_id AS person_i1_131_,
person0_.uuid AS uuid2_131_,
person0_.gender AS gender3_131_
CASE
WHEN
EXISTS( SELECT * FROM patient p WHERE p.patient_id = person0_.person_id)
THEN 1
ELSE 0
END AS formula1_,
CASE
WHEN person0_1_.patient_id IS NOT NULL THEN 1
WHEN person0_.person_id IS NOT NULL THEN 0
END AS clazz_
FROM
person person0_
LEFT OUTER JOIN
patient person0_1_ ON person0_.person_id = person0_1_.patient_id
INNER JOIN
person_attribute attributes1_ ON person0_.person_id = attributes1_.person_id
CROSS JOIN
person_attribute_type personattr2_
WHERE
attributes1_.person_attribute_type_id = personattr2_.person_attribute_type_id
AND personattr2_.name = 'PersonImageAttribute'
AND (person0_.person_id IN (SELECT
person3_.person_id
FROM
person person3_
INNER JOIN
person_attribute attributes4_ ON person3_.person_id = attributes4_.person_id
CROSS JOIN
person_attribute_type personattr5_
WHERE
attributes4_.person_attribute_type_id = personattr5_.person_attribute_type_id
AND personattr5_.name = 'LocationAttribute'
AND (attributes4_.value IN ('d31fe20e-6736-42ff-a3ed-b3e622e80842'))))
ORDER BY person0_1_.date_changed , person0_1_.patient_id
LIMIT 25
CodePudding user response:
There appears to be some redundant query components, and what does not appear to be a proper context of CROSSS-JOIN when you have relation on specific patient and/or attribute info.
Your query getting the "clazz_" is based on a patient_id NOT NULL, but then again a person_id not null. Under what condition, would the person_id coming from the person table EVER be null. That sounds like a KEY ID and would NEVER be null, so why test for that. It seems like that is a duplicate field and in-essence is just the condition of a person actually being a patient vs not.
This query SHOULD get the same results otherwise and suggest the following specific indexes are available including
table index
person ( person_id )
person_attribute ( person_id, person_attribute_type_id )
person_attribute_type ( person_attribute_type_id, name )
patient ( patient_id )
select
p1.person_id AS person_i1_131_,
p1.uuid AS uuid2_131_,
p1.gender AS gender3_131_,
CASE WHEN p2.patient_id IS NULL
then 0 else 1 end formula1_,
-- appears to be a redunant result, just trying to qualify
-- some specific column value for later calculations.
CASE WHEN p2.patient_id IS NULL
THEN 0 else 1 end clazz_
from
-- pre-get only those people based on the P4 attribute in question
-- and attribute type of location. Get small list vs everything else
( SELECT distinct
pa.person_id
FROM
person_attribute pa
JOIN person_attribute_type pat
on pa.person_attribute_type_id = pat.person_attribute_type_id
AND pat.name = 'LocationAttribute'
WHERE
pa.value = 'd31fe20e-6736-42ff-a3ed-b3e622e80842' ) PQ
join person p1
on PQ.person_id = p1.person_id
LEFT JOIN patient p2
ON p1.person_id = p2.patient_id
JOIN person_attribute pa1
ON p1.person_id = pa1.person_id
JOIN person_attribute_type pat1
on pa1.person_attribute_type_id = pat1.person_attribute_type_id
AND pat1.name = 'PersonImageAttribute'
order by
p2.date_changed,
p2.patient_id
LIMIT
25
Finally, your query does an order by the date_changed and patient id which is based on the PATIENT table data having been changed. If that table is a left-join, you may have a bunch of PERSON records that are not patients and thus may not get the expected records you really intent. So, just some personal review of what is presented in the question.
CodePudding user response:
Speeding up the query is the best hope for handling more connections.
A simplification (but no speed difference), since TRUE=1 and FALSE=0:
CASE WHERE (boolean_expression) THEN 1 ELSE 0 END
-->
(boolean_expression)
Index suggestions:
person: INDEX(patient_id, date_changed)
person_attribute: INDEX(person_attribute_type_id, person_id)
person_attribute: INDEX(person_attribute_type_id, value, person_id)
person_attribute_type: INDEX(person_attribute_type_id, name)
If value
is of type TEXT
, then that cannot be used in an index.
Assuming that person
has PRIMARY KEY(person_id)
and patient
-- patient_id
, I have no extra recommendations for them.
The Entity-Attribute-Value schema pattern, which this seems to be, is hard to optimize when there are a large number of rows. Sorry.
The CROSS JOIN
seems to be just an INNER JOIN
, but with the condition in the WHERE
instead of in ON
, where it belongs.
person0_1_.patient_id
can be NULL
because of the LEFT JOIN
, but I don't see how person0_.person_id
can be NULL
. Please check your logic.