MySQL performance issue for concurrent users-CodePudding

When I am running a query on MySQL database, it is taking around 3 sec. When we execute the performance testing for 50 concurrent users, then the same query is taking 120 sec.

The query joins multiple tables with an order by clause and a limit condition.

We are using RDS instance (16 GB memory, 4 vCPU).

Can any one suggest how to improve the performance in this case?

Query:

SELECT 
    person0_.person_id AS person_i1_131_,
    person0_.uuid AS uuid2_131_,
    person0_.gender AS gender3_131_

    CASE
        WHEN
            EXISTS( SELECT * FROM patient p WHERE p.patient_id = person0_.person_id)
        THEN 1
        ELSE 0
    END AS formula1_,
    CASE
        WHEN person0_1_.patient_id IS NOT NULL THEN 1
        WHEN person0_.person_id IS NOT NULL THEN 0
    END AS clazz_
FROM
    person person0_
        LEFT OUTER JOIN
    patient person0_1_ ON person0_.person_id = person0_1_.patient_id
        INNER JOIN
    person_attribute attributes1_ ON person0_.person_id = attributes1_.person_id
        CROSS JOIN
    person_attribute_type personattr2_
WHERE
    attributes1_.person_attribute_type_id = personattr2_.person_attribute_type_id
        AND personattr2_.name = 'PersonImageAttribute'
        AND (person0_.person_id IN (SELECT 
            person3_.person_id
        FROM
            person person3_
                INNER JOIN
            person_attribute attributes4_ ON person3_.person_id = attributes4_.person_id
                CROSS JOIN
            person_attribute_type personattr5_
        WHERE
            attributes4_.person_attribute_type_id = personattr5_.person_attribute_type_id
                AND personattr5_.name = 'LocationAttribute'
                AND (attributes4_.value IN ('d31fe20e-6736-42ff-a3ed-b3e622e80842'))))
ORDER BY person0_1_.date_changed , person0_1_.patient_id
LIMIT 25

Plan

CodePudding user response：

There appears to be some redundant query components, and what does not appear to be a proper context of CROSSS-JOIN when you have relation on specific patient and/or attribute info.

Your query getting the "clazz_" is based on a patient_id NOT NULL, but then again a person_id not null. Under what condition, would the person_id coming from the person table EVER be null. That sounds like a KEY ID and would NEVER be null, so why test for that. It seems like that is a duplicate field and in-essence is just the condition of a person actually being a patient vs not.

This query SHOULD get the same results otherwise and suggest the following specific indexes are available including

table                  index
person                 ( person_id )
person_attribute       ( person_id, person_attribute_type_id )
person_attribute_type  ( person_attribute_type_id, name )
patient                ( patient_id )


select
        p1.person_id AS person_i1_131_,
        p1.uuid AS uuid2_131_,
        p1.gender AS gender3_131_,
        CASE WHEN p2.patient_id IS NULL
            then 0 else 1 end formula1_,
        -- appears to be a redunant result, just trying to qualify
        -- some specific column value for later calculations.
        CASE WHEN p2.patient_id IS NULL 
            THEN 0 else 1 end clazz_
    from
        -- pre-get only those people based on the P4 attribute in question
        -- and attribute type of location.  Get small list vs everything else
        ( SELECT distinct
                pa.person_id
            FROM
                person_attribute pa
                    JOIN person_attribute_type pat
                        on pa.person_attribute_type_id = pat.person_attribute_type_id
                        AND pat.name = 'LocationAttribute'
            WHERE
                pa.value = 'd31fe20e-6736-42ff-a3ed-b3e622e80842' ) PQ

            join person p1
                on PQ.person_id = p1.person_id

                LEFT JOIN patient p2 
                    ON p1.person_id = p2.patient_id

                JOIN person_attribute pa1 
                    ON p1.person_id = pa1.person_id
                    JOIN person_attribute_type pat1
                        on pa1.person_attribute_type_id = pat1.person_attribute_type_id
                        AND pat1.name = 'PersonImageAttribute'
    order by
        p2.date_changed, 
        p2.patient_id
    LIMIT 
        25

Finally, your query does an order by the date_changed and patient id which is based on the PATIENT table data having been changed. If that table is a left-join, you may have a bunch of PERSON records that are not patients and thus may not get the expected records you really intent. So, just some personal review of what is presented in the question.

CodePudding user response：

Speeding up the query is the best hope for handling more connections.

A simplification (but no speed difference), since TRUE=1 and FALSE=0:

CASE WHERE (boolean_expression) THEN 1 ELSE 0 END

-->

(boolean_expression)

Index suggestions:

person:  INDEX(patient_id, date_changed)
person_attribute:  INDEX(person_attribute_type_id, person_id)
person_attribute:  INDEX(person_attribute_type_id, value, person_id)
person_attribute_type:  INDEX(person_attribute_type_id, name)

If value is of type TEXT, then that cannot be used in an index.

Assuming that person has PRIMARY KEY(person_id) and patient -- patient_id, I have no extra recommendations for them.

The Entity-Attribute-Value schema pattern, which this seems to be, is hard to optimize when there are a large number of rows. Sorry.

The CROSS JOIN seems to be just an INNER JOIN, but with the condition in the WHERE instead of in ON, where it belongs.

person0_1_.patient_id can be NULL because of the LEFT JOIN, but I don't see how person0_.person_id can be NULL. Please check your logic.