Given a database table events
with columns: event_id
, correlation_id
, username
, create_timestamp
. It contains more than 1M records.
There is a problem I am trying to solve: for each event of a particular user display its latest sibling event. Sibling events are the events which has same correlation_id
. The query I use for that is the following:
SELECT
"events"."event_id" AS "event_id",
"latest"."event_id" AS "latest_event_id"
FROM
events "events"
JOIN (
SELECT
"latest"."correlation_id" AS "correlation_id",
"latest"."event_id" AS "event_id",
ROW_NUMBER () OVER (
PARTITION BY "latest"."correlation_id"
ORDER BY
"latest"."create_timestamp" ASC
) AS "rn"
FROM
events "latest"
) "latest" ON (
"latest"."correlation_id" = "events"."correlation_id"
AND "latest"."rn" = 1
)
WHERE
"events"."username" = 'user1'
It gets correct list of results but causes performance problems which must be fixed. Here is an execution plan of the query:
Hash Right Join (cost=13538.03..15522.72 rows=1612 width=64)
Hash Cond: (("latest".correlation_id)::text = ("events".correlation_id)::text)
-> Subquery Scan on "latest" (cost=12031.35..13981.87 rows=300 width=70)
Filter: ("latest".rn = 1)
-> WindowAgg (cost=12031.35..13231.67 rows=60016 width=86)
-> Sort (cost=12031.35..12181.39 rows=60016 width=78)
Sort Key: "latest_1".correlation_id, "latest_1".create_timestamp
-> Seq Scan on events "latest_1" (cost=0.00..7268.16 rows=60016 width=78)
-> Hash (cost=1486.53..1486.53 rows=1612 width=70)
-> Index Scan using events_username on events "events" (cost=0.41..1486.53 rows=1612 width=70)
Index Cond: ((username)::text = 'user1'::text)
From the plan, I can conclude that the performance problem mainly caused by calculation of latest events for ALL events in the table which takes ~80% of cost. Also it performs the calculations even if there are no events for a user at all. Ideally, I would like the query to do these steps which seem more efficient for me:
- find all events by a user
- for each event from Step 1, find all siblings, sort them and get the 1st
To simplify the discussion, let's consider all required indexes as already created for needed columns. It doesn't seem to me that the problem can be solved purely by index creation.
Any ideas what can be done to improve the performance? Probably, there are options to rewrite the query or adjust a configuration of the table.
Note that this question is significantly obfuscated in terms of business meaning to clearly demonstrate the technical problem I face.
CodePudding user response:
The window function has to scan the whole table. It has no idea that really you are only interested in the first value. A lateral join could perform better and is more readable anyway:
ELECT
e.event_id,
latest.latest_event_id
FROM
events AS e
CROSS JOIN LATERAL
(SELECT
l.event_id AS latest_event_id
FROM
events AS l
WHERE
l.correlation_id = e.correlation_id
ORDER BY l.create_timestamp
FETCH FIRST 1 ROWS ONLY
) AS latest;
The perfect index to support that would be
CREATE INDEX ON event (correlation_id, create_timestamp);
CodePudding user response:
One option that may improve efficiency is to rewrite the query filtering "rn" = 1 beforehand to reduce resulting rows when joining tables:
WITH "latestCte"("correlation_id", "event_id") as (SELECT
"correlation_id",
"event_id",
ROW_NUMBER () OVER (
PARTITION BY "correlation_id"
ORDER BY
"create_timestamp" ASC
) AS "rn"
FROM
events)
SELECT
"events"."event_id" AS "event_id",
"latest"."event_id" AS "latest_event_id"
FROM
events "events"
JOIN (
SELECT "correlation_id", "event_id" FROM "latestCte" WHERE "rn" = 1
) "latest" ON (
"latest"."correlation_id" = "events"."correlation_id"
)
WHERE
"events"."username" = 'user1'
Hope it helps, also I am curious to see the resulting execution plan of this query. Best regards.
CodePudding user response:
Without having access to the data, I'm really just throwing out ideas...
Instead of a subquery, it's worth trying a materialized CTE
Rather than the row_number analytic, you can try a
distinct on
. Honestly, I don't predict any gains. It's basically the same thing at the database level
Sample of both:
with latest as materialized (
SELECT distinct on ("correlation_id")
"correlation_id", "event_id"
FROM events
order by
"correlation_id", "create_timestamp" desc
)
SELECT
e."event_id",
l."event_id" AS "latest_event_id"
FROM
events "events" e
join latest l ON
l."correlation_id" = e."correlation_id"
WHERE
e."username" = 'user1'
Additional suggestion -- if you are doing this over and over, I'd consider creating a temp table or materialized view for "latest," including an index by coorelation_id rather than re-running the subquery (or CTE) every single time. This will be a one-time pain followed my repeated gain.
Yet one more suggestion -- if at all possible, drop the double quotes from your object names. Maybe it's just me, but I find them brutal. Unless you have spaces, reserved words or mandatory uppercase in your field names (please don't do that), then these create more problems than they solve. I kept them in the query I listed above, but it pained me.
And this last comment goes back to knowing your data... since row_number and distinct on are relatively expensive operations, it may make sense to make your subquery/cte more selective by introducing the "user1" constraint. This is completely untested, but something like this:
SELECT distinct on (e1.correlation_id)
e1.correlation_id, e1.event_id
FROM events e1
join events e2 on
e1.correlation_id = e2.correlation_id and
e2.username = 'user1'
order by
e1.correlation_id, e1.create_timestamp desc
CodePudding user response:
All those needless double quotes are making my eyes bleed.
This should very fast with a lateral join, provided the number of returned rows is rather low, i.e. 'user1' is rather specific.
explain analyze SELECT
events.event_id AS event_id,
latest.event_id AS latest_event_id
FROM
events "events"
cross JOIN lateral (
SELECT
latest.event_id AS event_id
FROM events latest
WHERE latest.correlation_id=events.correlation_id
ORDER by create_timestamp ASC limit 1
) latest
WHERE
events.username = 'user1';
You will want an index on username
, and one on (correlation_id, create_timestamp)
If the number of rows returned is large, then your current query, which precomputes in bulk, is probably better. But would be faster if you used DISTINCT ON rather than the window function to pull out just the latest for each correlation_id. Unfortunately the planner does not understand these queries to be equivalent, and so will not interconvert between them based on what it thinks will be faster.