I have the following table
CREATE TABLE "shots" (
"player" INTEGER,
"tournament" TEXT,
"year" INTEGER,
"course" INTEGER,
"round" INTEGER,
"hole" INTEGER,
"shot" INTEGER,
"text" TEXT,
"distance" REAL,
"x" TEXT,
"y" TEXT,
"z" TEXT
);
With a sample of the data:
28237 470 2015 717 1 1 1 Shot 1 302 yds to left fairway, 257 yds to hole 10874 11451.596 10623.774 78.251
28237 470 2015 717 1 1 2 Shot 2 234 yds to right fairway, 71 ft to hole 8437 12150.454 10700.381 86.035
28237 470 2015 717 1 1 3 Shot 3 70 ft to green, 4 ft to hole 838 12215.728 10725.134 88.408
28237 470 2015 717 1 1 4 Shot 4 in the hole 46 12215.1 10729.1 88.371
28237 470 2015 717 1 2 1 Shot 1 199 yds to green, 29 ft to hole 7162 12776.03 10398.086 91.017
28237 470 2015 717 1 2 2 Shot 2 putt 26 ft 7 in., 2 ft 4 in. to hole 319 12749.444 10398.854 90.998
28237 470 2015 717 1 2 3 Shot 3 in the hole 28 12747.3 10397.6 91.027
28237 470 2015 717 1 3 1 Shot 1 296 yds to left intermediate, 204 yds to hole 10651 12596.857 9448.27 94.296
28237 470 2015 717 1 3 2 Shot 2 208 yds to green, 15 ft to hole 7478 12571.0 8825.648 94.673
28237 470 2015 717 1 3 3 Shot 3 putt 17 ft 6 in., 2 ft 5 in. to hole 210 12561.831 8840.539 94.362
I want to get for each shot
the previous location (x
, y
, z
). I wrote the below query.
SELECT cur.player, cur.tournament, cur.year, cur.course, cur.round, cur.hole, cur.shot, cur.x, cur.y, cur.z, prev.x, prev.y, prev.z
FROM shots cur
INNER JOIN shots prev
ON (cur.player, cur.tournament, cur.year, cur.course, cur.round, cur.hole, cur.shot) =
(prev.player, prev.tournament, prev.year, prev.course, prev.round, prev.hole, prev.shot - 1)
This query takes forever basically. How can I rewrite it to make it faster?
In addition, I need to make an adjustment for the first shot on a hole (shot
= 1). This shot is made from tee_x
, tee_y
and tee_z
. These values are available in table holes
CREATE TABLE "holes" (
"tournament" TEXT,
"year" INTEGER,
"course" INTEGER,
"round" INTEGER,
"hole" INTEGER,
"tee_x" TEXT,
"tee_y" TEXT,
"tee_z" TEXT
);
With data:
470 2015 717 1 1 11450 10625 78.25
470 2015 717 1 2 12750 10400 91
470 2015 717 1 3 2565 8840.5 95
Thanks
CodePudding user response:
First, you need a composite index to speed up the operation:
CREATE INDEX idx_shots ON shots (player, tournament, year, course, round, hole, shot);
With that index, your query should run faster:
SELECT cur.player, cur.tournament, cur.year, cur.course, cur.round, cur.hole, cur.shot, cur.x, cur.y, cur.z,
prev.x AS prev_x, prev.y AS prev_y, prev.z AS prev_z
FROM shots cur LEFT JOIN shots prev
ON (cur.player, cur.tournament, cur.year, cur.course, cur.round, cur.hole, cur.shot) =
(prev.player, prev.tournament, prev.year, prev.course, prev.round, prev.hole, prev.shot 1);
The changes I made:
- the join should be a
LEFT
join so that all rows are included and not only the ones that have a previous row -1
should be1
because the previous row's shot is 1 less than the current row's shot- added aliases for the previous row's
x
,y
andz
But, if your version of SQLite is 3.25.0 it would be better to use window function LAG()
instead of a self join:
SELECT *,
LAG(x) OVER w AS prev_x,
LAG(y) OVER w AS prev_y,
LAG(z) OVER w AS prev_z
FROM shots
WINDOW w AS (PARTITION BY player, tournament, year, course, round, hole ORDER BY shot);
See the demo (I include the query plan for both queries where you can see the use of the composite index).