I have a table that looks like this:
CREATE TABLE UTable (
m_id TEXT PRIMARY KEY,
u1 TEXT,
u2 TEXT,
u3 TEXT,
-- other stuff, as well as
gid INTEGER,
gt TEXT,
d TEXT,
timestamp TIMESTAMP
);
CREATE TABLE OTable (
gid INTEGER,
gt TEXT,
d TEXT,
-- other stuff, such as
n INTEGER
);
CREATE UNIQUE INDEX OTable_idx ON OTable (gid, gt, d);
For each record in OTable
that matches a condition (fixed values of gid, gt
), I want to join the corresponding record in UTable
with the minimum timestamp.
What's catching me is that in my final result I don't care about the timestamp, I clearly need to group on d
(since gid
and gt
are fixed), and yet I do need to extract u1, u2, u3
from the selected record.
SELECT o.d, u.u1, u.u2, u.u3, o.n
FROM UTable u
INNER JOIN OTable o
ON u.gid = o.gid AND u.gt = o.gt AND u.d = o.d
WHERE u.gid = 3 AND u.gt = 'dog night'
GROUP BY u.d
-- and u.timestamp is the minimum for each group
;
I think my first step should be just to do the select on UTable
and then I can join against that. But even there I'm a bit confused.
SELECT u.d, u.u1, u.u2, u.u3
FROM UTable u
WHERE u.gid = 3 AND u.gt = 'dog night';
I want to add HAVING MIN(u.timestamp)
, but that's not valid.
Any pointers as to what I need to do?
I did see this question, but it isn't quite what I need, since I can't group on all the UTable
values lest I select too many things.
CodePudding user response:
GROUP BY u.d
(without also listing u1
, u2
, u3
) would only work if u.d
was the PRIMARY KEY
(which it is not, and also wouldn't make sense in your scenario). See:
I suggest DISTINCT ON
in a subquery on UTable
instead:
SELECT o.d, u.u1, u.u2, u.u3, o.n
FROM (
SELECT DISTINCT ON (u.d)
u.d, u.u1, u.u2, u.u3
FROM UTable u
WHERE u.gid = 3
AND u.gt = 'dog night'
ORDER BY u.d, u.timestamp
) u
JOIN OTable o USING (gid, gt, d);
See:
If UTable
is big, at least a multicolumn index on (gid, gt)
is advisable. Same for OTable
.
Maybe even on (gid, gt, d)
. Depends on data types, cardinalities, ...