Basically I just want a simple way of finding the most recent date in a table, saving it as a variable, and reusing that variable in the same query.
Right now this is how I'm doing it:
with recent_date as (
select max(date)
from mytable
)
select *
from mytable
where date = (select * from recent_date)
(For this simple example, a variable is overkill, but in my real-world use-case I reuse the recent date multiple times in the same query.)
But that feels cumbersome. It would be a lot cleaner to save the recent date to a variable rather than a table and having to select from it.
In pseudo-code, something like this would be nice:
$recent_date = (select max(date) from mytable)
select *
from mytable
where date = $recent_date
Is there something like that in Postgres?
CodePudding user response:
Keep in mind that in SQL you cannot directly declare a variable. Basically a CTE is creating variable (or a set of) and in SQL to use a variable you select it. However, if you want to avoid that structure you can just get the variable directl from a subset directly.
select *
from mytable
where date = (select max(date) from mytable);
CodePudding user response:
Better for the simple case
For the scope of a single query, CTEs are a good tool. In my hands the query would look like this:
WITH recent(date) AS (SELECT max(date) FROM mytable)
SELECT m.*
FROM recent r
JOIN mytable m USING (date)
Except that the actual example query would burn down to this in my hands:
SELECT *
FROM mytable
ORDER BY date DESC NULLS LAST
FETCH FIRST 1 ROWS WITH TIES;
NULLS LAST
only if there can be NULL values. See:
WITH TIES
only if date isn't UNIQUE NOT NULL
. See:
In combination with an index on mytable (date)
(or more specific), this produces the best possible query plan. Look no further.
No, I need variables!
If you positively need variables scoped for the same command, transaction, session or more, there are various options.
The closest thing to "variables" in SQL in Postgres are "customized options". See:
You can only store text
, any other type has to be cast (and cast back on retrieval).
To set and retrieve a value from within a query, use the Configuration Settings Functions set_config()
and current_setting()
:
SELECT set_config('foo.recent', max(date)::text, false) FROM mytable;
SELECT *
FROM mytable
WHERE date = current_setting('foo.recent')::date;
Typically, there are more efficient ways.
If you need that "recent date" a lot, consider a simple function as "global variable", usable by all transactions in all sessions (but each new command sees its own current state):
CREATE FUNCTION f_recent_date()
RETURNS date
LANGUAGE sql STABLE PARALLEL SAFE AS
'SELECT max(date) FROM mytable';
STABLE
is a valid volatility setting as the function returns the same result within the same query. Be sure to actually make it STABLE
, so Postgres does not evaluate repeatedly. In Postgres 9.6 or later, also make it PARALLEL SAFE
. Then your query becomes:
SELECT * FROM mytable WHERE date = f_recent_date();
More options:
- Is there a way to define a named constant in a PostgreSQL query?
- Passing user id to PostgreSQL triggers
Typically, if I need variables in Postgres, I use a PL/pgSQL code block in a function, a procedure, or a DO
statement for ad-hoc use without the need to return rows:
DO
$do$
DECLARE
_recent_date date := (SELECT max(date) FROM mytable);
BEGIN
PERFORM * FROM mytable WHERE date = _recent_date;
-- more queries using _recent_date ...
END
$do$;
PL/pgSQL may be what you should be using to begin with. Further reading: