Efficient reverse prefix search in Postgres-CodePudding

Assume a table of 100K Canadian postal code prefixes with population sizes:

Region code	Population
H2	10,000,000
H2Z	100,000
H2K	50,000
H2Z 1G9	500

Given a full length postal code, e.g. "H2Z 1G9" I need to return every row whose region code is a prefix of the input. For some reason I need to do this a lot, so I need effective indexing.

How should I go about this?

CodePudding user response：

Would something like this work efficiently? You can check explain analyze to ensure it uses the PK on your table:

with in_postal_code as (
  select distinct rtrim((left('H2Z 1G9', gs.n))) as postal_prefix
    from generate_series(1, 7) as gs(n)
)
select cp.*
  from in_postal_code i
       join postal_code_population cp
         on cp."Region Code" = i.postal_prefix;

CodePudding user response：

You could use a function like this:

CREATE TABLE tab (
   region text PRIMARY KEY,
   population bigint NOT NULL
);

CREATE FUNCTION find_match (t text) RETURNS SETOF tab
   LANGUAGE plpgsql AS
$$DECLARE
   s text;
BEGIN
   FOR s IN SELECT substr(t, 1, n)
            FROM generate_series(1, length(t)) AS s(n)
   LOOP
      RETURN QUERY SELECT * FROM tab
                   WHERE region = s;
   END LOOP;
END;$$;