Home > Software engineering >  Does PostgreSQL's Statistics Collector track *all* usage of indexes?
Does PostgreSQL's Statistics Collector track *all* usage of indexes?

Time:11-18

After taking over the DBA duties on a fairly complex database, I wanted to eliminate any indexes that are consuming substantial disk space, but not being used. I ran the following, to identify unused indexes, sorting to prioritize those that consume the most space on disk:

SELECT 
    schemaname,
    pg_stat_all_indexes.relname AS table,
    pg_class.relname AS index,
    pg_total_relation_size(oid) AS size,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM  pg_class
JOIN pg_stat_all_indexes ON pg_stat_all_indexes.indexrelname = pg_class.relname
WHERE  
    relkind =('i')
ORDER BY size DESC

I was a little surprised at just how many large indexes appear not to be used, at all -- as evidenced by a 0 for the idx_scan column. Some of these apparently-unused indexes include a function call that does something pretty specific (as in the contrived example below), and appear to have been set up to assist with API functionality.

--not real index
CREATE INDEX foo_transform_foo_name_idx
    ON foo USING btree
    (foo_transform_name(foo_name));

My question, here, is whether the Statistics Collector captures all uses of a particular index, even if those indexes were scanned from a SQL-language function, or in some other way?

CodePudding user response:

These indexes have never been scanned. However, there are some other uses for indexes:

  • they enforce uniqueness and other constraints

  • they make ANALYZE gather statistics on indexed expressions

Use this query from my blog to find the indexes that you can drop without any negative consequences:

SELECT s.schemaname,
       s.relname AS tablename,
       s.indexrelname AS indexname,
       pg_relation_size(s.indexrelid) AS index_size
FROM pg_catalog.pg_stat_user_indexes s
   JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid
WHERE s.idx_scan = 0      -- has never been scanned
  AND 0 <>ALL (i.indkey)  -- no index column is an expression
  AND NOT i.indisunique   -- is not a UNIQUE index
  AND NOT EXISTS          -- does not enforce a constraint
         (SELECT 1 FROM pg_catalog.pg_constraint c
          WHERE c.conindid = s.indexrelid)
ORDER BY pg_relation_size(s.indexrelid) DESC;
  • Related