Home > Enterprise >  Where is the explanation of pageinspect b-tree functions results from postgresql?
Where is the explanation of pageinspect b-tree functions results from postgresql?

Time:10-25

I am using postgresql page inspector and I would like to know, where can I find more detailed documentation for the outputs of its functions (inputs would be nice, too). I am only interested in b-tree functions.

The docs say for example:

bt_page_stats returns summary information about single pages of B-tree indexes. For example:

 test=# SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
-[ RECORD 1 ]- -----
blkno         | 1
type          | l
live_items    | 224
dead_items    | 0
avg_item_size | 16
page_size     | 8192
free_size     | 3668
btpo_prev     | 0
btpo_next     | 0
btpo_level    | 0
btpo_flags    | 3

Here I would like to know what those results actually mean - what is type, dead_items and so on. Where can I find it ?

CodePudding user response:

To understand these values, you have to understand the tree-structure of a B-tree index.

From a certain level of detail on, the ocumentation is in the source, in this case in src/include/access/nbtree.h:

/*
 *  BTPageOpaqueData -- At the end of every page, we store a pointer
 *  to both siblings in the tree.  This is used to do forward/backward
 *  index scans.  The next-page link is also critical for recovery when
 *  a search has navigated to the wrong page due to concurrent page splits
 *  or deletions; see src/backend/access/nbtree/README for more info.
 *
 *  In addition, we store the page's btree level (counting upwards from
 *  zero at a leaf page) as well as some flag bits indicating the page type
 *  and status.  If the page is deleted, a BTDeletedPageData struct is stored
 *  in the page's tuple area, while a standard BTPageOpaqueData struct is
 *  stored in the page special area.
 *
 *  We also store a "vacuum cycle ID".  When a page is split while VACUUM is
 *  processing the index, a nonzero value associated with the VACUUM run is
 *  stored into both halves of the split page.  (If VACUUM is not running,
 *  both pages receive zero cycleids.)  This allows VACUUM to detect whether
 *  a page was split since it started, with a small probability of false match
 *  if the page was last split some exact multiple of MAX_BT_CYCLE_ID VACUUMs
 *  ago.  Also, during a split, the BTP_SPLIT_END flag is cleared in the left
 *  (original) page, and set in the right page, but only if the next page
 *  to its right has a different cycleid.
 *
 *  NOTE: the BTP_LEAF flag bit is redundant since level==0 could be tested
 *  instead.
 *
 *  NOTE: the btpo_level field used to be a union type in order to allow
 *  deleted pages to store a 32-bit safexid in the same field.  We now store
 *  64-bit/full safexid values using BTDeletedPageData instead.
 */

typedef struct BTPageOpaqueData
{
    BlockNumber btpo_prev;      /* left sibling, or P_NONE if leftmost */
    BlockNumber btpo_next;      /* right sibling, or P_NONE if rightmost */
    uint32      btpo_level;     /* tree level --- zero for leaf pages */
    uint16      btpo_flags;     /* flag bits, see below */
    BTCycleId   btpo_cycleid;   /* vacuum cycle ID of latest split */
} BTPageOpaqueData;

The other values are:

  • blkno: the 8kB-block number
  • type: the type of the page (l for “leaf”, i for “internal”, r for “root”, e for “ignored”, d for “deleted leaf”, D for “deleted internal”)
  • live_items: number of live index entries
  • dead_items: number of killed index entries
  • avg_item_size: average size of an index tuple
  • page_size: page size (always 8kB)
  • free_size: free space in the page
  • Related