Home > Software design >  How posgreSQL marks frozen rows?
How posgreSQL marks frozen rows?

Time:09-25

I'm studying the VACUUM command. In particular how it solves the wrap around problem.

Basically every transaction has associated a transaction ID. Every row has associated the transaction ID of the last transaction that has updated/inserted it. The transaction ID associated to a specific row can be seen with the system column xmin. Rows updated/inserted with the same transaction will have the same transaction ID (XID).

These IDs control the visibility of that row to other concurrent transactions. A transaction with ID 50, can see all the row with transactionID < 50.

The transaction ID can have more the 4 billion of values (32 bit). The problem is that when we reach this value the transaction ID need to restart and that means that suddenly for this transaction all the rows are invisible (it has the lowest transaction ID) even if this rows were added in the past. This is called wrap around problem.

I know that vacuum solve this problem marking old rows as frozen. How can I see rows that have been frozen? xmin will have a specific value?

CodePudding user response:

This information is stored in the t_infomask of the tuple header (HeapTupleHeaderData). See the following definitions from src/include/access/htup_details.h:

/*
 * information stored in t_infomask:
 */
[...]
#define HEAP_XMIN_COMMITTED     0x0100  /* t_xmin committed */
#define HEAP_XMIN_INVALID       0x0200  /* t_xmin invalid/aborted */
#define HEAP_XMIN_FROZEN        (HEAP_XMIN_COMMITTED|HEAP_XMIN_INVALID)

You cannot access that information directly via SQL, but if you use the pageinspect extension, you can use the heap_page_items and heap_tuple_infomask_flags functions to read the information from a raw 8kB page.

  • Related