Why does the way Postgres handles XID assignments determine the overhead of a READ/WRITE transaction-CodePudding

I don't understand this comment:

My understanding is that read-write transactions carry some overhead, but that you don't incur this overhead until you actually write something. In other words, in terms of performance, a READ ONLY transaction should be the same as a READ WRITE transaction which only contains reads. This stems from the way Postgres handles XID assignment (some info on this here).

The link in the comment states:

"Transactions and subtransactions are assigned permanent XIDs only when/if they first do something that requires one --- typically, insert/update/delete a tuple, though there are a few other places that need an XID assigned."

Is this the key point? That is, if a READ/WRITE transaction only has reads, then an XID isn't assigned, and assigning an XID would otherwise account for the overhead difference between a READ/WRITE transaction with no writes and a READ ONLY transaction.

Does this mean that other databases assign an XID even if no rows are changed, removed, or added?

CodePudding user response：

The overhead difference is related to how read-only and read-write transactions are defined, and how permanent XIDs are assigned in PostgreSQL.

A transaction is defined as virtual transaction (aka read-only) and does not get assigned a true XID until it does a data modification operation on the database. Virtual transactions do not affect the visibility of tuples (nevertheless, they do trigger the pruning of dead tuples in the page depending on the free page-size left, different topic). No impact on visibility means no impact on snapshot isolation. And no need for the assignment of a true XID -- which normally requires synchronization of internal processes, page writes (xmin, xmax, and hint-bits related to those), additional IO, etc. This is the extra overhead. You can self-experiment with this by starting a transaction block and observing no-permanent_XID assignment until a DML statement is executed by using a built-in function (for details, https://pgpedia.info/p/pg_current_xact_id_if_assigned.html):

postgres=# begin;
BEGIN
postgres=*# select pg_current_xact_id_if_assigned();
 pg_current_xact_id_if_assigned 
--------------------------------
                               
(1 row)

A virtual_XID assignment is still done for read-only transactions as well. But assigned IDs are memory-only, local to the process, and temporary; which makes them much less expensive.

When it comes to other DBMSs; MS SQL also differentiates between different types of transactions and if I am not mistaken they are all common in how they are assigned: https://learn.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-tran-active-transactions-transact-sql?view=azuresqldb-current

CodePudding user response：

Yes, that is the case: if you commit a transaction that did not request a new transaction ID, hardly anything happens (unless you created a WITH HOLD cursor). If the transaction got a transaction ID, a COMMIT record is written to WAL, and WAL is flushed to disk.