My understanding: a read replica database exists to allow read volumes to scale.
So far, so good: with lots of copies to read from, the volume of reads can be shared between them. That makes sense.
However, the things I'm reading seem to imply "tada! magic fast copies!". How are the copies faster, as surely they must also be burdened by the same amount of writing as the main db in order that they remain in sync?
CodePudding user response:
How are the copies faster, as surely they must also be burdened by the same amount of writing as the main db in order that they remain in sync?
Good question.
First, writes to the replicas can be cheaper than writes to the primary if the replicas are kept in sync by replaying the write-ahead log (sometimes called a "physical replica") rather than by replaying the queries (sometimes called a "logical replica"). A physical replica doesn't need to do any query processing to stay in sync, and may not need to read the target database blocks/pages into memory in order to apply the changes, which leaves more memory and CPU free to process read requests.
Even a logical replica might be able to apply changes more cheaply than the primary, because a query on the primary of the form
update t set status = 'a' where status = 'b'
might get replicated as a series of
update t set status = 'a' where id = ?
saving the replica from having to identify which rows to update.
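To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module, with two in-memory databases standing in for a primary and a replica. The table contents, the `change_stream` list, and the helper names are assumptions made up for illustration; real replication protocols are far more involved, but the division of labour is the same: the primary evaluates the predicate, the replica only applies keyed changes.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a small table t (hypothetical data)."""
    db = sqlite3.connect(":memory:")
    db.execute("create table t (id integer primary key, status text)")
    db.executemany(
        "insert into t (id, status) values (?, ?)",
        [(1, "b"), (2, "c"), (3, "b"), (4, "d")],
    )
    db.commit()
    return db

def snapshot(db):
    """Return the full contents of t, ordered by id."""
    return db.execute("select id, status from t order by id").fetchall()

primary = make_db()
replica = make_db()

# On the primary: the predicate has to be evaluated to find the affected rows.
affected_ids = [
    row[0]
    for row in primary.execute("select id from t where status = 'b' order by id")
]
primary.execute("update t set status = 'a' where status = 'b'")
primary.commit()

# What gets shipped to the replica in this sketch: one keyed change per row.
change_stream = [("a", row_id) for row_id in affected_ids]

# On the replica: each change is a primary-key lookup; no predicate search.
replica.executemany("update t set status = ? where id = ?", change_stream)
replica.commit()

# Both databases end up in the same state.
assert snapshot(primary) == snapshot(replica)
```

The replica never has to work out which rows had status 'b'; it just applies one update per primary key.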
Second, the secondaries let the read workload scale across more physical resources: each replica brings its own CPU, memory, and I/O, so the total read workload is spread across more servers.
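As a rough illustration of spreading reads, here is a sketch of a router that sends writes to the primary and rotates reads across the replicas. The class name `ReadWriteRouter`, the server names, and the "starts with select" test are all assumptions for the example; a real setup would normally rely on a driver, proxy, or load balancer, and would have to account for replication lag.

```python
from itertools import cycle

class ReadWriteRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = cycle(replicas)  # simple round-robin over the replicas

    def target_for(self, sql):
        # Naive classification: anything starting with "select" is a read.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._replicas) if is_read else self.primary

router = ReadWriteRouter("primary", ["replica-1", "replica-2", "replica-3"])

read_targets = [router.target_for("select * from t where id = 1") for _ in range(6)]
write_target = router.target_for("update t set status = 'a' where id = 1")

print(read_targets)
print(write_target)
```

With three replicas, each one serves a third of the reads in this sketch, while every write still goes to the single primary.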