My code needs to run on different versions of SQLite. I'm using PHP's SQLite3 extension. Various servers in my user base have different SQLite versions installed, 3.7 being the earliest so far.
My code needs to upsert (update row, insert if not present). Everything is wrapped in transactions and name
TEXT column has a unique index. I can't make name
primary key because WITHOUT ROWID only became available with version 3.8.2, and I need to support earlier versions.
I started with a prepared upsert statement:
INSERT INTO tbl (name, value) VALUES (:n, :v)
ON CONFLICT(name) DO UPDATE SET value=excluded.value;
This doesn't work prior to version 3.24. So I switched to two statements with a call to .changes() in between. Pseudocode:
UPDATE tbl SET VALUE = :v WHERE name = :n;
if (0 === conn.changes()) {
INSERT INTO tbl (name, value) VALUES (:n, :v);
}
Is the "UPSERT … ON CONFLICT …" syntax significantly faster? Is it worthwhile to write conditional code to use it when SQLite version allows? Or does the code with .changes() call between statements perform adequately even in newer versions? I care about getting good performance from whatever SQLite version my user has available.
CodePudding user response:
name
must be a primary key or have an unique index.
The UPSERT will do an index search to find if :n
is already present, and then will read a page from the table and write the same page back (either to update the value or to add a new row). If a new row is added, then the index must also be updated.
The UPDATE INSERT combo will also do an index search to find if :n
is already present. If it is, it will read a page from the table and write the same page back to update the value, as with the UPSERT.
If it is not, then the INSERT will do the same index search again (which should be fast as the needed pages would probably be in the cache) and then perform the insert and update the index as with the UPSERT.
So, the only difference is an index search (which is one of the fastest operations) on pages that should already be in the cache and only if the row is to be inserted.
This should not be a significant difference, unless your table has billions of records.