CREATE TABLE table (
id1 uuid,
bucket timestamp,
createdat timestamp,
id2 uuid,
data blob,
primary key((id1, bucket), createdat, id2)
)
with
clustering ORDER BY (createdat ASC, id2 ASC) AND ...
I have this Cassandra table. In a multithreaded environment two threads can insert rows to this table on the same time.
I have a requirement that the table should not have two rows which has same (id1
, bucket
) and data
field having value null
.
It means that only last row of the table (for each id1
and bucket
) could have data
as null
.
In order to achieve that before each insert I query last row of the table and insert new one only if data
field is not null.
This does not work because of race condition. Two threads both check, get last row which does not have data
as null and insert rows having null
so violating the requirement.
I think it's possible to fix this on database level. I need some transaction that first checks the last row and only then insert new one. And nothing should be inserted between these actions.
Can you hint how this can be achieved on Cassandra?
CodePudding user response:
You may be able to get this behavior to work with a Lightweight Transaction.
In your case, it may look something like:
INSERT INTO table (id1,bucket,createdat,id2,data)
VALUES (uuid(),'2021-09-27 13:20',totimestamp(now()),uuid(),'some data')
IF NOT EXISTS;
Basically, this will only perform the write if the key combination does not already exist. This exact solution might not work, but it should point you in the right direction.
CodePudding user response:
There isn't a way to achieve this in Cassandra. As you already stated, you will always run into race conditions.
Apart from lightweight transactions, Cassandra doesn't have locking mechanism because it goes against the goal of high velocity CRUD operations. Cheers!