COSMOS DB Unique index constraint violation after item was removed by TTL

Time:04-07

I am creating a distributed lock library using azure-spring-boot-starter-cosmos

My library has two methods: public void lockResource(String resourceUniqueIdentifier) and
public void unlockResource(String resourceUniqueIdentifier)

The lockResource() method will receive the resourceUniqueIdentifier of the resource I want to lock, create a Lock() instance and save it in the db.

I set /appName as the partition key and /lockedResourceId as the unique key in the Azure portal, so if a lock for the resource already exists, the lockRepository.save() method throws an exception with status code 409 (conflict, since an entity with the same partition key and unique key already exists).

So, to acquire the lock, whoever acquired it previously must call unlockResource(String resourceUniqueIdentifier), OR the TTL on the lock must expire (I have also set a ttl field on the Lock DTO and enabled TTL in the Azure portal).

My logic retries the acquisition in a loop: while(!isNewLock(lock) && isNotMaximumRetries(retries, resourceUniqueIdentifier)). Full code below.
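The acquire-by-insert loop described above can be sketched without Cosmos as follows. This is only a minimal illustration, not the library's code: RetryLockSketch, its in-memory set, and the constructor parameters are hypothetical stand-ins, and Set.add returning false plays the role of the 409 conflict.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the lock container (not the Cosmos API).
class RetryLockSketch {
    // Keys currently held; add() returns false for a duplicate, like a 409 conflict.
    private final Set<String> held = ConcurrentHashMap.newKeySet();
    private final int maxRetries;
    private final long retryIntervalMillis;

    RetryLockSketch(int maxRetries, long retryIntervalMillis) {
        this.maxRetries = maxRetries;
        this.retryIntervalMillis = retryIntervalMillis;
    }

    // One acquisition attempt: true only if this call inserted the key.
    boolean tryAcquire(String resourceId) {
        return held.add(resourceId);
    }

    // Tries once, then retries up to maxRetries times, sleeping between attempts.
    // Returns false if the lock could not be acquired or the thread was interrupted.
    boolean lockResource(String resourceId) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (tryAcquire(resourceId)) {
                return true;
            }
            if (attempt < maxRetries) {
                try {
                    Thread.sleep(retryIntervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt visible to callers
                    return false;
                }
            }
        }
        return false;
    }

    // Releases the lock so a later insert for the same key can succeed.
    void unlockResource(String resourceId) {
        held.remove(resourceId);
    }
}
```

Returning a boolean instead of throwing is a choice made for the sketch; it lets the caller decide what to do when the retries are exhausted.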

Cosmos Config : Direct Mode with "session" consistency level

The problem: even after the lock for a resource has been removed from the database by TTL EXPIRATION (assuming some other service/thread acquired it earlier), trying to acquire the lock again still throws a CosmosAccessException with status code 409 ("Unique index constraint violation"). I checked and the lock is gone from the db, yet it seems that some information about that lock still remains somewhere.

LOCK DTO (I have not added the getters and setters):

@Container(containerName = "distributed-lock", timeToLive = -1, autoCreateContainer = false)

public class Lock {

@Id
@GeneratedValue
private String id;

private  String lockedResourceId;

@PartitionKey
private  String appName;

private  Integer ttl;

private long time;

public Lock(String id, String lockedResourceId, String appName, Integer ttl, long time) {
    this.id = id;
    this.lockedResourceId = lockedResourceId;
    this.appName = appName;
    this.ttl = ttl;
    this.time = time; //todo remove
}

}

LockService:

 public void lockResource(String resourceUniqueIdentifier) {
    var retries = 0;
    var lock = new Lock(UUID.randomUUID().toString(), resourceUniqueIdentifier, appName, lockProperties.getTtl(), Instant.now().toEpochMilli());

    while(!isNewLock(lock) && isNotMaximumRetries(retries, resourceUniqueIdentifier)) {
        logger.info(I9001.getMessage(), lock.getLockedResourceId(), retries);
        waitToUnlock();
        retries++;
    }
}




private void waitToUnlock() {
    try {
        Thread.sleep(lockProperties.getRetryInterval());
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the interrupt flag before propagating
        throw new RuntimeException("Interrupted exception while waiting to retry lock", e);
    }
}



private boolean isNewLock(Lock lockResource) {
    try {
        lockResource.setId(UUID.randomUUID().toString());
        var lock = lockRepository.save(lockResource);
        logger.info(LoggingUtil.X900.getMessage(), lock.getId());
        logger.info("SAVED LOCK WITH ID: {}", lock.getId());
        var sameLock = lockRepository.getByLockedResourceIdAndAppName(lockResource.getLockedResourceId(), lockResource.getAppName());
        logger.info("TESTED SAVED LOCK WITH ID: {}, UniqueId: {}", sameLock.getId(), sameLock.getLockedResourceId());

        return true;
    } catch (CosmosAccessException cosmosAccessException) {
        if (cosmosAccessException.getCosmosException().getStatusCode() == CONFLICT_STATUS_CODE) {
            lockResource.setTime(Instant.now().toEpochMilli());
            logger.info(LoggingUtil.X900.getMessage(), lockResource.getLockedResourceId(), lockResource.getId());
            var alreadyExisting = lockRepository.getByLockedResourceIdAndAppName(lockResource.getLockedResourceId(), lockResource.getAppName());
            logger.info("Retrieved duplicate with resId {} and id {}", alreadyExisting, alreadyExisting);

            return false;
        } else {
            throw cosmosAccessException;
        }
    }
}

CodePudding user response:

The problem is that when an item's time to live expires, Cosmos DB only does a partial delete at first. Reads and queries no longer return the item once its TTL has expired, but the item is only completely deleted later by a background task, when enough RUs (request units) are available. So if you try to save the same resource shortly after it expired, and the container has constraints such as a unique key within the partition, you might still receive a 409 status code because the expired item has not been fully removed yet.

https://docs.microsoft.com/bs-latn-ba/azure/cosmos-db/sql/time-to-live?view=sql-server-ver15
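The behaviour described above can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not how Cosmos DB is implemented: LazyTtlStoreSketch and its methods are hypothetical names, and the clock is passed in explicitly so the example is deterministic. It shows why a read can report the lock as gone while an insert with the same unique key is still rejected.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical, not the Cosmos API): expiry is checked on read,
// but entries are only physically removed by a separate purge step.
class LazyTtlStoreSketch {
    // Unique key -> absolute expiry time in milliseconds.
    private final Map<String, Long> expiryByKey = new HashMap<>();

    // Insert is rejected (like a 409) while the key is physically present,
    // even if the entry has already expired.
    boolean insert(String uniqueKey, long nowMillis, long ttlMillis) {
        if (expiryByKey.containsKey(uniqueKey)) {
            return false;
        }
        expiryByKey.put(uniqueKey, nowMillis + ttlMillis);
        return true;
    }

    // Reads hide entries whose TTL has elapsed, purged or not.
    boolean isVisible(String uniqueKey, long nowMillis) {
        Long expiresAt = expiryByKey.get(uniqueKey);
        return expiresAt != null && nowMillis < expiresAt;
    }

    // Stand-in for the background cleanup that runs when capacity allows.
    void purgeExpired(long nowMillis) {
        expiryByKey.values().removeIf(expiresAt -> nowMillis >= expiresAt);
    }
}
```

In this model, an insert made after expiry but before the purge fails even though isVisible already returns false, which matches the symptom in the question: the lock looks removed, yet the save still conflicts.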
