Clear a DynamoDB table without specifying any key

Time:12-05

I want to truncate a DynamoDB table that can hold 3 to 4 million records. What is the best way to do it?

Right now I scan the table and delete the items one by one, which performs poorly (I have only tried it on a few records, 3):

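import java.util.Iterator;
import com.amazonaws.services.dynamodbv2.document.*;
import com.amazonaws.services.dynamodbv2.document.spec.DeleteItemSpec;

// amazonDynamoDBClient is an already-configured AmazonDynamoDB client
DynamoDB dynamoDB = new DynamoDB(amazonDynamoDBClient);
Table table = dynamoDB.getTable("table-test");
// Scan the whole table, then delete the items one request at a time
ItemCollection<ScanOutcome> resultItems = table.scan();
Iterator<Item> itemsItr = resultItems.iterator();
while (itemsItr.hasNext()) {
  Item item = itemsItr.next();
  String itemPk = item.getString("PK");
  String itemSk = item.getString("SK");
  DeleteItemSpec deleteItemSpec = new DeleteItemSpec().withPrimaryKey("PK", itemPk, "SK", itemSk);
  table.deleteItem(deleteItemSpec);
}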

CodePudding user response:

The best way is to delete your table and create a new one with the same name. This is how all data is usually cleared from a DynamoDB table.

CodePudding user response:

As Marcin already answered, the best way is to delete your table and create a new one. It is certainly the cheapest way - because any other way would require scanning the entire table and paying for the read capacity units required to do it.

In some cases, however, you might want to delete old items while the table is still actively used. In that case you can use a Scan as you intended, but much more efficiently than you did. First, don't run individual DeleteItem requests sequentially, waiting for one delete to complete before sending the next one: a single BatchWriteItem request can carry up to 25 deletes. You can also send multiple BatchWriteItem requests in parallel. Finally, for even faster deletion, you can parallelize the Scan itself across multiple threads or even machines; see the parallel scan section of the DynamoDB documentation. Just don't forget that if you delete items while the table is still being written to, you need a way to tell the old items you want to delete from the new items you want to keep, because the scan may start returning these new items as well.
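
To make the batching and parallelism above concrete, here is a minimal sketch using only the Java standard library. The class BatchDeleteSketch, its methods, and the sender function are hypothetical names, not part of any SDK: sender stands in for one BatchWriteItem call and returns the keys DynamoDB reported as unprocessed (for example after throttling). With the SDK document API from the question, it would presumably wrap DynamoDB.batchWriteItem and getUnprocessedItems, but that wiring is left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Sketch only: chunks keys into BatchWriteItem-sized batches and sends them in parallel. */
final class BatchDeleteSketch {
    // BatchWriteItem accepts at most 25 put/delete requests per call.
    static final int MAX_BATCH_SIZE = 25;

    /** Splits the keys into consecutive batches of at most MAX_BATCH_SIZE. */
    static <K> List<List<K>> chunk(List<K> keys) {
        List<List<K>> batches = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += MAX_BATCH_SIZE) {
            int end = Math.min(i + MAX_BATCH_SIZE, keys.size());
            batches.add(new ArrayList<>(keys.subList(i, end)));
        }
        return batches;
    }

    /**
     * Sends every batch on a fixed thread pool and resends unprocessed keys
     * until none remain. `sender` is a hypothetical stand-in for one
     * BatchWriteItem call. Returns the total number of sender calls made.
     * A real implementation should back off between retries.
     */
    static <K> int deleteAll(List<K> keys, Function<List<K>, List<K>> sender, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<K> batch : chunk(keys)) {
                results.add(pool.submit(() -> {
                    int calls = 0;
                    List<K> pending = batch;
                    while (!pending.isEmpty()) {
                        pending = sender.apply(pending); // keys left unprocessed
                        calls++;
                    }
                    return calls;
                }));
            }
            int totalCalls = 0;
            for (Future<Integer> result : results) {
                totalCalls += result.get(); // surfaces any failure from a worker
            }
            return totalCalls;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while deleting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The keys would come from the Scan (ideally one worker per parallel-scan segment), so for 3 to 4 million items you would feed deleteAll one page of scanned keys at a time rather than collecting every key in memory first.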

If you find yourself clearing old data from a table often, consider whether you can use DynamoDB's TTL feature, where DynamoDB automatically looks for expired items (based on an expiration-time attribute on each item) and deletes them at no cost to you.
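
As a small illustration of the expiration-time attribute, the sketch below computes the value to store on each item. DynamoDB expects the TTL attribute to be a Number holding Unix epoch time in seconds; the class and method names (TtlSketch, expiresAt, isExpired) and the retention period are assumptions made for this example. Because expired items are removed in the background rather than at the exact expiration time, readers may also want a check like isExpired to ignore items that are past due but not yet deleted.

```java
import java.time.Duration;
import java.time.Instant;

/** Sketch only: helpers for a TTL attribute stored as epoch seconds. */
final class TtlSketch {
    /** Value to store in the item's TTL attribute: Unix epoch time in seconds. */
    static long expiresAt(Instant createdAt, Duration retention) {
        return createdAt.plus(retention).getEpochSecond();
    }

    /** True when an item with this TTL value is already past its expiration. */
    static boolean isExpired(long ttlEpochSeconds, Instant now) {
        return ttlEpochSeconds <= now.getEpochSecond();
    }
}
```

For example, an item written now with a 30-day retention would store TtlSketch.expiresAt(Instant.now(), Duration.ofDays(30)) in whichever attribute the table's TTL setting names.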
