I want to get all the data from a DynamoDB table in Node.js. This is my code:
const READ = async (payload) => {
  const params = {
    TableName: payload.TableName,
  };
  let scanResults = [];
  let items;
  do {
    items = await dbClient.scan(params).promise();
    items.Items.forEach((item) => scanResults.push(item));
    params.ExclusiveStartKey = items.LastEvaluatedKey;
  } while (typeof items.LastEvaluatedKey != "undefined");
  return scanResults;
};
I implemented this and it works fine, but our code review tool flags it in red as not optimized or as causing a memory leak, and I cannot figure out why. I have read elsewhere that the DynamoDB Scan API is not the most efficient way to get all the data in Node.js. Or is there something else I am missing that would optimize this code?
CodePudding user response:
Do it like this only if your data set is very small (fewer than 100 items, or less than 1 MB of data; that is the threshold I prefer, and in that case you don't need a do-while loop, since a single Scan call returns up to 1 MB of data).
Think about the following scenario: what happens when more and more items are added to the DynamoDB table in the future? This code returns all of the data and puts it into the scanResults variable, right? That will impact memory. Also, a DynamoDB Scan operation is expensive, in terms of both memory and cost, because it reads every item in the table.
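To illustrate the memory point, here is a minimal sketch that processes the table one page at a time instead of collecting everything in one array. The names scanPages and countItems are hypothetical, and it assumes the client exposes scan(params).promise() the same way the dbClient in the question does.

```javascript
// Hypothetical helper: yields one page of items at a time instead of
// accumulating the whole table in a single array.
// Assumes `client` exposes scan(params).promise(), like dbClient in the question.
async function* scanPages(client, tableName) {
  const params = { TableName: tableName };
  do {
    const page = await client.scan(params).promise();
    yield page.Items;
    // LastEvaluatedKey is undefined once the last page has been read.
    params.ExclusiveStartKey = page.LastEvaluatedKey;
  } while (params.ExclusiveStartKey !== undefined);
}

// Usage sketch: only the current page is held in memory at any time.
async function countItems(client, tableName) {
  let count = 0;
  for await (const items of scanPages(client, tableName)) {
    count += items.length;
  }
  return count;
}
```

This still reads the whole table, so the cost of the Scan stays the same; only the peak memory use changes.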
It's perfectly okay to use the Scan operation if the data set is very small. Otherwise, go with pagination (I always prefer this). If there are thousands of items, who is going to look at all of them in one go? So use pagination instead.
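A rough sketch of what pagination could look like: fetch a single page per request and hand the caller a key for the next one. The function name readPage and its parameters are made up for illustration, and it again assumes a client with scan(params).promise() like the dbClient in the question. Limit and ExclusiveStartKey are standard Scan parameters.

```javascript
// Hypothetical helper: fetches a single page of at most `pageSize` items.
// Assumes `client` exposes scan(params).promise(), like dbClient in the question.
async function readPage(client, tableName, pageSize, startKey) {
  const params = { TableName: tableName, Limit: pageSize };
  if (startKey) {
    // Resume where the previous page stopped.
    params.ExclusiveStartKey = startKey;
  }
  const result = await client.scan(params).promise();
  return {
    items: result.Items,
    // undefined when there are no more pages to read
    nextKey: result.LastEvaluatedKey,
  };
}
```

The caller would send nextKey back with the next request, for example when the user clicks "next page", and stop when nextKey is undefined.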
Let's take another scenario: if your requirement is to retrieve all the data for analytics or aggregation, it is better to store the aggregate data up front in a table (the same or a different DynamoDB table) as an item, or to use an analytics database.
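As one possible shape for storing an aggregate up front, the sketch below keeps a running item count in a separate item and bumps it on every write. It assumes a DocumentClient-style client with update(params).promise(); the "Stats" table, the "pk" key, and the "itemCount" attribute are invented names, not something from the question.

```javascript
// Hypothetical helper: atomically bumps a counter item whenever a record is
// written, so the total can be read later with one get instead of a full scan.
// Assumes `client` is DocumentClient-like (update(params).promise()).
// "Stats", "pk", and "itemCount" are made-up names for illustration.
async function incrementItemCount(client, delta = 1) {
  const params = {
    TableName: "Stats",
    Key: { pk: "itemCount" },
    // ADD creates the attribute if it is missing, then adds the number to it.
    UpdateExpression: "ADD #count :delta",
    ExpressionAttributeNames: { "#count": "itemCount" },
    ExpressionAttributeValues: { ":delta": delta },
    ReturnValues: "UPDATED_NEW",
  };
  const result = await client.update(params).promise();
  return result.Attributes.itemCount;
}
```

Reading the aggregate then touches one small item rather than every item in the table.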
If your requirement is something else, please elaborate on it in the question.