DynamoDB Streams: How to specify a timestamp from where consumers start reading data?

Time:10-13

I have an application (both Lambda & a microservice) that reads from DynamoDB streams.

Is it possible to define a timestamp from where the application starts reading the data?

CodePudding user response:

Starting from a timestamp is not a supported access pattern for DynamoDB Streams.

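Based on the documentation, the only available access pattern is per shard: you pass a shard identifier to get a shard iterator, and that iterator can start at the oldest retained record (TRIM_HORIZON), after the newest one (LATEST), or at or after a known sequence number (AT_SEQUENCE_NUMBER, AFTER_SEQUENCE_NUMBER).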

There might, though, be a way to use the interval-halving (aka bisection) method to look up shard records and compare their ApproximateCreationDateTime against the timestamp you want.
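A minimal sketch of the timestamp lookup, assuming Python with boto3. Note that it uses a plain linear scan from TRIM_HORIZON rather than bisection, since I am not sure arbitrary midpoint sequence numbers are accepted by the API. `client` is assumed to be `boto3.client("dynamodbstreams")`; `records_since` and its parameter names are made up for illustration, and stopping at the first empty page is a simplification.

```python
def records_since(client, stream_arn, shard_id, start_time):
    """Yield records of one shard created at or after start_time.

    client is assumed to be boto3.client("dynamodbstreams").
    start_time must be a timezone-aware datetime, because boto3
    returns ApproximateCreationDateTime as an aware datetime.
    """
    iterator = client.get_shard_iterator(
        StreamArn=stream_arn,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # oldest record still retained
    )["ShardIterator"]
    while iterator:
        page = client.get_records(ShardIterator=iterator)
        if not page["Records"]:
            break  # simplification: treat an empty page as "caught up"
        for record in page["Records"]:
            created = record["dynamodb"]["ApproximateCreationDateTime"]
            if created >= start_time:
                yield record
        # A closed shard returns no NextShardIterator, which ends the loop.
        iterator = page.get("NextShardIterator")
```

The shard IDs to pass in come from `describe_stream`. Keep in mind that stream records are only retained for 24 hours, so the timestamp cannot reach further back than that.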

CodePudding user response:

After re-reading this question, I believe what you are asking for is a 'start location in DynamoDB' from where the Lambda will start reading data.

The answer to this is no, because that is not how streams work. DynamoDB Streams are not I/O streams of data to your Lambda; rather, change events are batched into a single JSON event that is sent to your Lambda when one of its conditions (number of events or time passed) is met. You have some options, like TRIM_HORIZON and LATEST, that give you some control over which events are sent and where the Lambda "starts", but this is not a 'start in the middle of the stream' sort of operation. These are single JSON events, sent as they are generated.
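To make the "where it starts" options concrete, here is a rough sketch of wiring the stream to a Lambda with boto3. `mapping_params` and `connect` are hypothetical helper names, and `lambda_client` is assumed to be `boto3.client("lambda")`. As far as I know, TRIM_HORIZON and LATEST are the only starting positions accepted for a DynamoDB stream source; AT_TIMESTAMP is for other sources such as Kinesis.

```python
# Starting positions Lambda accepts for a DynamoDB stream source.
DYNAMODB_STARTING_POSITIONS = {"TRIM_HORIZON", "LATEST"}


def mapping_params(stream_arn, function_name, starting_position="LATEST",
                   batch_size=100, batching_window_seconds=0):
    """Build keyword arguments for create_event_source_mapping."""
    if starting_position not in DYNAMODB_STARTING_POSITIONS:
        raise ValueError(f"unsupported starting position: {starting_position}")
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": starting_position,
        # The two batching conditions: number of events, or time passed.
        "BatchSize": batch_size,
        "MaximumBatchingWindowInSeconds": batching_window_seconds,
    }


def connect(lambda_client, **params):
    """Create the trigger; lambda_client is assumed to be boto3.client("lambda")."""
    return lambda_client.create_event_source_mapping(**mapping_params(**params))
```

Something like `connect(boto3.client("lambda"), stream_arn=arn, function_name="my-fn", starting_position="TRIM_HORIZON")` would start from the oldest retained record, where `arn` and `"my-fn"` are placeholders.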

It really depends on your use case here, but I'm guessing you want to be able to add a bunch of items to the table and NOT have those trigger the Lambda, then at a certain point have new items begin to trigger it.

If this is the case, you have two options:

  1. Add an attribute to the items you don't want to process. Have the Lambda check each record in the stream event for that attribute, and if it finds it, ignore that record.

  2. Use the SDK for your language to turn the stream on and off.
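For the on/off option, a minimal sketch with boto3 might look like this. `set_stream_enabled` is a made-up helper name, `dynamodb_client` is assumed to be `boto3.client("dynamodb")`, and the default view type is just one possible choice.

```python
def set_stream_enabled(dynamodb_client, table_name, enabled,
                       view_type="NEW_AND_OLD_IMAGES"):
    """Enable or disable the stream on a table.

    dynamodb_client is assumed to be boto3.client("dynamodb").
    Returns the latest stream ARN reported by the service, if any.
    """
    if enabled:
        spec = {"StreamEnabled": True, "StreamViewType": view_type}
    else:
        spec = {"StreamEnabled": False}
    response = dynamodb_client.update_table(
        TableName=table_name,
        StreamSpecification=spec,
    )
    # Re-enabling creates a new stream, so the ARN can differ from before.
    return response["TableDescription"].get("LatestStreamArn")
```

Because re-enabling gives you a new stream with a different ARN, the Lambda trigger has to be pointed at the new stream each time, which is part of why this route is more work.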

Option 1 is far less complicated, and probably the far better option.
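A rough sketch of the attribute check in a Python Lambda handler. The attribute name `skip_stream_processing` and the `handle_record` placeholder are assumptions for illustration. It also assumes the stream's view type includes the new image (NEW_IMAGE or NEW_AND_OLD_IMAGES); otherwise there is no `NewImage` to inspect.

```python
SKIP_ATTRIBUTE = "skip_stream_processing"  # hypothetical attribute name


def should_process(record):
    """Return False for records whose new item image carries the marker."""
    new_image = record.get("dynamodb", {}).get("NewImage", {})
    return SKIP_ATTRIBUTE not in new_image


def handle_record(record):
    """Placeholder for the real per-record work."""
    print(record["eventName"], record["dynamodb"].get("Keys"))


def handler(event, context):
    """Lambda entry point: process only records without the marker."""
    records = event.get("Records", [])
    kept = [r for r in records if should_process(r)]
    for record in kept:
        handle_record(record)
    return {"received": len(records), "processed": len(kept)}
```

The same pattern should work for the original timestamp question: compare each record's `dynamodb.ApproximateCreationDateTime` against a cutoff and skip anything older.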
