I have configured a DMS migration instance that replicates data from MySQL into an AWS Kinesis stream, but when I process the Kinesis records I pick up duplicate records. This does not happen for every record.
How do I prevent these duplicate records from being pushed to the Kinesis data stream or the S3 bucket?
I'm using a Lambda function to process the records, so I thought of adding logic to de-duplicate the data, but I'm not sure how to do that without persisting the data somewhere. I need to process the data in real time, so persisting the data would not be ideal.
Regards, Pragesan
CodePudding user response:
There is no built-in way to prevent AWS DMS from inserting duplicate records into Kinesis and S3. However, you can use a Lambda function to detect duplicates and drop them before they are inserted downstream.
CodePudding user response:
I added a global variable that stores the primary key (pk) of the last record processed, so each invocation compares the incoming pk with the previous value and inserts the record only if it is different.
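
A minimal sketch of that idea in Python, not the exact code I run. It assumes each Kinesis record carries a DMS JSON message with the row under a `data` key, and that the primary key column is called `id`; `PK_FIELD` and `process` are hypothetical names used only for illustration.

```python
import base64
import json

# Module-level state: it survives only between warm invocations of the same
# Lambda execution environment and is not shared across concurrent instances.
_last_pk = None

PK_FIELD = "id"  # hypothetical primary-key column name (assumption)


def extract_pk(kinesis_record):
    """Decode one Kinesis record and return (pk, payload)."""
    raw = base64.b64decode(kinesis_record["kinesis"]["data"])
    payload = json.loads(raw)
    # Assumes the replicated row sits under the "data" key of the message.
    return payload["data"][PK_FIELD], payload


def process(payload):
    """Hypothetical placeholder for the real downstream insert."""
    return payload


def lambda_handler(event, context):
    global _last_pk
    processed = []
    for record in event["Records"]:
        pk, payload = extract_pk(record)
        if pk == _last_pk:
            # Same pk as the previous record: treat it as a duplicate and skip.
            continue
        _last_pk = pk
        processed.append(process(payload))
    return {
        "processed": len(processed),
        "skipped": len(event["Records"]) - len(processed),
    }
```

Note that this only catches duplicates that arrive back to back in the same execution environment. Comparing the pk alone also means two genuine consecutive changes to the same row would be treated as one, so comparing the whole payload may be safer if that can happen in your data.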