MongoDB Atlas: Resume of change stream was not possible, as the resume point may no longer be in the oplog

I'm trying to use a MongoDB client called monstache to read and synchronise data between MongoDB and Elasticsearch.

But when I run monstache, it fails and shows the error below continuously:

Jan 26 01:04:59 ip-172-31-1-200.eu-west-3.compute.internal monstache[25689]: ERROR 2023/01/26 01:04:59 Error starting change stream. Will retry: (ChangeStreamHistoryLost) Resume of change stream was not possible, as the resume point may no longer be in the oplog.

I'm fairly new to MongoDB administration. Here is the check command I ran:

Atlas atlas-lmkye1-shard-0 [primary] test> rs.printReplicationInfo()
actual oplog size
'1782.485107421875 MB'
---
configured oplog size
'1782.485107421875 MB'
---
log length start to end
'513558.99999570847 secs (142.66 hrs)'
---
oplog first event time
'Fri Jan 20 2023 02:45:29 GMT+0000 (Coordinated Universal Time)'
---
oplog last event time
'Thu Jan 26 2023 01:24:48 GMT+0000 (Coordinated Universal Time)'
---
now
'Thu Jan 26 2023 01:24:57 GMT+0000 (Coordinated Universal Time)'
Atlas atlas-lmkye1-shard-0 [primary] test>

Any idea what the problem could be?

CodePudding user response：

First some background:

A MongoDB change stream returns documents from the operations log (oplog) that the replica set members use to keep in sync.
The oplog is a capped collection, so it automatically deletes the oldest entries to stay below its maximum size.
When you retrieve a batch of documents from a change stream, it includes a resume token, which identifies the point in the oplog where that batch was taken.

So if you attempt to resume a change stream, but the resume token refers to a point that is no longer in the oplog, you have no way to know whether or not you have missed anything.
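To make that concrete, here is a toy model in Python. It is only a sketch of the idea, not how MongoDB implements the oplog: `ToyOplog`, its integer timestamps, and the `resume_after` method are all made up for illustration. A bounded `deque` plays the role of the capped collection, and the integer returned by `append` plays the role of a resume token.

```python
from collections import deque


class ToyOplog:
    """Toy stand-in for a capped oplog: keeps only the newest `capacity` entries."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry when full,
        # which mimics a capped collection.
        self._entries = deque(maxlen=capacity)
        self._next_ts = 1

    def append(self, op):
        """Record an operation and return its timestamp (our pretend resume token)."""
        ts = self._next_ts
        self._entries.append((ts, op))
        self._next_ts += 1
        return ts

    def resume_after(self, token):
        """Return the operations recorded after `token`, or raise if history was lost."""
        oldest_ts = self._entries[0][0] if self._entries else self._next_ts
        if token < oldest_ts:
            # The entry the token points at has already been discarded,
            # so we cannot tell what was missed.
            raise LookupError("resume point may no longer be in the oplog")
        return [op for ts, op in self._entries if ts > token]


oplog = ToyOplog(capacity=3)
tokens = [oplog.append(f"op{i}") for i in range(1, 6)]  # only op3..op5 survive

print(oplog.resume_after(tokens[3]))  # token for op4 is still in the log

try:
    oplog.resume_after(tokens[0])  # token for op1 has been rolled off
except LookupError as exc:
    print("cannot resume:", exc)
```

The second call fails for the same reason monstache does: the token refers to an entry that has already been pushed out of the log.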

The checks you ran indicate that the oldest event in the oplog is ~6 days old, suggesting that either monstache was stalled for about six days or more, or the resume token it is using is not valid.
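If you want to double-check the "roughly six days" figure, the arithmetic can be reproduced from the two timestamps in your `rs.printReplicationInfo()` output. This is just a sketch: the `token_time` value and the `is_in_window` helper are hypothetical, since the post does not show the timestamp stored in monstache's resume token.

```python
from datetime import datetime, timezone

# Values copied from the rs.printReplicationInfo() output in the question.
first_event = datetime(2023, 1, 20, 2, 45, 29, tzinfo=timezone.utc)
last_event = datetime(2023, 1, 26, 1, 24, 48, tzinfo=timezone.utc)

window = last_event - first_event
window_hours = window.total_seconds() / 3600
window_days = window_hours / 24

print(f"oplog window: {window_hours:.2f} hrs ({window_days:.2f} days)")


def is_in_window(token_time):
    """True if a resume point at `token_time` is not older than the oldest oplog entry."""
    return token_time >= first_event


# Hypothetical example: a resume point saved on 10 Jan 2023 would be too old.
token_time = datetime(2023, 1, 10, 0, 0, 0, tzinfo=timezone.utc)
print("resumable:", is_in_window(token_time))
```

The computed window matches the 142.66 hrs reported by the shell, so any resume point older than 20 Jan 2023 02:45:29 UTC could not be resumed from at the time the check was run.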

CodePudding user response：

Going from there, it seemed like the monstache metadata was invalid.

So I dropped the monstache.monstache collection, which monstache uses for its sync operations, and restarted the monstache service. It's now syncing without error.