I am trying to implement a kind of sync service. Two clients with different user agents may POST/PATCH to /sync/user/{user_id}/resource at the same time with the same user_id. The sync should update the data for the user with id={user_id} in the DB.
func (syncServer *SyncServer) Upload(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
	userID := ps.ByName("user_id")
	// Reject the request if another upload for this user is still in progress.
	if isAlreadyProcessedForUser(userID) {
		w.WriteHeader(http.StatusConflict)
		return
	}
	...
	syncServer.db.Update(userID, data)
	...
}
The problem is that I have no idea how to correctly decline one Upload while another one is still processing a request for the same user_id. I think using mutex.Lock() is a bad idea, because this handler will run on many pods, and if Upload is called on different pods a local mutex won't help me anyway. What synchronization method can I use to solve this problem? Should I use an additional field in the DB? Any idea is welcome!
Answer:
There are many ways to do this (distributed locking) in a distributed system; here are some that come to mind:
- Use a Redis lock (or any other similar service). You can lock each `user_id` on receiving the first request and reject other requests for the same `user_id` because they will fail to acquire the lock. Redis locks generally have an expiration time, so you won't deadlock. Ref: https://redis.io/docs/reference/patterns/distributed-locks/ (see the first sketch after this list).
- Use a database lock. You should be careful with database locks, but a simple way to do this is with a unique index: create an `uploading` record with a `unique(user_id)` constraint before the upload and delete it after the upload. It's possible to forget, or fail, to delete the record and cause a deadlock, so you might want to add an `expired_at` field to the record and check & drop stale records before uploading (see the second sketch after this list).
- (Specific to the question's scenario) Use a unique constraint on `(user_id, upload_status)` that is only enforced when `upload_status = 'uploading'`; this is called a partial index. Then you can create an `uploading` record for each request and reject the competing request. Expiration is also needed, so track the `start_time` of the upload and clean up long-running `uploading` records. If you don't need to reclaim the disk space, you can simply mark the record as `failed`; that way you can also track when and how these uploads failed in the database (see the third sketch after this list).
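Here is a minimal sketch of the Redis option, assuming the go-redis client (github.com/redis/go-redis/v9) and a hypothetical `redis` field on `SyncServer`; the key prefix and TTL are placeholders:

```go
package main

import (
	"context"
	"net/http"
	"time"

	"github.com/julienschmidt/httprouter"
	"github.com/redis/go-redis/v9"
)

type SyncServer struct {
	redis *redis.Client // hypothetical field; the real SyncServer also holds db, etc.
}

// tryLockUser issues SET sync:lock:<user_id> <value> NX EX 30: it succeeds only
// if the key does not exist yet, so exactly one pod wins for a given user_id.
func (s *SyncServer) tryLockUser(ctx context.Context, userID string) (bool, error) {
	return s.redis.SetNX(ctx, "sync:lock:"+userID, "processing", 30*time.Second).Result()
}

// unlockUser releases the lock; the TTL covers the case where this never runs.
func (s *SyncServer) unlockUser(ctx context.Context, userID string) {
	s.redis.Del(ctx, "sync:lock:"+userID)
}

func (s *SyncServer) Upload(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
	userID := ps.ByName("user_id")
	ok, err := s.tryLockUser(r.Context(), userID)
	if err != nil {
		w.WriteHeader(http.StatusInternalServerError)
		return
	}
	if !ok {
		// Another pod is already processing an upload for this user.
		w.WriteHeader(http.StatusConflict)
		return
	}
	defer s.unlockUser(r.Context(), userID)
	// ... parse the body and update the DB here ...
	w.WriteHeader(http.StatusOK)
}
```

`SetNX` maps to `SET key value NX EX`, so only one pod can create the key for a given user. For a stricter release (only delete a lock you still own), store a unique token as the value and verify it before deleting, as described in the Redis doc linked above.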
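A sketch of the unique-index variant with `database/sql`, assuming a PostgreSQL-style table such as `sync_uploads(user_id TEXT PRIMARY KEY, expired_at TIMESTAMPTZ NOT NULL)`; the table, column, and function names are illustrative:

```go
package main

import (
	"context"
	"database/sql"
	"time"
)

type SyncLocker struct {
	db *sql.DB
}

// TryLock inserts the "uploading" record; the primary key on user_id makes the
// INSERT fail while another upload for the same user is in flight.
func (l *SyncLocker) TryLock(ctx context.Context, userID string) (bool, error) {
	// Drop a stale record first so a crashed pod cannot deadlock the user forever.
	if _, err := l.db.ExecContext(ctx,
		`DELETE FROM sync_uploads WHERE user_id = $1 AND expired_at < now()`, userID); err != nil {
		return false, err
	}
	_, err := l.db.ExecContext(ctx,
		`INSERT INTO sync_uploads (user_id, expired_at) VALUES ($1, $2)`,
		userID, time.Now().Add(5*time.Minute))
	if err != nil {
		// A unique-constraint violation means another upload holds the lock; in real
		// code, distinguish that case from other errors via the driver's error type.
		return false, nil
	}
	return true, nil
}

// Unlock deletes the record once the upload finishes (or fails).
func (l *SyncLocker) Unlock(ctx context.Context, userID string) error {
	_, err := l.db.ExecContext(ctx, `DELETE FROM sync_uploads WHERE user_id = $1`, userID)
	return err
}
```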
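And a sketch of the partial-index variant (PostgreSQL syntax); the table and index names are illustrative. Inserting a second `'uploading'` row for the same `user_id` violates the partial unique index, which you can map to a 409 response, and a periodic cleanup can flip rows whose `start_time` is too old to `'failed'`:

```go
// DDL for the partial-index approach: the unique index only applies to rows
// still in the 'uploading' state, so finished or failed rows never block a new upload.
const uploadsSchema = `
CREATE TABLE IF NOT EXISTS uploads (
    id            BIGSERIAL   PRIMARY KEY,
    user_id       TEXT        NOT NULL,
    upload_status TEXT        NOT NULL,            -- 'uploading', 'done', 'failed'
    start_time    TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX IF NOT EXISTS one_active_upload_per_user
    ON uploads (user_id)
    WHERE upload_status = 'uploading';
`
```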
CAUTION:
- It seems that you're using Kubernetes, so any non-distributed lock should be used cautiously, depending on the level of consistency you want to achieve. Pods are volatile, and it's hard to rely on local state and still stay consistent, because pods might be duplicated, killed, or re-scheduled onto another machine. This also applies to any other platform with auto-scaling or scheduling mechanisms.
- A syncing process between a server and several clients owned by one user needs to handle at least request ordering, request deduplication, and eventual consistency (e.g. Google Docs supports many people editing at the same time). There are some generic algorithms for this (like operational transformation), but the right choice depends on your specific use case.