I have a Node.js service hosted on Google Cloud Run that uses Socket.IO to push events to the browser client while the service instance is running.
However, the behavior is inconsistent: sometimes when the server emits a socket event the client receives it immediately, and at other times the event never reaches the client. It happens so randomly that it is hard to reproduce or to pin down where the events are being lost.
Below is my client code:
client_socket.js
import io from "socket.io-client";

const socketUrl = EndPoints.SOCKET_IO_BASE;
const socketOptions = { transports: ["websocket"] };

let socket;
if (!socket) {
  socket = io(socketUrl, socketOptions);

  socket.on('connect', () => {
    console.log(`Connected to Server`);
  });

  socket.on('disconnect', () => {
    // This never gets called while the Cloud Run service instance is running,
    // so I assume a disconnect never happened.
    console.log(`Disconnected from Server`);
  });
}

export default socket;
Oddly, no disconnect event is ever fired on the client while the Cloud Run service instance is running, which suggests the client is still connected to the service. So it is strange that it sometimes misses events from the server even while being connected.
Please note that on the Cloud Run side I have set the timeout of my service to 3600s, which should be more than enough to keep the socket connection in place.
CodePudding user response:
Based on the Cloud Run documentation on best practices for WebSockets:
The most difficult part of creating WebSockets services on Cloud Run is synchronizing data between multiple Cloud Run container instances. This is difficult because of the autoscaling and stateless nature of container instances, and because of the limits for concurrency and request timeouts.
One suggestion is to use session affinity. If enabled, Cloud Run routes sequential requests from a given client to the same container instance, using a session affinity cookie with a TTL of 30 days whose value it inspects to identify requests from the same client. Still, it is not guaranteed that requests will be served by the same instance.
Also, at the time of writing, this feature is still in preview and may change while in development.
The recommended approach is to synchronize instances through external data storage such as a database (Cloud SQL) or an external message queue (Redis Pub/Sub on Memorystore, or Firestore real-time updates) that can deliver updates to all instances over connections initiated by the container instance.
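To make the idea concrete, here is a minimal in-process sketch of why an emit can silently go nowhere and how a shared broker fixes it. It does not use Socket.IO or Redis: `Broker`, `Instance`, `emitToClient` and `demo` are hypothetical names invented for this illustration, with `Broker` standing in for something like Redis Pub/Sub and each `Instance` standing in for one Cloud Run container holding its own socket connections. In a real deployment, Socket.IO's Redis adapter (the `@socket.io/redis-adapter` package) is the usual way to get this kind of fan-out.

```javascript
// Hypothetical stand-in for an external broker such as Redis Pub/Sub.
class Broker {
  constructor() {
    this.subscribers = [];
  }
  subscribe(handler) {
    this.subscribers.push(handler);
  }
  publish(message) {
    // Every subscribed instance sees every message.
    for (const handler of this.subscribers) handler(message);
  }
}

// Hypothetical stand-in for one container instance with its own connections.
class Instance {
  constructor(name, broker) {
    this.name = name;
    this.broker = broker; // undefined means "no shared broker"
    this.clients = new Map(); // clientId -> list of events delivered to that client
    if (broker) {
      // Relay broker messages to whichever clients are connected here.
      broker.subscribe((msg) => this.deliverLocally(msg.clientId, msg.event, msg.payload));
    }
  }

  // A client's WebSocket lands on exactly one instance.
  connect(clientId) {
    const inbox = [];
    this.clients.set(clientId, inbox);
    return inbox;
  }

  // Returns false when the client is not connected to this instance.
  deliverLocally(clientId, event, payload) {
    const inbox = this.clients.get(clientId);
    if (!inbox) return false;
    inbox.push({ event, payload });
    return true;
  }

  emitToClient(clientId, event, payload) {
    if (this.broker) {
      // With a broker, publish so the instance that owns the socket can deliver.
      this.broker.publish({ clientId, event, payload });
      return true;
    }
    // Without a broker, only local sockets can be reached.
    return this.deliverLocally(clientId, event, payload);
  }
}

function demo() {
  // Case 1: no broker. The client is connected to A, but B handles the work.
  const a = new Instance("a");
  const b = new Instance("b");
  const inboxNoBroker = a.connect("client-1");
  b.emitToClient("client-1", "job-done", { id: 1 }); // lost, with no disconnect

  // Case 2: both instances share a broker.
  const broker = new Broker();
  const c = new Instance("c", broker);
  const d = new Instance("d", broker);
  const inboxWithBroker = c.connect("client-1");
  d.emitToClient("client-1", "job-done", { id: 1 }); // relayed to c, delivered

  return { withoutBroker: inboxNoBroker.length, withBroker: inboxWithBroker.length };
}

console.log(demo());
```

The first case matches the symptom in the question: the client stays connected to its original instance, so no disconnect is ever reported, yet an event emitted from a different instance has no socket to travel over. Whether this is what is happening in your service depends on whether more than one instance is running when the events go missing, so it is worth checking the instance count in the Cloud Run metrics first.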