Intermittent connection failures to databases from Google Cloud Run

Time:08-29

I am running several Node.js v16.16 applications in Google Cloud Run. These applications connect to a Google Cloud SQL server and a MongoDB Atlas cluster through a VPC connector with a static IP address that is whitelisted on both the MongoDB Atlas cluster and the SQL server.

We are experiencing intermittent connection failures to these databases when new instances are started, as if the IP address were not whitelisted. For connections using Mongoose we get a MongooseServerSelectionError: Server selection timed out after 30000 ms at NativeConnection.Connection.openUri. For connections using Sequelize we get an Error connecting MySQL: SequelizeConnectionError: connect ETIMEDOUT

Things I have established by testing, logging and checking configurations:

  • Failures seem to persist for several consecutive minutes before connections succeed normally again.
  • When Mongoose fails to connect, Sequelize does not always fail to connect, but the Sequelize connection only fails when the Mongoose connection also fails.
  • Connections are opened from the correct IP address and there is a connection to the internet
    • I have added an HTTP request to https://api.ipify.org?format=json one line before connecting to the databases and logged the results.
    • Cloud Run application is configured to route all egress traffic through the VPC connector
  • The VPC connector is not overburdened
    • The connector is configured to start up to 10 instances, only 2 are active.
    • Traffic does not go over several KiB/s
  • The MongoDB Atlas cluster has enough room for new connections
    • The cluster is capable of >1500 connections per replica and there are only ~150 connections on the primary and ~50 connections on the secondaries currently.
    • Replica CPU usage does not exceed 25% and averages ~5%
  • I do not seem to be having any issues connecting to the MySQL database or MongoDB cluster using client applications on my laptop from within our (also whitelisted) VPN.
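
For reference, the egress IP check described in the list above could look roughly like the sketch below. It uses Node's built-in https module and assumes ipify answers with a JSON body of the form {"ip": "..."}. The helper names `fetchEgressIp` and `parseIpifyBody` are made up for illustration and do not come from the original application.

```javascript
const https = require("https");

// Sketch only: `fetchEgressIp` and `parseIpifyBody` are hypothetical helper names.

/** Extracts the address from an ipify JSON body such as {"ip":"203.0.113.7"}. */
function parseIpifyBody(body) {
    const { ip } = JSON.parse(body);
    if (typeof ip !== "string") {
        throw new Error("Unexpected ipify response: " + body);
    }
    return ip;
}

/** Resolves with the public IP address that outgoing requests appear to come from. */
function fetchEgressIp() {
    return new Promise((resolve, reject) => {
        https
            .get("https://api.ipify.org?format=json", (res) => {
                let body = "";
                res.setEncoding("utf8");
                res.on("data", (chunk) => (body += chunk));
                res.on("end", () => {
                    try {
                        resolve(parseIpifyBody(body));
                    } catch (err) {
                        reject(err);
                    }
                });
            })
            .on("error", reject);
    });
}
```

Awaiting `fetchEgressIp()` and logging the result immediately before the database connection attempt shows which address the instance is using at that moment.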

The intermittent connection failures are a problem because, when they happen, the instance fails to start and the user gets a 503 error response.

Versions:

  • Node: v16.16.0
  • Mongoose: v6.4.0
  • Sequelize: v6.21.0

Answer:

We have found the issue. We were initializing the database connection right after we started listening for incoming requests (example below). We changed this to connect to the database before listening for requests and haven't had an error since. Exactly why this happens is unknown, but I suspect it has something to do with the process Google Cloud Run uses to prepare an instance for handling requests.

Old code:

// start the Express server
app.listen(port, async () => {
    database.connect();
    console.log("Server started!");
});

New code:

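// connect to the database first, then start the Express server
database.connect()
    .then(() => {
        app.listen(port, () => {
            console.log("Server started!");
        });
    })
    .catch((err) => {
        // fail fast instead of leaving an unhandled rejection
        console.error("Database connection failed:", err);
        process.exit(1);
    });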
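
Since the question involves two databases, the same ordering can be sketched with async/await so that every connection is established before the server starts listening. This is only an illustration of the idea, not the code we actually deployed. `startAfterConnecting` is a hypothetical helper, and the connectors and the `listen` function are passed in so the sketch does not depend on any particular library.

```javascript
// Sketch only: `startAfterConnecting` is a hypothetical helper, not part of
// Express, Mongoose, or Sequelize.

/**
 * Waits for every database connection, then starts listening.
 * @param {Array<() => Promise<unknown>>} connectors functions that each open one connection
 * @param {(onListening: () => void) => unknown} listen starts the server and calls back when ready
 * @returns {Promise<void>} resolves once the server is listening
 */
async function startAfterConnecting(connectors, listen) {
    // Open all connections in parallel. If any of them rejects, this await
    // throws and `listen` is never called.
    await Promise.all(connectors.map((connect) => connect()));

    // Only now start accepting requests.
    await new Promise((resolve) => listen(resolve));
}
```

With the libraries from the question, the connectors could be something like `() => mongoose.connect(uri)` and `() => sequelize.authenticate()`, and `listen` could be `(cb) => app.listen(port, cb)`, where `uri`, `sequelize`, `app` and `port` stand in for the application's own values. A rejection from `startAfterConnecting` should still be caught and logged by the caller.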