We're using Azure Functions with the Java SDK and connect to Cosmos DB using the following Java API:
CosmosClient client = new CosmosClientBuilder()
    .endpoint("https://my-cosmos-project-xyz.documents.azure.com:443/")
    .key(key)
    .consistencyLevel(ConsistencyLevel.SESSION)
    .buildClient();
The buildClient() call opens a connection to Cosmos DB, which takes 2 to 3 seconds.
The subsequent database queries using that client are fast.
Only this first setup of the connection is pretty slow.
We keep the CosmosClient in a static variable so we can reuse it across multiple HTTP requests to our function.
But once the function goes cold (Azure shuts it down after a few minutes of inactivity), the static variable is lost and the client has to reconnect when the function starts up again.
Is there a way to make this initial connection to Cosmos DB faster?
Or do we need to increase the time a function stays online if we need faster response times?
Answer:
This is expected behavior, see https://youtu.be/McZIQhZpvew?t=850.
The first request a client makes has to go through a warm-up step. This warm-up consists of fetching the account information, container information, and routing and partitioning information, which the client needs in order to know where to route requests (as you experienced, subsequent requests do not incur this extra latency). Hence the importance of maintaining a singleton instance.
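To illustrate the singleton point, here is a minimal sketch of a lazily initialized holder that builds an expensive client once per function instance and hands the same object to every later invocation. `LazyClientHolder` is a hypothetical helper name, not part of the Cosmos DB SDK, and it is generic so the sketch stays self-contained. It does not remove the warm-up cost, it only ensures the cost is paid once per instance.

```java
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of any SDK): builds an expensive object
 * on first use and returns the same instance on every later call.
 */
final class LazyClientHolder<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    LazyClientHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Thread-safe lazy initialization using double-checked locking. */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    // The slow step (e.g. buildClient()) runs only here.
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}

// Assumed usage inside a function class, reusing the builder calls from the question:
//
//   private static final LazyClientHolder<CosmosClient> CLIENT =
//       new LazyClientHolder<>(() -> new CosmosClientBuilder()
//           .endpoint("https://my-cosmos-project-xyz.documents.azure.com:443/")
//           .key(key)
//           .consistencyLevel(ConsistencyLevel.SESSION)
//           .buildClient());
```

A plain static field initialized at class load works too. The lazy variant just defers the build until the first request actually needs the client.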
In some Functions plans (Consumption), instances get de-provisioned when there is no activity. In that case any existing client instance is destroyed, so when a new instance is provisioned, your first request pays this warm-up cost again.
There is currently no workaround that I'm aware of in the Java SDK, but this should not affect your P99 latency, since only the first request on a cold client is slow.
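The P99 claim can be checked with a little arithmetic. The sketch below uses made-up numbers: 1000 requests served by one instance, one cold request at 2500 ms (in the 2 to 3 second range from the question) and an assumed 20 ms for every warm request. The class and method names are hypothetical, and the percentile uses the simple nearest-rank definition.

```java
import java.util.Arrays;

final class ColdStartPercentile {

    /** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Hypothetical workload: the first request is cold, all others are warm. */
    static long[] workload(int total, long coldMs, long warmMs) {
        long[] latencies = new long[total];
        Arrays.fill(latencies, warmMs);
        latencies[0] = coldMs;
        return latencies;
    }

    public static void main(String[] args) {
        long[] latencies = workload(1000, 2500, 20);
        System.out.println("P99 = " + percentile(latencies, 99) + " ms");  // prints "P99 = 20 ms"
        System.out.println("max = " + percentile(latencies, 100) + " ms"); // prints "max = 2500 ms"
    }
}
```

This only holds when cold requests are a small fraction of traffic. With fewer than about 100 requests per cold start, the single slow request does land in the P99 under this definition.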
Hope this and the video help explain the reason.