I'm using App Engine to concurrently handle a number of long-running tasks (therefore I need to use basic scaling).
I noticed with one instance, only 8 tasks can be handled simultaneously (consistent with the number of workers for a B4 instance). For the ninth task I receive:
POST 503: Request was aborted after waiting too long to attempt to service your request.
How can I handle more tasks than this simultaneously without adding more instances?
CodePudding user response:
As a best practice, the number of workers you specify should match the instance class of your App Engine app. You can change it, however, by setting the number of workers in the entrypoint, as in the example below, and see whether that works for you.
entrypoint: gunicorn -b :8080 -w 2 main:app
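If the tasks spend most of their time waiting on I/O, one option for serving more than one request per worker is to add threads. The following is a minimal sketch of a gunicorn config file, not a tested App Engine setup: the file name gunicorn.conf.py and the values 8 and 4 are assumptions, and it would be loaded with an entrypoint such as gunicorn -c gunicorn.conf.py main:app.

```python
# gunicorn.conf.py: hypothetical config file, values are illustrative only.
import os

# App Engine supplies the listening port in the PORT environment variable;
# 8080 is used as a fallback for local runs.
bind = f":{os.environ.get('PORT', '8080')}"

# Number of worker processes. 8 mirrors the limit observed on the B4
# instance in the question; each extra process costs memory.
workers = 8

# Threads per worker (assumed value). With more than one thread gunicorn
# switches to its threaded worker, so each process can serve several
# requests at once.
threads = 4

# 0 disables gunicorn's worker timeout (default 30 seconds), which can
# otherwise restart workers that are busy with long-running requests.
timeout = 0
```

With these settings a single instance could accept up to workers * threads requests at once, provided the instance has enough memory and CPU for them.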
Note that a service with basic scaling is configured by setting the maximum number of instances in the max_instances parameter of the basic_scaling setting. The number of live instances scales with the processing volume; to control the number of instances yourself, switch to manual scaling.
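As a rough illustration of how these numbers combine, the sketch below estimates an upper bound on simultaneous requests. It assumes each worker thread handles one request at a time; the function name and the alternative values are hypothetical, and real capacity also depends on memory and CPU.

```python
def max_simultaneous_requests(instances: int, workers: int, threads: int = 1) -> int:
    """Upper bound on requests in flight, assuming one request per worker thread."""
    if min(instances, workers, threads) < 1:
        raise ValueError("instances, workers and threads must all be >= 1")
    return instances * workers * threads


# The setup from the question: one B4 instance with 8 single-threaded workers.
print(max_simultaneous_requests(instances=1, workers=8))  # 8

# Hypothetical alternatives: add threads per worker, or raise max_instances.
print(max_simultaneous_requests(instances=1, workers=8, threads=4))  # 32
print(max_simultaneous_requests(instances=3, workers=8))  # 24
```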
If you use basic scaling, App Engine attempts to keep your cost low, even though that may result in higher latency as the volume of incoming requests increases.
If you tune the scaling settings to reduce costs by minimizing idle instances, then you run the risk of seeing latency spikes if the load increases unexpectedly.
The basic scaling type is designed to minimize costs at the expense of latency. Your code needs to scale the number of workers based on processing volume. If your code does not handle scaling, you risk wasting computing resources when there are no tasks to process, and you risk higher latency when there are too many tasks to process. A good way to speed up requests is to make use of multiple caching layers.
This article is helpful for managing the instance settings and tuning them to get the desired performance.
CodePudding user response:
Have you tried increasing max_concurrent_requests in your app.yaml? It should default to handling 10 requests at a time.
https://cloud.google.com/appengine/docs/standard/python3/config/appref#max_concurrent_requests