So, I started working with uWSGI for my Python application just two days ago and I'm trying to understand the various parameters we specify in an .ini file. This is what my app.ini file currently looks like:
# The following article was referenced while creating this configuration
# https://www.techatbloomberg.com/blog/configuring-uwsgi-production-deployment/
[uwsgi]
strict = true ; Only valid uWSGI options are tolerated
master = true ; The master uWSGI process is necessary to gracefully re-spawn and pre-fork workers,
; consolidate logs, and manage many other features
enable-threads = true ; To run uWSGI in multithreading mode
vacuum = true ; Delete sockets during shutdown
single-interpreter = true ; Run a single Python interpreter per worker process (one app per worker)
die-on-term = true ; Shutdown when receiving SIGTERM (default is respawn)
need-app = true
;disable-logging = true ; By default, uWSGI has rather verbose logging. Ensure that your
;log-4xx = true ; application emits concise and meaningful logs. Uncomment these lines
;log-5xx = true ; to disable request logging except for 4xx/5xx responses
cheaper-algo = busyness
processes = 128 ; Maximum number of workers allowed
cheaper = 1 ; Minimum number of workers allowed - default 1
cheaper-initial = 2 ; Workers created at startup
cheaper-overload = 60 ; Will check busyness every 60 seconds.
cheaper-step = 3 ; How many workers to spawn at a time
auto-procname = true ; Identify the workers
procname-prefix = "rhs-svc " ; Note the space. uWSGI logs will be prefixed with "rhs-svc"
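For completeness, the ini above doesn't show the entry point the log's "WSGI app 0" line refers to; a minimal sketch of such an app is below. The callable name and the in-memory "shell" behavior are assumptions based on the question (the ini would also need something like module = app:application to point at it):

```python
# Hypothetical minimal WSGI callable this config could point at.
# Mirrors the "no business logic, in-memory template" app described
# in the question; names here are assumptions, not from the post.

def application(environ, start_response):
    """Simplest possible WSGI app: one fixed plain-text response."""
    body = b"hello from rhs-svc"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```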
When I start uWSGI, this is what I see:
[uWSGI] getting INI configuration from app.ini
*** Starting uWSGI 2.0.19.1 (64bit) on [Thu Sep 30 10:49:45 2021] ***
compiled with version: Apple LLVM 12.0.0 (clang-1200.0.32.29) on 29 September 2021 23:55:27
os: Darwin-19.6.0 Darwin Kernel Version 19.6.0: Thu Sep 16 20:58:47 PDT 2021; root:xnu-6153.141.40.1~1/RELEASE_X86_64
nodename: sth-sth-sth
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 12
current working directory: /Users/sth.sth/My-Microservice
detected binary path: /Users/sth.sth/My-Microservice/venv/bin/uwsgi
your processes number limit is 2784
your memory page size is 4096 bytes
detected max file descriptor number: 10240
lock engine: OSX spinlocks
thunder lock: disabled (you can enable it with --thunder-lock)
uWSGI http bound on :9000 fd 4
[busyness] settings: min=25%, max=50%, overload=60, multiplier=10, respawn penalty=2
uwsgi socket 0 bound to TCP address 127.0.0.1:57164 (port auto-assigned) fd 3
Python version: 3.9.6 (default, Jun 29 2021, 06:20:32) [Clang 12.0.0 (clang-1200.0.32.29)]
Python main interpreter initialized at 0x7fd32b905bf0
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 9403584 bytes (9183 KB) for 128 cores
*** Operational MODE: preforking ***
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7fd32b905bf0 pid: 78422 (default app)
spawned uWSGI master process (pid: 78422)
spawned uWSGI worker 1 (pid: 78423, cores: 1)
spawned uWSGI worker 2 (pid: 78424, cores: 1)
spawned uWSGI http 1 (pid: 78425)
I'm running this on macOS Catalina with a 6-core i7 CPU. Why does it say detected number of CPU cores: 12 when I have just 6? It says your processes number limit is 2784 - can I really raise processes = 128 to processes = 2784? In the docs, it is mentioned that processes = 2 * cpucores is too simple a metric to adhere to. What metrics should I ideally be considering? My application is currently a shell (i.e. no business logic - just in-memory get-value/set-value stuff for now; I'm essentially building a template) and we don't expect any DB connections or IO-intensive operations yet. How do I determine a good thread:process ratio? I apologize if my questions are too basic, but I'm very new to this.
CodePudding user response:
Cores vs Processors
First of all, the number of cores is not necessarily the number of processors. On early computers it was a 1:1 mapping, but modern processors offer more than one core, and each physical core can expose two logical cores via Intel's Hyper-Threading - which is why your 6-core i7 is detected as 12 cores. (Check this: https://www.tomshardware.com/news/cpu-core-definition,37658.html). So, if uWSGI detected 12 cores, you can use that as the basis for your calculations.
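You can see the same logical count uWSGI reports from Python itself with os.cpu_count(), and on macOS compare it to the physical count via sysctl (a macOS/BSD-specific key, so the call is guarded here rather than assumed to work everywhere):

```python
import os
import subprocess

# Logical cores - this is the number uWSGI reports (12 on your machine).
logical = os.cpu_count()
print("logical cores:", logical)

# Physical core count; the hw.physicalcpu sysctl key exists on
# macOS/BSD only, so guard the call for other platforms.
try:
    out = subprocess.run(
        ["sysctl", "-n", "hw.physicalcpu"],
        capture_output=True, text=True, check=True,
    )
    print("physical cores:", int(out.stdout.strip()))
except (FileNotFoundError, subprocess.CalledProcessError):
    print("hw.physicalcpu not available on this platform")
```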
WSGI processes
The number of processes means how many parallel instances of your web application will be running on that server. uWSGI first creates a master process that coordinates things. It then bootstraps your application and creates N clones of it (via fork). These forked child processes are isolated; they don't share resources. If one process becomes unhealthy for any reason (e.g. I/O problems), it can terminate or even be voluntarily killed by the master process while the rest of the clones keep working, so your application stays up and running. When a process is terminated or killed, the master process creates a fresh clone to replace it (re-spawn).
It is OK to set the number of processes as a ratio of the available cores, but there's no benefit in increasing it too far, so you definitely shouldn't set it to the limit (2784). Remember that the operating system will round-robin across all processes to give each one a chance to execute some instructions. If it offers 12 cores and you create around 1000 processes, you are just putting stress on the system and you'll end up with the same throughput (or even worse, given all the context-switching overhead).
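The isolation between forked workers can be seen with a small multiprocessing sketch (a stand-in for uWSGI's fork model, not uWSGI itself): a child mutating its copy of a variable leaves the parent's copy untouched.

```python
import multiprocessing as mp

counter = {"hits": 0}

def worker():
    # This mutates the *child's* copy only; a forked/spawned child
    # process does not share memory with its parent.
    counter["hits"] += 1

if __name__ == "__main__":
    p = mp.Process(target=worker)
    p.start()
    p.join()
    print(counter["hits"])  # still 0 in the parent
```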
Number of threads inside a process
Then we move on to the number of threads. For the sake of simplicity, let's just say that the number of threads is the number of parallel requests each of these child processes can handle. While one thread is waiting for a database response to answer one request, another thread can be doing something else to answer another request. (Note that in CPython the GIL keeps threads from executing Python bytecode truly simultaneously, so threads mostly help with I/O-bound work such as waiting on databases or network calls.)
You may say: why do I need multiple threads if I already have multiple processes?
A process is an expensive thing, while threads are simply the way to parallelize the workload a single process can handle. Imagine that a process is a coffee shop and threads are the attendants working inside it. You can spread 10 coffee shop units around the city; if one of them closes, there are still 9 others the customer can go to. But each shop needs enough attendants to serve people as well as possible.
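The "attendants" idea maps directly onto a thread pool. In the sketch below, time.sleep stands in for an I/O wait such as a database call (an assumption, not your app): 10 threads finish 10 blocking requests in roughly one wait period instead of ten.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(i):
    time.sleep(0.1)  # stand-in for waiting on a DB/network response
    return f"response {i}"

start = time.perf_counter()
# 10 "attendants" overlap their waits instead of serving one by one.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(handle_request, range(10)))
elapsed = time.perf_counter() - start

print(f"{len(results)} requests in {elapsed:.2f}s")  # ~0.1s, not ~1.0s
```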
How to set these numbers correctly
If you set just a single process with 100 threads, then 100 is your concurrency limit. If at some point there are 101 concurrent requests to your application, the last one will have to wait for one of the first 100 to finish. That's when response times start to climb for some users, and the more requests are queued, the worse it gets (queuing theory).
Besides that, since you have a single process, if it crashes, all 100 in-flight requests will fail with a server error (500). So, it's wiser to have more processes - say, 4 processes handling 25 threads each. You still have a concurrency limit of 100, but your application is more resilient.
It's up to you to get to know your application's expected load so you can adjust these numbers properly. When you have external integrations like databases, you have to consider their limits as well. Say you have a PostgreSQL server that can handle 100 simultaneous connections. If you have 10 WSGI processes with 40 threads each (and a connection pool of size 40 per process to match), you could stress the database with up to 400 connections - a big problem. But that's not your case!
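The arithmetic behind that warning is worth making explicit; the numbers below are the hypothetical ones from the example, not measured values:

```python
# Hypothetical sizing check: the worst-case number of DB connections
# is processes * per-process pool size, and it must stay under the
# database's connection limit.
processes = 10
pool_per_process = 40      # connection pool of 40 in each process
db_max_connections = 100   # the example PostgreSQL server's cap

worst_case = processes * pool_per_process
print(worst_case)                        # 400 - 4x over the limit
print(worst_case <= db_max_connections)  # False: the DB would be overloaded

# A safe per-process pool keeps the total at or under the cap:
safe_pool = db_max_connections // processes
print(safe_pool)  # 10 connections per process -> 100 total
```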
So, just use the suggested number of processes (12 * 2 = 24) and set as many threads as needed to offer the desired level of concurrency.
If you don't know the expected load, I suggest running some sort of performance test that simulates requests to your application, so you can experiment with different loads and settings and observe their side effects.
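A minimal sketch of such a test, assuming a stand-in handler (time.sleep simulating the endpoint's latency) rather than real HTTP calls; for the real thing you would fire requests at http://localhost:9000 with a client, or use a dedicated load tool:

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def fake_endpoint():
    # Stand-in for one HTTP request to the service; replace with a
    # real client call (e.g. urllib.request against localhost:9000).
    time.sleep(0.05)

def measure(concurrency, total_requests=50):
    """Fire total_requests calls at a given concurrency and return
    the median observed latency in seconds."""
    latencies = []
    def timed_call(_):
        t0 = time.perf_counter()
        fake_endpoint()
        latencies.append(time.perf_counter() - t0)  # list.append is thread-safe
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(timed_call, range(total_requests)))
    return statistics.median(latencies)

for c in (1, 5, 25):
    print(f"concurrency={c:>2}  median latency={measure(c) * 1000:.1f} ms")
```

Sweeping the concurrency level like this lets you spot the point where queuing starts to inflate latencies, which is the signal to revisit the process/thread settings.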
Extra: Containers
If you are running your application on a container orchestration platform like Kubernetes, then you can probably have multiple load-balanced containers serving the same application. You can even make it dynamic, so that the number of containers increases when memory or CPU usage crosses a threshold. That means that on top of all this WSGI fine-tuning for a single server, there are also modern layers of configuration that can help you handle peaks and high-load scenarios.