Sharing DB client among multiple processes in Python?

Time:11-21

My Python application uses concurrent.futures.ProcessPoolExecutor with 5 workers, and each process makes multiple database queries.

Which is considered safer and more conventional: giving each process its own DB client, or having all processes share a single client?

CodePudding user response:

It is better to use multithreading or an asynchronous approach instead of multiprocessing, because it consumes fewer resources. That way you can use a single DB connection, but I would recommend creating a separate session for each worker or coroutine to avoid exceptions and locking problems.

CodePudding user response:

Short answer: Give each process (that needs it) its own db client.

Long answer: What problem are you trying to solve?

Sharing a DB client between processes basically doesn't happen; the one process that holds the DB client would have to proxy queries for the others, using more or less your own protocol. That can have benefits if the protocol is specific to your application, but it adds complexity: you'll now have two different kinds of workers in your program rather than just one, plus the protocol between them. You'd want to make sure that the benefits outweigh the additional complexity.
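To make the extra moving parts concrete, here is a rough sketch of such a proxy arrangement, not a recommendation. It assumes an ad hoc protocol of (worker_id, sql, params) tuples sent over multiprocessing queues, with None as a shutdown signal. The standard-library sqlite3 module stands in for the real driver, and every name in it (db_owner, worker, run_proxy_demo, the items table) is made up for illustration:

```python
import multiprocessing
import sqlite3


def db_owner(requests, replies):
    """The only process holding a connection; serves queries until it gets None."""
    conn = sqlite3.connect(":memory:")  # stand-in for the real database
    conn.execute("CREATE TABLE items (key INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)", [(i, f"item-{i}") for i in range(5)]
    )
    while True:
        msg = requests.get()
        if msg is None:  # shutdown sentinel (part of the made-up protocol)
            break
        worker_id, sql, params = msg
        replies[worker_id].put(conn.execute(sql, params).fetchall())
    conn.close()


def worker(worker_id, requests, reply, results):
    """A worker with no DB client: it sends a request and waits for the rows."""
    requests.put((worker_id, "SELECT value FROM items WHERE key = ?", (worker_id,)))
    rows = reply.get()
    results.put((worker_id, rows[0][0]))


def run_proxy_demo(n_workers=3, start_method=None):
    """Start one DB owner plus n_workers clients; return {worker_id: value}."""
    ctx = multiprocessing.get_context(start_method)  # None = platform default
    requests = ctx.Queue()
    results = ctx.Queue()
    replies = [ctx.Queue() for _ in range(n_workers)]  # one reply queue per worker

    owner = ctx.Process(target=db_owner, args=(requests, replies))
    owner.start()
    workers = [
        ctx.Process(target=worker, args=(i, requests, replies[i], results))
        for i in range(n_workers)
    ]
    for proc in workers:
        proc.start()

    # Drain the results before joining so no worker blocks on a full queue.
    out = dict(results.get() for _ in workers)
    for proc in workers:
        proc.join()

    requests.put(None)  # tell the owner to shut down
    owner.join()
    return out
```

Even this toy version needs a second kind of process, one reply queue per worker, and a shutdown convention, which is the added complexity to weigh against any benefit.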

Sharing a DB client between threads is usually possible; you'd have to check the client library's documentation to see which objects and operations are "thread-safe". However, since your application is otherwise CPU-heavy, threading is not suitable: in CPython, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time.
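As a sketch of what sharing between threads can look like, assume the standard-library sqlite3 driver as a stand-in: its connect() accepts check_same_thread=False to allow use from other threads, and leaves serialization to the caller. The SharedClient wrapper and the helper names are hypothetical; check your own driver's documentation for what it actually guarantees.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedClient:
    """One connection shared by many threads; a lock serializes access."""

    def __init__(self, path=":memory:"):
        # check_same_thread=False lets threads other than the creator use the
        # connection; sqlite3 then leaves synchronization to us, hence the lock.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()

    def execute(self, sql, params=()):
        with self._lock:
            rows = self._conn.execute(sql, params).fetchall()
            self._conn.commit()
            return rows


def square_via_db(client, n):
    """Toy query: let the database compute n * n."""
    return client.execute("SELECT ? * ?", (n, n))[0][0]


def run_threaded(numbers, workers=5):
    """Run the toy query from several threads through one shared client."""
    client = SharedClient()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda n: square_via_db(client, n), numbers))
```

The lock means only one query runs at a time, so this shares the connection safely but gains no parallelism inside the database calls.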

At the same time, there's little cost to having a DB client in each process; you will in any case need some sort of client. There isn't going to be much more IO, since that's mostly based on the total number of queries and amount of data, regardless of whether that comes from one process or gets spread among several. The only additional IO will be in the login, and that's not much.
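A minimal sketch of the one-client-per-process setup, assuming a driver whose connection can be opened inside a pool initializer. The standard-library sqlite3 module stands in for the real database driver here, and the names init_worker, lookup, run_lookups, make_demo_db, and the items table are hypothetical, chosen only for illustration:

```python
import multiprocessing
import os
import sqlite3
import tempfile
from concurrent.futures import ProcessPoolExecutor

# Per-process connection; set by init_worker() inside each worker process.
_conn = None


def init_worker(db_path):
    """Runs once in every worker process: open that process's own connection."""
    global _conn
    _conn = sqlite3.connect(db_path)


def lookup(key):
    """Task body: query through the connection owned by the current process."""
    row = _conn.execute("SELECT value FROM items WHERE key = ?", (key,)).fetchone()
    return os.getpid(), row[0]


def run_lookups(db_path, keys, workers=5, start_method=None):
    """Fan the lookups out over a pool whose workers each hold one connection."""
    ctx = multiprocessing.get_context(start_method)  # None = platform default
    with ProcessPoolExecutor(
        max_workers=workers,
        mp_context=ctx,
        initializer=init_worker,
        initargs=(db_path,),
    ) as pool:
        return list(pool.map(lookup, keys))


def make_demo_db():
    """Create a throwaway SQLite file with a few rows (demo data only)."""
    fd, path = tempfile.mkstemp(suffix=".sqlite3")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE items (key INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)", [(i, f"item-{i}") for i in range(10)]
    )
    conn.commit()
    conn.close()
    return path
```

Call it as run_lookups(make_demo_db(), range(10)) from under an if __name__ == "__main__": guard, which the spawn start method requires. Each returned pair carries the worker's PID, so you can see which process served each lookup.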

If you're running out of connections at the database, you can either tune/upgrade your database for more connections, or use a separate off-the-shelf "connection pooler" to share them; that's likely to be much better than trying to implement a connection pooler from scratch.

More generally, and this applies well beyond this particular question, it's often better to combine several off-the-shelf pieces in a straightforward way than to build one complex custom piece that does everything at once.

So, what problem are you trying to solve?
