I have a gRPC project in which main.py spawns the gRPC servers as subprocesses. The project also has a settings.py that contains some configuration values, for example:
some_config = {"foo": "bar"}
In some files (used by different processes) I have:
import settings
...
and then the value of settings.some_config is read. In the main process I have a listener that updates some_config on demand, for example:
settings.some_config = new_value
I noticed that when I change the value of settings.some_config in the main process, it does not change in the subprocess I checked; the old value remains. I want all subprocesses to always have the most up-to-date value of settings.some_config.
A solution I thought about: pass a queue or a pipe to each subprocess, and when some_config changes in the main process, send the new data through the queue/pipe to each subprocess. But how can I get the subprocess to assign the new value to settings.some_config? Should I use a listener in each subprocess, so that when a notification arrives it will do:
settings.some_config = new_value
Would this work? The end goal is to have the value of settings.some_config up to date across all modules/processes without restarting the server. I'm also not sure it would work, since each module might keep the previously imported value of settings.some_config in its own cached memory.
UPDATE
I took Charchit's solution and adjusted it to my requirements, so we have:
from multiprocessing.managers import BaseManager, NamespaceProxy
from multiprocessing import Process
import settings
import time

def get_settings():
    return settings

def run(proxy_settings):
    settings = proxy_settings  # so the module settings becomes the proxy object

if __name__ == '__main__':
    BaseManager.register('get_settings', get_settings, proxytype=NamespaceProxy)
    manager = BaseManager()
    manager.start()
    settings = manager.get_settings()
    p = Process(target=run, args=(settings,))
    p.start()
A few questions:
Should an entire module (settings) be the target of a proxy object? Is it standard to do so?
There is a lot of magic here. Is the simple explanation of how this works that the module settings is now a shared proxy object, so when a subprocess reads settings.some_config it actually reads the value from the manager?
Are there any side effects I should be aware of?
Should I be using locks when I change any value in settings in the main process?
CodePudding user response:
The easiest way to do this is to share the module with a manager:
from multiprocessing.managers import BaseManager, NamespaceProxy
from multiprocessing import Process
import settings
import time

def get_settings():
    return settings

def run(settings):
    for _ in range(2):
        print("Inside subprocess, the value is", settings.some_config)
        time.sleep(3)

if __name__ == '__main__':
    BaseManager.register('get_settings', get_settings, proxytype=NamespaceProxy)
    manager = BaseManager()
    manager.start()
    settings = manager.get_settings()
    p = Process(target=run, args=(settings,))
    p.start()
    time.sleep(1)
    settings.some_config = {'changed': 'value'}
    p.join()
Doing it this way means you don't have to handle informing the subprocesses that a value changed; they simply see the new value, because every read goes through the manager process, which handles this automatically. (This also explains your original observation: each subprocess gets its own copy of the settings module, so a plain assignment in the main process is never visible to the children.)
Output
Inside subprocess, the value is {'foo': 'bar'}
Inside subprocess, the value is {'changed': 'value'}
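As for your question about locks: a single attribute assignment like settings.some_config = new_value is one call into the manager, so readers see either the old dict or the new one, never a half-updated state. A lock only matters if several processes perform a read-modify-write on the same attribute. A minimal sketch of that case, with a hypothetical update_config helper:
from multiprocessing import Lock

def update_config(settings_proxy, lock, key, value):
    # Read-modify-write through the proxy: without the lock, two processes
    # doing this concurrently could overwrite each other's changes.
    with lock:
        config = settings_proxy.some_config   # the manager returns a copy
        config[key] = value
        settings_proxy.some_config = config   # send the whole dict back

# Create lock = Lock() in the main process and pass it to each subprocess,
# e.g. Process(target=run, args=(settings, lock)).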
One drawback: if you have many subprocesses running, this might become slow (more connections to the manager means less speed). If that bothers you, then I recommend doing it the way you stated in the question, i.e., passing a queue or a pipe to each subprocess. To make sure the child process picks up the new value as soon as possible after you put it in the queue, you can spawn a thread inside the subprocess that waits on the queue and, whenever a value arrives, assigns it to the process's settings value. Just make sure to run the thread as a daemon, or explicitly agree on an exit condition. A minimal sketch of this follows.
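This sketch reuses the settings module from the question; the config_listener name and the one-queue-per-subprocess setup are illustrative, not a fixed API:
import threading
import time
from multiprocessing import Process, Queue

import settings  # the settings.py module from the question

def config_listener(q):
    # Daemon thread inside the subprocess: block until the main process
    # sends a new value, then rebind the module attribute.
    while True:
        settings.some_config = q.get()

def run(q):
    threading.Thread(target=config_listener, args=(q,), daemon=True).start()
    for _ in range(2):
        print("Inside subprocess, the value is", settings.some_config)
        time.sleep(3)

if __name__ == '__main__':
    q = Queue()  # one queue per subprocess
    p = Process(target=run, args=(q,))
    p.start()
    time.sleep(1)
    q.put({'changed': 'value'})  # push the update to the subprocess
    p.join()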
CodePudding user response:
Charchit's solution of creating a specialized managed object is more complicated than it needs to be. If the configuration is stored as a dictionary, just use the multiprocessing.managers.DictProxy instance returned by the multiprocessing.Manager().dict method. This also allows you to update individual keys rather than having to replace the entire dictionary:
from multiprocessing import Process, Manager
import time

def get_settings(manager):
    return manager.dict({'foo': 'bar', 'x': 17})

def run(settings):
    for _ in range(2):
        print("Inside subprocess, the value is", settings)
        time.sleep(3)

if __name__ == '__main__':
    manager = Manager()
    settings = get_settings(manager)
    p = Process(target=run, args=(settings,))
    p.start()
    time.sleep(1)
    settings['foo'] = 'changed bar'
    p.join()
Prints:
Inside subprocess, the value is {'foo': 'bar', 'x': 17}
Inside subprocess, the value is {'foo': 'changed bar', 'x': 17}
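One caveat to be aware of: assigning a top-level key propagates, but mutating a nested structure in place does not, because indexing a DictProxy returns a local copy. A short sketch:
from multiprocessing import Manager

if __name__ == '__main__':
    manager = Manager()
    settings = manager.dict({'nested': {'a': 1}})

    settings['nested']['a'] = 2   # mutates a local copy; NOT propagated
    print(settings['nested'])     # still {'a': 1}

    inner = settings['nested']    # read, modify, then reassign the key
    inner['a'] = 2
    settings['nested'] = inner    # this write propagates
    print(settings['nested'])     # {'a': 2}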