Python multiprocessing process just stops my code from running when I call process.start()

Time:10-02

I'm trying to make a genetic algorithm run its candidates in parallel using multiprocessing, so I wrote code like this:

import multiprocessing as mp

...

parents = []
queue = mp.Queue(maxsize=poolSize - 1)
processes = []
for _ in range(poolSize - 1):
    processes.append(mp.Process(target=generate_parent, args=(queue,)))
for process in processes:
    process.start()
for process in processes:
    process.join()
for _ in range(poolSize - 1):
    parents.append(queue.get())

Something went wrong and I don't know what. When I tried debugging the code, I saw that when it reaches "process.start()" the execution just stops, as if it had hit a "while True: continue". The same happens when I run it normally: the code gets stuck at some point, but it neither exits the process nor raises any error.

I'm new to multiprocessing and to parallelism in general, and I would be glad if someone could help me.

The whole code is here: https://github.com/estevaopbs/Molpro_tools

This specific problem is at line 144 of genetic.py. (I know there are some other problems in the code. I'm fixing them, and they are not supposed to affect this specific problem.)

CodePudding user response:

It looks like the problem is here (and if it's not, it's still a trouble spot):

    def fn_generate_parent(queue=None):
        while True:
            try:
                parent = Chromosome()
                parent.Genes = create_lookup[random.choices(create_methods.methods, create_methods.rate)[0]]\
                    (first_molecule)
                parent.Fitness = get_fitness(parent.Genes, fitness_param, threads_per_calculation)
                break
            except:
                os.remove(f'data/{parent.Genes.__hash__()}.inp')
                os.remove(f'data/{parent.Genes.__hash__()}.out')
                os.remove(f'data/{parent.Genes.__hash__()}.xml')
                continue

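If anything in the try block raises an exception (an undeclared variable, an unknown method, a division by zero), the bare except catches it and the while True loop starts over. When the same error occurs on every pass, that becomes a silent infinite loop.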

The problem is that if something goes wrong, the code neither takes corrective action nor stops. It just quietly retries and possibly keeps hitting the same error. I would remove the silent retry, or at least augment it with error messages, logging, or a counter that retries only a few times.
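As a rough illustration of the "counter plus logging" idea, here is a minimal sketch. It is not the code from the repository: generate_parent_with_retries, MAX_ATTEMPTS, and the flaky build function are hypothetical names made up for this example, and the build callable stands in for the Chromosome / get_fitness work.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("genetic")

MAX_ATTEMPTS = 5  # hypothetical limit; pick whatever suits the calculation


def generate_parent_with_retries(build, max_attempts=MAX_ATTEMPTS):
    """Call build() until it succeeds, but give up after max_attempts failures.

    build is a stand-in (an assumption of this sketch) for the code that
    creates a parent and computes its fitness.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return build()
        except Exception as exc:  # narrower than a bare "except:"
            last_error = exc
            # log.exception records the full traceback, so the failure is visible
            log.exception("attempt %d/%d failed", attempt, max_attempts)
    # Every attempt failed: stop instead of looping forever
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_build():
        # Hypothetical workload: fails twice, then succeeds
        state["calls"] += 1
        if state["calls"] < 3:
            raise ValueError("simulated calculation failure")
        return "parent"

    print(generate_parent_with_retries(flaky_build))
```

With this shape, a programming error such as a NameError shows up in the log on the first attempt instead of spinning unnoticed. Inside a worker process, it may be preferable to put the error (or a sentinel value) on the queue rather than only raising it, since a queue.get() with no timeout in the parent blocks until an item arrives.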
