Using subprocess to automate repeated executions of a computationally intensive program

Time:06-01

What I am trying to do

I am using a program called MESA (https://docs.mesastar.org/en/latest/index.html) and the relevant steps for each run are:

  1. Edit a few lines with input parameters in a text file
  2. Execute the (bash) shell command “./mk”
  3. Execute the (bash) shell command “./rn”

After rn completes successfully, these steps are repeated for the next iteration.

My implementation

In order to automate these steps I came up with the following program:

import subprocess 

inputs[n][5] #2d array imported from csv

for i in range(len(inputs)):

    #read data
    with open('inlist', 'r', encoding='utf-8') as file:
        data = file.readlines()

    #lines to change
    data[ 73] = “   RSP_mass = ” + inputs[i][0] + “d0\n”
    data[ 74] = “   RSP_Teff = ” + inputs[i][1] + “d0\n”
    data[ 75] = “   RSP_L = ” + inputs[i][2] + “d0\n”

    data[ 99] = “   log_directory = 'LOGS/” + inputs[i][3] + “'\n”
    data[100] = “   photo_directory = 'PHOTOS/” + inputs[i][4] + “'\n”

    #write data
    with open('inlist', 'r', encoding = 'utf-8') as file:
        file.writelines()

    #running MESA
    subprocess.run(“./mk”)
    subprocess.run(“./rn”, stdout = subprocess.PIPE)

Issue 1:

Since MESA is very computationally intensive (it uses all 16 available threads) and already takes 2½ to 3 hours per run, I am quite worried about possible performance issues. Because each run takes so long, it is also quite difficult to benchmark.

Is there a better solution available that I have missed?

Issue 2:

During a run, MESA writes a little less than 1000 lines to stdout, which I assume will cause quite a slowdown when running via subprocess. The easiest way would of course be to disable all output, but it is quite useful to be able to check the evolution process during runs, so I would like to keep it if possible. From the thread "Python: Reading a subprocess' stdout without printing to a file", I have learned that stdout=subprocess.PIPE would be the fastest way of doing so. Storing the output data is already handled by MESA itself. Is this a good solution with regard to performance?

Issue 3:

This is the least important of the issues, but it might affect the implementation of the previous ones, so I thought I would ask about it as well. Is it possible to define a custom keyboard interrupt which doesn’t terminate the program immediately, but only once the current run has completed? Based on the thread "How to generate keyboard events?", I would assume the keyboard library would be best suited for Ubuntu.

User response:

Repeatedly reading and rewriting the input file is clumsy and inefficient, and anyway, you can't write to it when you open it in read-only mode ('r').

I would instead read a template file once, then write the actual configuration file based on that. (Python has a separate Template class in the standard library, string.Template, which would perhaps be worth looking into, but this is simple enough to write from scratch.)
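For comparison, here is a minimal sketch of what the string.Template route might look like. The placeholder names (mass, teff, lum, log_dir, photo_dir), the shortened template text, and the sample values are all made up for illustration; a real inlist has many more lines and would be read from the template file rather than defined inline.

```python
from string import Template

# Hypothetical, heavily shortened stand-in for the template file contents.
# ${name} placeholders are used so that "d0" is not read as part of the name.
TEMPLATE_TEXT = """\
   RSP_mass = ${mass}d0
   RSP_Teff = ${teff}d0
   RSP_L = ${lum}d0
   log_directory = 'LOGS/${log_dir}'
   photo_directory = 'PHOTOS/${photo_dir}'
"""


def render_inlist(template_text, row):
    """Fill the placeholders from one five-element row of the inputs table."""
    mass, teff, lum, log_dir, photo_dir = row
    return Template(template_text).substitute(
        mass=mass, teff=teff, lum=lum, log_dir=log_dir, photo_dir=photo_dir
    )


# Sample values only; the real rows would come from the CSV file.
rendered = render_inlist(TEMPLATE_TEXT, ["0.65", "6500", "60", "run1", "run1"])
print(rendered)
```

With this approach the values are matched by name instead of by line number, so the template can gain or lose lines without the script having to change. substitute() raises KeyError if a placeholder has no value, which catches a misspelled name early.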

As for performance, a subprocess simply leaves Python completely out of the picture once it has been started, so running your tasks from Python should perform the same as running them from the shell.

If you have no reason to capture the output from the process, just let it spill onto the user's terminal directly. Not specifying anything for stdout= and stderr= in the subprocess call achieves that.

import subprocess 

# inputs[n][5] #2d array imported from csv

with open('template', 'r', encoding='utf-8') as file:
    data = file.readlines()

for inp in inputs:
    data[ 73] = f"   RSP_mass = {inp[0]}d0\n"
    data[ 74] = f"   RSP_Teff = {inp[1]}d0\n"
    data[ 75] = f"   RSP_L = {inp[2]}d0\n"

    data[ 99] = f"   log_directory = 'LOGS/{inp[3]}'\n"
    data[100] = f"   photo_directory = 'PHOTOS/{inp[4]}'\n"

    with open('inlist', 'w', encoding = 'utf-8') as file:
        file.writelines(data)

    subprocess.run("./mk", check=True)
    subprocess.run("./rn", check=True)

Notice how this now reads from a file called template, once outside the loop, and then writes ('w') to inlist repeatedly. I also fixed the loop to be a bit more idiomatic, and changed the curly double quotes to proper ASCII double quotes. The replacements now use f-strings for (IMHO) improved legibility. Down near the end, the check=True keyword argument to subprocess.run instructs Python to raise an error if the subprocess fails.
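If you would rather log a failed run and move on than have the whole script stop, the exception raised by check=True can be caught. The following is only a sketch: run_step is a made-up helper name, and the two Python one-liners stand in for ./mk and ./rn so that the example can run anywhere.

```python
import subprocess
import sys


def run_step(cmd):
    """Run one command; return True on success, False on a non-zero exit."""
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError as exc:
        # exc.cmd and exc.returncode identify what failed and how.
        print(f"{exc.cmd!r} failed with exit code {exc.returncode}")
        return False
    return True


# Stand-in commands: one exits with status 0, the other with status 3.
ok = run_step([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_step([sys.executable, "-c", "raise SystemExit(3)"])
print(ok, failed)
```

In the loop above, a False result from the ./mk step would be a reason to skip ./rn for that set of inputs and continue with the next one.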

The keyboard interrupt idea sounds unnecessarily challenging. You can add a signal handler to selectively ignore some signals, but a much simpler solution would be to just check whether any regular key (or a specific one; say q) has been pressed within the loop. See e.g. the thread "How to detect key presses?"
