How far can/should I take args from argparse into my functions and classes?

Time:06-24

I write a lot of cli tools in python. Most of my tools have something like this to get arguments:

import argparse

parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')

args = parser.parse_args()

I write code that is half data science (we're bioinformaticians). This args bit normally lives in "main.py" and then gets passed to some sort of "run experiment" function/method, which will often use a multiprocessing Pool to break the task apart (by handing off to other functions/classes). So most of the arguments from the command line need to be passed to the run function and then on to the new process. This architecture is not up for debate for various reasons.

I'm cognizant of this https://python-docs.readthedocs.io/en/latest/writing/style.html

BAD

def make_complex(*args):
    x, y = args
    return dict(**locals())

GOOD

def make_complex(x, y):
    return {'x': x, 'y': y}

My tools often have 10-20 parameters stored in args, from argparse. So the question is - should I pass them packaged up as args or unpack them and pass them individually? If I explicitly pass each of them to each function/class that they are intended to end up in, I end up with a lot of redundant code (at least one function will have a massive parameter list that is identical to args). Conversely, if I just pass args around it goes against the zen...

CodePudding user response:

Here's my opinion.

args is a Namespace object, a simple class that holds the values as attributes. It's easily converted to a dict with vars.
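A minimal sketch of that conversion (the `--threads`/`--verbose` options are hypothetical):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--threads', type=int, default=4)
parser.add_argument('--verbose', action='store_true')

# parse_args returns a Namespace; vars() gives you a plain dict view of it.
args = parser.parse_args(['--threads', '8'])
print(args)        # Namespace(threads=8, verbose=False)
print(vars(args))  # {'threads': 8, 'verbose': False}
```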

Internally, the argparse code itself uses a lot of foo(*args, **kwargs) signatures, especially for the add_argument method. This gives a lot of flexibility in what parameters it accepts, but also leaves things open for errors.

In the subcommands section, the argparse docs have an example of calling functions with args.func(args). This lets each func use different parameters.
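That pattern can be sketched like this; the subcommand names and handlers (`count`, `align`) are hypothetical, but the `set_defaults(func=...)` dispatch is the one the docs show:

```python
import argparse

def count(args):
    return f'counting {args.infile}'

def align(args):
    return f'aligning {args.infile} with {args.threads} threads'

parser = argparse.ArgumentParser()
sub = parser.add_subparsers(dest='command', required=True)

# Each subparser declares only the parameters its handler needs,
# then binds the handler via set_defaults.
p_count = sub.add_parser('count')
p_count.add_argument('infile')
p_count.set_defaults(func=count)

p_align = sub.add_parser('align')
p_align.add_argument('infile')
p_align.add_argument('--threads', type=int, default=1)
p_align.set_defaults(func=align)

args = parser.parse_args(['align', 'reads.fq', '--threads', '4'])
print(args.func(args))  # aligning reads.fq with 4 threads
```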

Often the args contains control parameters, things like debugging, logging, etc. They aren't the primary parameters, but auxiliary ones that the function may, or may not, use. Passing those through several layers with args (or a dict) can be convenient.
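For example, a worker deep in the call stack can read an auxiliary setting off args with getattr and ignore everything else (the worker and its settings here are hypothetical):

```python
from argparse import Namespace

def process_chunk(chunk, args):
    # The primary input (chunk) is an explicit parameter; auxiliary
    # settings ride along in args and are read only if present.
    if getattr(args, 'verbose', False):
        print(f'processing {len(chunk)} records')
    return [x * 2 for x in chunk]

args = Namespace(verbose=True, log_level='INFO')
process_chunk([1, 2, 3], args)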

On the other hand if a function might be called from other functions and other user interfaces, it may be a pain to have to create a Namespace like object just to pass in parameters.

In short, if the function (or class) is written primarily for use by the CLI, passing the whole args can be convenient and reasonable. But if the function might be used with different CLI, or other interfaces (such as in an imported module), more explicit positional and keyword parameters are better.
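A sketch of the explicit style, with a hypothetical `run_alignment`: the function has its own signature and knows nothing about argparse, and the CLI layer unpacks args once at the boundary.

```python
from argparse import Namespace

def run_alignment(infile, threads=1, verbose=False):
    # Plain positional/keyword parameters: callable from any interface,
    # testable without building a Namespace.
    return f'{infile}: {threads} threads (verbose={verbose})'

args = Namespace(infile='reads.fq', threads=4, verbose=True)
# One call site does the unpacking; here vars(args) happens to match the signature.
print(run_alignment(**vars(args)))
```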

If your code is organized around classes, it may be reasonable to accept the args namespace and then assign the desired attributes to instance variables. The instance doesn't use args except during initialization.
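A sketch of that, with a hypothetical Experiment class: args is consumed in `__init__` and never stored, so the rest of the class is independent of the CLI.

```python
from argparse import Namespace

class Experiment:
    def __init__(self, args):
        # Pull out only the attributes this class needs; drop the namespace.
        self.threads = args.threads
        self.outdir = args.outdir

    def run(self):
        return f'running with {self.threads} threads, writing to {self.outdir}'

args = Namespace(threads=8, outdir='results', verbose=True)
exp = Experiment(args)
print(exp.run())
```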
