Python: Is it fine to pass a whole object to a function just for its attributes?

A class

class Test:
  def __init__(self, model, type, version):
    self.model = model
    self.type = type
    self.version = version
    ...

test = Test("something", "something", "something")

Functions

def get_type_1(test):
  if test.model == "something" and test.type == "something" and test.version == "something":
    return "value"

def get_type_2(model, type, version):
  if model == "something" and type == "something" and version == "something":
    return "value"

From the perspective of "clean code", which type of function should I use? I catch myself using type_1 when there are more arguments and type_2 when there are only one or two, which is making a logical mess of my program. Do I need to worry in Python about speed and memory when passing a class instance around all the time?

CodePudding user response：

Prefer the 1st form, for three reasons.

1. You're not shadowing the type builtin. (Trivial; could use the alternate spelling type_.)
2. More convenient for the caller, and for folks reading the calling code.
3. Those three things go together. Better to show that, with the representation.

When we speak of (model, type, version), they could be nearly anything. There's no clear relationship among them, and no name to hang documentation upon.

OTOH the object may have well-understood constraints, perhaps "model is never Edsel when version > 3". We can consult the documentation, and the implementation, to understand the class invariants.
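As a rough sketch of what "class invariants" can look like in code, the example below checks the Edsel rule from the previous paragraph at construction time. The class name Car, the integer version, and the describe helper are assumptions made up for illustration; they are not from the question.

```python
from dataclasses import dataclass


@dataclass
class Car:  # hypothetical class, named only for this sketch
    model: str
    type: str
    version: int  # assumed to be an integer here

    def __post_init__(self):
        # The invariant quoted in the text: no Edsel past version 3.
        if self.model == "Edsel" and self.version > 3:
            raise ValueError("model is never Edsel when version > 3")


def describe(car: Car) -> str:
    # Code that receives a Car can lean on the invariant
    # instead of re-validating three loose values.
    return f"{car.model} {car.type} v{car.version}"


print(describe(Car("Edsel", "sedan", 3)))
```

With three loose arguments there is no single place to put such a check, so every function would have to repeat it or trust its caller.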

Sometimes mutability is a concern.

That is, a caller might have passed in an object with foo(test), and then we're worried that library routine foo might possibly have changed model "Colt" to "Bronco".

Often the docs, implicit or explicit, will make clear that such mutations are out of bounds: they will not happen. To make things very obvious with minimal documentation burden, consider using a named tuple for those three fields in the example.
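A minimal sketch of the named tuple idea, assuming the three fields are plain strings. The name Spec and the function name get_type are placeholders I picked, not anything from the question.

```python
from typing import NamedTuple


class Spec(NamedTuple):  # hypothetical name for the three fields
    model: str
    type: str
    version: str


def get_type(spec: Spec):
    # Same check as get_type_1, but the argument cannot be mutated.
    if spec.model == "something" and spec.type == "something" and spec.version == "something":
        return "value"
    return None


spec = Spec("something", "something", "something")
print(get_type(spec))

try:
    spec.model = "Bronco"  # a library routine attempting a mutation
except AttributeError as exc:
    print("mutation refused:", exc)
```

The caller still passes one thing with a name to hang documentation on, and the immutability is visible from the type rather than promised in prose.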

> need to worry in Python about speed and memory passing class all the time?

No.

Python is about clarity of communicating a technical idea to other humans. It is not about speed.

Recall Knuth's advice. If speed were a principal concern, you would have already used cProfile to identify the hot spots that should be implemented in e.g. Rust, Cython, or C++. Usually that only becomes important when you notice you're often looping more than a thousand or a million times.
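If you have never profiled before, something like the following is enough to find out where the time goes. The workload function and its loop count are invented for this sketch; substitute whatever your program actually does.

```python
import cProfile
import io
import pstats


def get_type_2(model, type_, version):
    if model == "something" and type_ == "something" and version == "something":
        return "value"


def workload(n=100_000):
    # Hypothetical hot loop standing in for real program logic.
    hits = 0
    for _ in range(n):
        if get_type_2("something", "something", "something") == "value":
            hits += 1
    return hits


profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Print the five entries with the largest cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
print(buffer.getvalue())
```

Only if a function like this shows up near the top of a real profile is the calling convention worth a second thought.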

Use dis.dis() to disassemble your two functions and their callers. Notice that a caller of the first form (call it caller1) pushes a single reference to test, while a caller of the second form (caller2) spends more time and more stack memory pushing three references. Down in the target code, we still need to chase three references, so that's mostly a wash.
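Here is one way to set that experiment up. The names caller1 and caller2 are just labels for "code that calls the first form" and "code that calls the second form"; the exact opcodes printed differ between Python versions, so I'm not showing output.

```python
import dis


class Test:
    def __init__(self, model, type_, version):
        self.model = model
        self.type = type_
        self.version = version


def get_type_1(test):
    if test.model == "something" and test.type == "something" and test.version == "something":
        return "value"


def get_type_2(model, type_, version):
    if model == "something" and type_ == "something" and version == "something":
        return "value"


def caller1(test):
    # One argument reference is pushed before the call.
    return get_type_1(test)


def caller2(model, type_, version):
    # Three argument references are pushed before the call.
    return get_type_2(model, type_, version)


for func in (caller1, caller2, get_type_1, get_type_2):
    print(f"--- {func.__name__} ---")
    dis.dis(func)
```

Compare the argument-loading instructions in the two callers, then the attribute loads in get_type_1 against the plain local loads in get_type_2.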

If you pass an object with a dozen attributes, of which just three will be used, that's no burden on the bytecode interpreter; the other nine are simply never touched. It can be an intellectual burden on an engineer maintaining the code, who might need to reason about those nine and dismiss them as not a concern.

Another concern that a paranoid caller might have about called library code relates to references. Typically we expect the called routine will not permanently hold a reference (or weakref) on the passed test object, nor on attributes such as test.version or test.version.history_dict. If the library routine will store a reference for a long time, or pass a reference to someone that will store it, well, that's worth documenting. Caller will want to understand memory consumption, leaks, and object lifetime.

CodePudding user response：

I think the better practice is get_type_1. It is more self-contained and easier to read and understand. When you pass the entire Test object to the function, the call site shows all of the relevant information in one place, rather than making you look at the function definition to see which arguments (model, type, version) are being used. In my opinion that makes the code easier to understand. Also, with get_type_1 you can annotate the parameter so the function declares that it expects a Test object. Python does not enforce annotations at runtime, but a static type checker such as mypy will flag calls made with anything else, which helps prevent errors. For example:

def get_type_1(test: Test) -> str:
    if test.model == "smth" and test.type == "smth" and test.version == "smth":
        return "val"

In terms of clean code, it is generally a good idea to use the most self-contained and easy-to-understand form, which here is get_type_1, as described above.
There is usually no significant difference in speed or memory usage. The cost of a call in Python is largely unaffected by whether you pass an entire object or individual arguments, because Python passes references to objects rather than copying them, so passing test costs one reference no matter how many attributes it holds.
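If you'd rather measure than take my word for it, a quick timeit comparison along these lines should do. The iteration count is arbitrary and the numbers will vary by machine and Python version, so treat them as a rough indication only.

```python
import timeit


class Test:
    def __init__(self, model, type_, version):
        self.model = model
        self.type = type_
        self.version = version


def get_type_1(test):
    if test.model == "smth" and test.type == "smth" and test.version == "smth":
        return "val"


def get_type_2(model, type_, version):
    if model == "smth" and type_ == "smth" and version == "smth":
        return "val"


test = Test("smth", "smth", "smth")

# Time each calling style over the same number of calls.
t1 = timeit.timeit(lambda: get_type_1(test), number=200_000)
t2 = timeit.timeit(
    lambda: get_type_2(test.model, test.type, test.version), number=200_000
)

print(f"whole object:      {t1:.4f} s")
print(f"three arguments:   {t2:.4f} s")
```

Either way the three attribute lookups happen somewhere, in the caller or in the function, so I would not expect a difference large enough to drive the design.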