A class
class Test:
    def __init__(self, model, type, version):
        self.model = model
        self.type = type
        self.version = version
        ...

test = Test("something", "something", "something")
Functions
def get_type_1(test):
    # Takes the whole object and reads the attributes it needs.
    if test.model == "something" and test.type == "something" and test.version == "something":
        return "value"

def get_type_2(model, type, version):
    # Takes each value as a separate argument.
    if model == "something" and type == "something" and version == "something":
        return "value"
From the perspective of "clean code", which type of function should I use? I catch myself using type_1 when there are more arguments and type_2 when there are only one or two, which is making a logical mess in my program. Do I need to worry in Python about speed and memory when passing a class instance all the time?
CodePudding user response:
Prefer the 1st form, for three reasons.
- You're not shadowing the type builtin. (Trivial; could use the alternate spelling type_.)
- More convenient for the caller, and for folks reading the calling code.
- Those three things go together. Better to show that, with the representation.
When we speak of (model, type, version), they could be nearly anything. There's no clear relationship among them, and no name to hang documentation upon.
OTOH the object may have well-understood constraints, perhaps "model is never Edsel when version > 3". We can consult the documentation, and the implementation, to understand the class invariants.
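A minimal sketch of how a class can carry such an invariant, so that readers find it in one place. The class name Car, the field name type_, and the integer version are assumptions made for illustration; only the "Edsel" rule comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Car:  # hypothetical stand-in for the Test class
    model: str
    type_: str  # trailing underscore avoids shadowing the builtin
    version: int  # assumed to be an integer for this sketch

    def __post_init__(self):
        # Class invariant: model is never Edsel when version > 3.
        if self.model == "Edsel" and self.version > 3:
            raise ValueError("model is never Edsel when version > 3")


early = Car("Edsel", "sedan", 3)  # satisfies the invariant

try:
    Car("Edsel", "sedan", 4)  # violates the invariant
except ValueError as exc:
    print("rejected:", exc)
```

A function that accepts a Car can then rely on the invariant instead of re-checking three loose arguments. Note that this check runs only at construction time; later attribute assignment is not validated.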
Sometimes mutability is a concern. That is, a caller might have passed in an object with foo(test), and then we're worried that library routine foo might possibly have changed model "Colt" to "Bronco".
Often the docs, implicit or explicit, will make clear that such mutations are out of bounds: they will not happen. To make things very obvious with minimal documentation burden, consider using a named tuple for those three fields in the example.
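Here is one way the named tuple suggestion might look. The name Vehicle and the field spelling type_ are assumptions for illustration, not something from the question.

```python
from typing import NamedTuple, Optional


class Vehicle(NamedTuple):  # hypothetical name for the three related fields
    model: str
    type_: str
    version: str


def get_type(vehicle: Vehicle) -> Optional[str]:
    # Same check as in the question, reading fields from one immutable value.
    if (vehicle.model == "something"
            and vehicle.type_ == "something"
            and vehicle.version == "something"):
        return "value"
    return None


car = Vehicle("something", "something", "something")
result = get_type(car)

mutation_error = None
try:
    car.model = "Bronco"  # named tuple fields cannot be reassigned
except AttributeError as exc:
    mutation_error = exc

print(result, "| mutation blocked:", mutation_error is not None)
```

Because the fields cannot be reassigned, a caller does not need to wonder whether the called routine changed them.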
"Do I need to worry in Python about speed and memory passing class all the time?"
No.
Python is about clarity of communicating a technical idea to other humans. It is not about speed.
Recall Knuth's advice. If speed were a principal concern, you would have already used cProfile to identify the hot spots that should be implemented in e.g. Rust, Cython, or C++. Usually that only becomes important when you notice you're often looping more than a thousand or a million times.
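A rough sketch of what looking for hot spots with cProfile could look like. The workload function and its loop count are invented for illustration; a real program would profile its own entry point.

```python
import cProfile
import io
import pstats


class Test:
    def __init__(self, model, type_, version):
        self.model = model
        self.type = type_
        self.version = version


def get_type_1(test):
    if test.model == "something" and test.type == "something" and test.version == "something":
        return "value"


def workload(n=100_000):
    # Hypothetical hot loop: call the function many times.
    test = Test("something", "something", "something")
    for _ in range(n):
        get_type_1(test)


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Collect the report in a string, sorted by cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The report lists call counts and time per function, which is the evidence to gather before rewriting anything for speed.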
Use dis.dis() to disassemble your two functions. Notice that caller1 pushed a single reference to test, while caller2 spent more time and more stack memory pushing three references. Down in the target code, we still need to chase three references, so that's mostly a wash.
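A sketch of that comparison. The bodies of caller1 and caller2 are assumptions about what the two call sites might look like, and count_attr_loads is a helper invented here. Exact opcode names vary between CPython versions, so the listing is printed rather than quoted.

```python
import dis


class Test:
    def __init__(self, model, type_, version):
        self.model = model
        self.type = type_
        self.version = version


test = Test("something", "something", "something")


def get_type_1(test):
    if test.model == "something" and test.type == "something" and test.version == "something":
        return "value"


def get_type_2(model, type_, version):
    if model == "something" and type_ == "something" and version == "something":
        return "value"


def caller1():  # hypothetical call site passing the whole object
    return get_type_1(test)


def caller2():  # hypothetical call site passing three separate values
    return get_type_2(test.model, test.type, test.version)


def count_attr_loads(func):
    # Count attribute-lookup instructions in the function's bytecode.
    return sum(1 for ins in dis.get_instructions(func)
               if ins.opname.startswith("LOAD_ATTR"))


for func in (caller1, caller2, get_type_1, get_type_2):
    print(func.__name__, "attribute loads:", count_attr_loads(func))
    dis.dis(func)
```

In this sketch the three attribute lookups happen either in caller2 or inside get_type_1, so the total work is about the same in both styles.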
If you pass an object with a dozen attributes, of which just three will be used, that's no burden on the bytecode interpreter; the other nine are simply never touched. It can be an intellectual burden on an engineer maintaining the code, who might need to reason about those nine and dismiss them as not a concern.
Another concern that a paranoid caller might have about called library code relates to references. Typically we expect the called routine will not permanently hold a reference (or weakref) on the passed test object, nor on attributes such as test.version or test.version.history_dict.
If the library routine will store a reference for a long time, or pass a reference to someone that will store it, well, that's worth documenting. Caller will want to understand memory consumption, leaks, and object lifetime.
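To make the lifetime concern concrete, here is a small sketch that uses a weak reference as a probe. The remember routine, the _cache list, and the is_alive_after helper are all hypothetical and stand in for a library that keeps a reference.

```python
import gc
import weakref


class Test:
    def __init__(self, model, type_, version):
        self.model = model
        self.type = type_
        self.version = version


_cache = []  # hypothetical long-lived state inside a "library"


def get_type_1(test):
    # Reads attributes and returns; keeps no reference to the object.
    if test.model == "something" and test.type == "something" and test.version == "something":
        return "value"


def remember(test):
    # Hypothetical routine that stores a reference for a long time.
    _cache.append(test)


def is_alive_after(routine):
    # Pass a fresh object to the routine, drop our own reference,
    # then check whether something else still keeps the object alive.
    obj = Test("something", "something", "something")
    probe = weakref.ref(obj)
    routine(obj)
    del obj
    gc.collect()
    return probe() is not None


print("kept by get_type_1:", is_alive_after(get_type_1))
print("kept by remember:", is_alive_after(remember))
```

A routine like remember is the kind that deserves a note in its documentation, since the object stays alive for as long as the cache does.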
CodePudding user response:
I think the better practice is to use get_type_1. It is more self-contained and easier to read and understand. When you pass the entire Test object to the function, you can see all of the relevant information in one place, rather than having to look at the function definition to see which arguments (model, type, version) are being used. In my opinion, this makes the code easier to understand.
Also, with the first form, get_type_1, you can annotate the function with the type of object it expects. Python does not enforce annotations at runtime, but a static type checker can use them to flag calls with invalid arguments, which helps to prevent errors. For example, you could declare that the function expects a Test object, like this:
def get_type_1(test: Test) -> str:
    if test.model == "smth" and test.type == "smth" and test.version == "smth":
        return "val"
In terms of clean code, it is generally a good idea to use the most self-contained and easy-to-understand form, which is get_type_1, as I've described above. As far as I know, there is usually no significant difference in speed or memory usage. Python passes arguments as references to objects, so passing an entire object does not copy it, and it costs about the same as passing any other single argument.
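A quick way to check that passing an object hands over a reference rather than a copy. The helper received_object is made up for this demonstration.

```python
class Test:
    def __init__(self, model, type_, version):
        self.model = model
        self.type = type_
        self.version = version


def received_object(test):
    # Return whatever the function received, so the caller can compare identity.
    return test


test = Test("something", "something", "something")
same_object = received_object(test) is test
print("same object inside the function:", same_object)
```

Since the function sees the very same object, the size of the object has no effect on the cost of the call.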