I'm curious how self works behind the scenes in Python.
How is self connected to the instance when I create it?
I know that when I create the instance, the class is called and the __init__ method takes the instance and assigns it to self, but how is this done behind the scenes? Can anyone explain?
class cats:
    def __init__(self):
        self.name = 'cat'

animal = cats()
print(animal.name)
CodePudding user response:
In Python, self is the first parameter of a method and represents the instance of the class, just like this in Java or C++.
Technically, self and this serve the same purpose: they give access to the variables associated with the current instance. The only difference is that in Python you have to include self explicitly as the first parameter of an instance method, whereas in Java this is implicit.
Every instance method, including the constructor __init__, takes the instance as its first parameter, which is conventionally named self. If you leave it out, calling the method on an instance raises a TypeError, because Python still passes the instance as the first argument.
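As a small sketch of what this means in practice (the class Cat and its method speak are made up for illustration), calling a method on an instance gives the same result as looking the function up on the class and passing the instance explicitly:

```python
class Cat:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} says meow"


animal = Cat("Tom")

# The usual call: Python passes `animal` as `self` automatically.
via_instance = animal.speak()

# The explicit form: take the function from the class and pass the instance.
via_class = Cat.speak(animal)

print(via_instance)               # Tom says meow
print(via_instance == via_class)  # True

# A bound method remembers the instance it was looked up on.
print(animal.speak.__self__ is animal)  # True
```

So self is not a keyword: it is an ordinary parameter that happens to receive the instance.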
CodePudding user response:
The exact details of how classes are implemented are implementation specific. For instance, CPython, Cython, and PyPy may have different ways of implementing classes. Here, I will cover how classes are implemented in CPython. Some things may also be a bit more complex in practice, but I think the general story outlined below is correct.
For this answer to make sense, you need to be aware that the default implementation of Python (called CPython) is interpreted. Your Python code is converted to Python byte code, which is then executed by the interpreter. The interpreter is written in C, and it represents types and objects using structs. In case you are unfamiliar with structs, you might want to look them up beforehand.
The state of an object is stored in two places: in fields on the struct, and in the so-called instance dictionary.
Some "internal" attributes, such as __class__, are stored directly in the struct itself (e.g. in the form of the ob_type field for __class__). These fields usually contain metadata about the object that the interpreter needs.
All user-defined attributes are stored in the instance dictionary, which is referenced from the struct. In Python code, you can access this dictionary as instance.__dict__.
All attributes of a Python object can be listed using the dir function. For instance, suppose that we define a class C by class C: pass. Then dir(C()) returns something like the following (the exact list depends on the Python version):
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__']
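But if you look at C().__dict__, you get {}, which is the empty dictionary. Hence, none of the names above live in the instance dictionary: most are found on the class C or its base class object, and a few, such as __class__, are backed by fields of the struct.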
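To check where some of those names actually live, here is a small sketch using only built-ins (the variable name obj is arbitrary):

```python
class C:
    pass


obj = C()

# The instance dictionary is empty: nothing has been assigned on obj yet.
print(obj.__dict__)  # {}

# '__init__' is listed by dir(obj), but it is found on the base class object.
print("__init__" in dir(obj))      # True
print("__init__" in obj.__dict__)  # False
print("__init__" in vars(object))  # True

# '__module__' is stored in the dictionary of the class C itself.
print("__module__" in vars(C))     # True

# __class__ reports the type of the instance.
print(obj.__class__ is C)          # True
```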
Now, suppose that we assign c = C() and then do c.attr = 10. If we now look at c.__dict__, we get {'attr': 10}. As you can see, the attribute you defined is now stored in the instance dictionary. So, under the hood, Python retrieves the instance dictionary from the struct and effectively performs c.__dict__['attr'] = 10. If you were to call dir(c), you would see that attr is now also present in the output.
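The steps just described can be tried directly. This is only a sketch; the attribute name other is made up for illustration:

```python
class C:
    pass


c = C()
c.attr = 10
print(c.__dict__)  # {'attr': 10}

# Writing to the instance dictionary directly has the same effect
# as a normal attribute assignment.
c.__dict__["other"] = 20
print(c.other)  # 20

# Both names now show up in dir(c) as well.
print("attr" in dir(c), "other" in dir(c))  # True True
```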
In fact, this way of assigning attributes is no different from an assignment inside the __init__ function. Suppose that we have the following class:
class Counter:
    def __init__(self, x: int):
        self.value = x

    def add(self, y: int):
        # Increase the stored value by y.
        self.value += y

    def get(self):
        return self.value
If we create an instance using counter_instance = Counter(0), this is (roughly) equivalent to:
counter_instance = Counter.__new__(Counter, 0)
type(counter_instance).__init__(counter_instance, 0)
So, when __init__ is called, it is simply passed the instance we are also using in our code. There is nothing really special about self: it is simply the instance. There is also nothing special about __init__: anything you can do inside __init__, you can do outside it too. In fact, you could create a counter manually using counter = Counter.__new__(Counter, 0), and then initialize it manually using counter.value = 0. And, as explained above, this assignment is internally performed by storing data in the instance dictionary. When reading the value using print(counter.value), the value is then retrieved from the instance dictionary.
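A minimal sketch of that equivalence, assuming a Counter class whose add method increments the stored value (the variable names a, b and c are arbitrary):

```python
class Counter:
    def __init__(self, x: int):
        self.value = x

    def add(self, y: int):
        self.value += y

    def get(self):
        return self.value


# Normal construction.
a = Counter(5)

# Manual construction: allocate with __new__, then call __init__ explicitly.
b = Counter.__new__(Counter)
Counter.__init__(b, 5)

# Skipping __init__ entirely and assigning the attribute by hand.
c = Counter.__new__(Counter)
c.value = 5

# All three instances end up with the same instance dictionary contents.
print(a.__dict__ == b.__dict__ == c.__dict__)  # True

# Calling a method through the class passes the instance as self explicitly.
Counter.add(b, 3)
print(b.get())  # 8
```

In all three cases the state ends up in the instance dictionary, and self inside each method is just the object that was passed in.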