I'm trying to understand the behaviour of the pickle module. It seems to me that pickle doesn't save methods, only attributes. Here is what I mean, in code:

    import pickle

    class Person:
        def __init__(self, name):
            self.name = name

        def add_dot(self):
            self.name += '.'

    etienne = Person('etienne')
    pickled = pickle.dumps(etienne)

    # Redefine the class before unpickling
    class Person:
        def __init__(self, name):
            self.name = name

        def add_dot(self):
            self.name += '..'

    etienne = pickle.loads(pickled)
    etienne.add_dot()

What would you expect this code to return?
Is someone able to explain this behaviour to me?
CodePudding user response:
Have you tried it? Adding the line print(etienne.name) and running it prints "etienne..", but the better question is: why?
The important thing to remember when pickling a class instance is that an instance is really made up of two things: its instance variables (look at etienne.__dict__ to see what those look like) and the class code that tells Python how you want that data interpreted. When you call a method on an instance of a class (like etienne.add_dot()), it gets translated into a plain function call to the method in the class, with the instance passed in as the argument for self (so etienne.add_dot() becomes Person.add_dot(etienne)). This scheme is widely used for OOP, as it saves a lot of space when you are running programs with many instances of a class: each one just stores its instance variables and refers to the same class code when a method is called on it.
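The split between instance data and class code can be checked directly. The following is only a small sketch; the instances a and b are made up for illustration, and add_dot is assumed to append to the name, as the "etienne.." result implies:

```python
class Person:
    def __init__(self, name):
        self.name = name

    def add_dot(self):
        self.name += '.'


a = Person('a')
b = Person('b')

# Instance state lives in the instance's __dict__; the method does not.
print(a.__dict__)                    # {'name': 'a'}
print('add_dot' in a.__dict__)       # False
print('add_dot' in Person.__dict__)  # True

# Calling the method on the instance is equivalent to calling the
# function on the class with the instance passed in as self.
a.add_dot()
Person.add_dot(b)
print(a.name, b.name)                # a. b.
```

Both instances share the single add_dot function stored on Person, which is why only the attributes need to be stored per instance.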
As you have noticed, the pickle module uses exactly the same scheme when pickling class instances, saving only the instance variables and not the source code of the class itself. From the pickle documentation:
This is done on purpose, so you can fix bugs in a class or add methods to the class and still load objects that were created with an earlier version of the class.
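One rough way to see this for yourself is to look inside the pickled bytes. This is a sketch rather than a formal inspection of the format: it only checks which names appear in the payload, and it assumes add_dot appends to the name as in the question.

```python
import pickle


class Person:
    def __init__(self, name):
        self.name = name

    def add_dot(self):
        self.name += '.'


pickled = pickle.dumps(Person('etienne'))

# The payload names the class and carries the instance attributes...
has_class_name = b'Person' in pickled
has_attribute = b'name' in pickled and b'etienne' in pickled
# ...but it contains nothing about the method or its body.
has_method = b'add_dot' in pickled

print(has_class_name, has_attribute, has_method)  # True True False
```

For a more readable view of the same payload, pickletools.dis(pickled) from the standard library prints the individual opcodes.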
CodePudding user response:
The short answer is no: pickle does not save the code of methods. It can pickle functions (built-in and user-defined) accessible from the top level of a module (using def, not lambda), but those are stored by fully qualified name, not by value.
The Python documentation states the types that can be pickled and unpickled are:
- None, True, and False;
- integers, floating-point numbers, complex numbers;
- strings, bytes, bytearrays;
- tuples, lists, sets, and dictionaries containing only picklable objects;
- functions (built-in and user-defined) accessible from the top level of a module (using def, not lambda);
- classes accessible from the top level of a module;
- instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).
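A few entries from this list can be tried out quickly. This is a minimal sketch: can_pickle and top_level are hypothetical helpers written for this example, not part of the pickle module.

```python
import pickle


def top_level(x):
    return x + 1


class Person:
    def __init__(self, name):
        self.name = name


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj (hypothetical helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


print(can_pickle({'a': [1, 2.0, None, b'x']}))  # container of picklable objects
print(can_pickle(top_level))                    # top-level def, pickled by name
print(can_pickle(Person))                       # top-level class, pickled by name
print(can_pickle(Person('etienne')))            # instance, its __dict__ is pickled
print(can_pickle(lambda x: x + 1))              # lambda, no importable name
```

When run as a script, only the lambda should be rejected, since pickle cannot look it up by name in its module.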
As per the Wikipedia definition, serialization is:
the process of translating a data structure or object state into a format that can be stored (for example, in a file or memory data buffer) or transmitted (for example, over a computer network) and reconstructed later (possibly in a different computer environment).
The pickle module simply wasn't built to serialize methods. Its main purpose is to serialize state, that is, the values of an object's attributes. The object is then re-instantiated when unpickling. As the method is part of the class definition, your code works fine, but only with the later definition of the class. Thus the value of etienne.name ends up being "etienne..".
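Here is a runnable sketch of that lookup, under the assumption that add_dot appends to the name (which is what the "etienne.." result implies). OldPerson is a name introduced here only to keep a handle on the first definition:

```python
import pickle


class Person:
    def __init__(self, name):
        self.name = name

    def add_dot(self):
        self.name += '.'


pickled = pickle.dumps(Person('etienne'))
OldPerson = Person  # keep a reference to the first definition


class Person:  # rebinds the name Person in the same module
    def __init__(self, name):
        self.name = name

    def add_dot(self):
        self.name += '..'


# loads() looks the class up by name, so it finds the second definition.
restored = pickle.loads(pickled)
restored.add_dot()

print(type(restored) is Person)     # True
print(type(restored) is OldPerson)  # False
print(restored.name)                # etienne..
```

The pickled bytes only say "an instance of Person with name = 'etienne'", so whichever class is bound to that name at load time supplies the methods.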
See also this other thread (Pickling a staticmethod in Python).