Commit a3f822b

Author: Ralf W. Grosse-Kunstleve
Documentation for pickle support.
[SVN r9417]
1 parent afdaa4d commit a3f822b

2 files changed: 224 additions and 1 deletion
doc/index.html

Lines changed: 1 addition & 1 deletion
@@ -116,7 +116,7 @@ <h2>Table of Contents</h2>
 <li>Advanced Topics
 
 <ol>
-<li>Pickling
+<li><a href="pickle.html">Pickle Support</a>
 
 <li>class_builder&lt;&gt;
 
doc/pickle.html

Lines changed: 223 additions & 0 deletions
@@ -0,0 +1,223 @@
+<html>
+<head>
+<title>BPL Pickle Support</title>
+</head>
+<body>
+
+<img src="../../../c++boost.gif"
+alt="c++boost.gif (8819 bytes)"
+align="center"
+width="277" height="86">
+
+</body>
+<hr>
+<h1>BPL Pickle Support</h1>
+
+Pickle is a Python module for object serialization, also known
+as persistence, marshalling, or flattening.
+
+<p>
+It is often necessary to save the contents of an object to a file and
+restore them later. One approach to this problem is to write a pair of
+functions that write and read the data in a special format. A powerful
+alternative approach is to use Python's pickle module. Exploiting
+Python's capacity for introspection, the pickle module recursively
+converts nearly arbitrary Python objects into a stream of bytes that
+can be written to a file.
+
+<p>
+The Boost Python Library supports the pickle module by emulating the
+interface implemented by Jim Fulton's ExtensionClass module that is
+included in the ZOPE distribution
+(<a href="http://www.zope.org/">http://www.zope.org/</a>).
+This interface is similar to that for regular Python classes as
+described in detail in the Python Library Reference for pickle:
+
+<blockquote>
+<a href="http://www.python.org/doc/current/lib/module-pickle.html"
+>http://www.python.org/doc/current/lib/module-pickle.html</a>
+</blockquote>
+
+<hr>
+<h1>The BPL Pickle Interface</h1>
+
+At the user level, the BPL pickle interface involves three special
+methods:
+
+<dl>
+<dt>
+<strong>__getinitargs__</strong>
+<dd>
+When an instance of a BPL extension class is pickled, the pickler
+tests whether the instance has a __getinitargs__ method. This method must
+return a Python tuple. When the instance is restored by the
+unpickler, the contents of this tuple are used as the arguments for
+the class constructor.
+
+<p>
+If __getinitargs__ is not defined, the class constructor will be
+called without arguments.
+
+<p>
+<dt>
+<strong>__getstate__</strong>
+
+<dd>
+When an instance of a BPL extension class is pickled, the pickler
+tests whether the instance has a __getstate__ method. This method should
+return a Python object representing the state of the instance.
+
+<p>
+If __getstate__ is not defined, the instance's __dict__ is pickled
+(if it is not empty).
+
+<p>
+<dt>
+<strong>__setstate__</strong>
+
+<dd>
+When an instance of a BPL extension class is restored by the
+unpickler, it is first constructed using the result of
+__getinitargs__ as arguments (see above). Subsequently the unpickler
+tests whether the new instance has a __setstate__ method. If so, this
+method is called with the result of __getstate__ (a Python object) as
+the argument.
+
+<p>
+If __setstate__ is not defined, the result of __getstate__ must be
+a Python dictionary. The items of this dictionary are added to
+the instance's __dict__.
+</dl>
+
+If both __getstate__ and __setstate__ are defined, the Python object
+returned by __getstate__ need not be a dictionary. The __getstate__ and
+__setstate__ methods are free to use any representation they agree on.
+
+<hr>
+<h1>Pitfalls and Safety Guards</h1>
+
+In BPL extension modules with many extension classes, providing
+complete pickle support for all classes would be a significant
+overhead. In general, complete pickle support should only be implemented
+for extension classes that will eventually be pickled. However, the
+author of a BPL extension module might not anticipate correctly which
+classes need support for pickle. Unfortunately, the pickle protocol
+described above has two important pitfalls that the end user of a BPL
+extension module might not be aware of:
+
+<dl>
+<dt>
+<strong>Pitfall 1:</strong>
+Neither __getinitargs__ nor __getstate__ is defined.
+
+<dd>
+In this situation the unpickler calls the class constructor without
+arguments and then adds the __dict__ that was pickled by default to
+that of the new instance.
+
+<p>
+However, most C++ classes wrapped with the BPL will have member data
+that are not restored correctly by this procedure. To alert the user
+to this problem, a safety guard is provided. If neither __getinitargs__
+nor __getstate__ is defined, the BPL tests whether the class has an
+attribute __dict_defines_state__. An exception is raised if this
+attribute is not defined:
+
+<pre>
+RuntimeError: Incomplete pickle support (__dict_defines_state__ not set)
+</pre>
+
+In the rare cases where the instance's __dict__ does define its complete
+state, the safety guard can deliberately be disabled. The corresponding
+C++ code is, for example:
+
+<pre>
+class_builder&lt;your_class&gt; py_your_class(your_module, "your_class");
+py_your_class.dict_defines_state();
+</pre>
+
+It is also possible to override the safety guard at the Python level.
+For example:
+
+<pre>
+import your_bpl_module
+class your_class(your_bpl_module.your_class):
+    __dict_defines_state__ = 1
+</pre>
+
+<p>
+<dt>
+<strong>Pitfall 2:</strong>
+__getstate__ is defined and the instance's __dict__ is not empty.
+
+<dd>
+The author of a BPL extension class might provide a __getstate__
+method without considering the possibilities that:
+
+<p>
+<ul>
+<li>
+the class is used as a base class. Most likely the __dict__ of
+instances of the derived class needs to be pickled in order to
+restore the instances correctly.
+
+<p>
+<li>
+the user adds items to the instance's __dict__ directly. Again,
+the __dict__ of the instance then needs to be pickled.
+</ul>
+<p>
+
+To alert the user to this far from obvious problem, a safety guard is
+provided. If __getstate__ is defined and the instance's __dict__ is
+not empty, the BPL tests whether the class has an attribute
+__getstate_manages_dict__. An exception is raised if this attribute
+is not defined:
+
+<pre>
+RuntimeError: Incomplete pickle support (__getstate_manages_dict__ not set)
+</pre>
+
+To resolve this problem, it should first be established that the
+__getstate__ and __setstate__ methods manage the instance's __dict__
+correctly. Note that this can be done at either the C++ or the Python
+level. The safety guard should then be overridden deliberately.
+For example, in C++:
+
+<pre>
+class_builder&lt;your_class&gt; py_your_class(your_module, "your_class");
+py_your_class.getstate_manages_dict();
+</pre>
+
+In Python:
+
+<pre>
+import your_bpl_module
+class your_class(your_bpl_module.your_class):
+    __getstate_manages_dict__ = 1
+    def __getstate__(self):
+        pass  # your code here
+    def __setstate__(self, state):
+        pass  # your code here
+</pre>
+</dl>
+
+<hr>
+<h1>Practical Advice</h1>
+
+<ul>
+<li>
+Avoid using __getstate__ if the instance can also be reconstructed
+by way of __getinitargs__. This automatically avoids Pitfall 2.
+
+<p>
+<li>
+If __getstate__ is required, include the instance's __dict__ in the
+Python object that is returned.
+</ul>
+
+<hr>
+<address>
+Author: Ralf W. Grosse-Kunstleve, March 2001
+</address>
+</html>
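
Note on the interface described in doc/pickle.html: the save and restore sequence built from __getinitargs__, __getstate__ and __setstate__ can be sketched in plain Python. This is only an illustration under stated assumptions, not BPL code and not the pickle module itself. The names Point, Counter, flatten, rebuild, defines and NO_STATE are hypothetical, and slots stand in for C++ member data that lives outside the instance's __dict__.

```python
NO_STATE = object()  # sentinel: nothing beyond the constructor arguments was saved


def defines(obj, name):
    """True if obj's class or a base other than object defines name.

    object is skipped because recent Python versions give it a default
    __getstate__, which would hide the "not defined" case in the text.
    """
    return any(name in vars(klass)
               for klass in type(obj).__mro__ if klass is not object)


def flatten(obj):
    """Collect (class, initargs, state) following the rules in the text."""
    initargs = obj.__getinitargs__() if defines(obj, "__getinitargs__") else ()
    if defines(obj, "__getstate__"):
        state = obj.__getstate__()
    elif vars(obj):
        state = dict(vars(obj))  # default: the non-empty __dict__ is saved
    else:
        state = NO_STATE
    return type(obj), initargs, state


def rebuild(cls, initargs, state):
    """Construct first, then apply the saved state."""
    obj = cls(*initargs)
    if state is not NO_STATE:
        if defines(obj, "__setstate__"):
            obj.__setstate__(state)
        else:
            vars(obj).update(state)  # state must be a dictionary here
    return obj


class Point:
    """Hypothetical stand-in for a wrapped C++ class."""
    __slots__ = ("_x", "_y", "__dict__")  # slots play the role of C++ members

    def __init__(self, x=0.0, y=0.0):
        self._x = x
        self._y = y

    def __getinitargs__(self):
        return (self._x, self._y)  # must be a tuple


class Counter:
    """Hypothetical class whose state cannot be passed to the constructor."""
    __slots__ = ("_count", "__dict__")

    def __init__(self):
        self._count = 0

    def increment(self):
        self._count += 1

    def __getstate__(self):
        # Include the __dict__ so attributes added by users survive.
        return (self._count, dict(vars(self)))

    def __setstate__(self, state):
        self._count, extra = state
        vars(self).update(extra)


p = Point(1.5, -2.0)
p.label = "shifted"
q = rebuild(*flatten(p))

c = Counter()
c.increment()
c.increment()
c.note = "x"
d = rebuild(*flatten(c))
```

Point follows the first item of the practical advice (constructor arguments only), while Counter follows the second by returning its __dict__ as part of the state.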

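Note on the two safety guards described in doc/pickle.html: the checks can be emulated in a few lines of plain Python. This is a sketch of the documented behaviour only. The function check_pickle_guards, the helper defines and the four example classes are hypothetical and do not reproduce the BPL implementation; only the two error messages and the two attribute names are taken from the documentation.

```python
def defines(obj, name):
    """True if obj's class or a base other than object defines name."""
    return any(name in vars(klass)
               for klass in type(obj).__mro__ if klass is not object)


def check_pickle_guards(obj):
    """Raise RuntimeError in the two situations the documentation calls pitfalls."""
    cls = type(obj)
    has_getstate = defines(obj, "__getstate__")
    # Pitfall 1: neither __getinitargs__ nor __getstate__ is defined.
    if not has_getstate and not defines(obj, "__getinitargs__"):
        if not hasattr(cls, "__dict_defines_state__"):
            raise RuntimeError(
                "Incomplete pickle support (__dict_defines_state__ not set)")
    # Pitfall 2: __getstate__ is defined and the __dict__ is not empty.
    if has_getstate and vars(obj):
        if not hasattr(cls, "__getstate_manages_dict__"):
            raise RuntimeError(
                "Incomplete pickle support (__getstate_manages_dict__ not set)")


class Wrapped:
    """Hypothetical wrapped class without any pickle methods."""
    __slots__ = ("_value", "__dict__")  # the slot mimics C++ member data

    def __init__(self):
        self._value = 0


class Trusted(Wrapped):
    __dict_defines_state__ = 1  # guard 1 overridden at the Python level


class WithState(Wrapped):
    def __getstate__(self):
        return self._value  # ignores the __dict__

    def __setstate__(self, state):
        self._value = state


class ManagesDict(WithState):
    __getstate_manages_dict__ = 1  # guard 2 overridden after handling __dict__

    def __getstate__(self):
        return (self._value, dict(vars(self)))

    def __setstate__(self, state):
        self._value, extra = state
        vars(self).update(extra)


def guard_message(obj):
    """Return the guard's error message, or None if the object passes."""
    try:
        check_pickle_guards(obj)
    except RuntimeError as exc:
        return str(exc)
    return None


plain = WithState()          # __dict__ still empty: guard 2 stays silent
tagged = WithState()
tagged.tag = 1               # user adds an attribute: guard 2 fires
managed = ManagesDict()
managed.tag = 1
```

In this sketch guard 2 depends on the instance, not only on the class: the same WithState class passes until something is added to an instance's __dict__.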