@@ -7,7 +7,7 @@ Copy Management
77.. warning:: `postgresql.copyman` is a new feature in v1.0.
88
99The `postgresql.copyman` module provides a way to quickly move COPY data coming
10- from one connection to many connections. Alternatively, it can also be sourced
10+ from one connection to many connections. Alternatively, it can be sourced
1111by arbitrary iterators and target arbitrary callables.
1212
1313Statement execution methods offer a way for running COPY operations
@@ -37,13 +37,13 @@ The `postgresql.copyman.CopyManager` class manages the Producer and the
3737Receivers involved in a COPY operation. Normally, these are
3838`postgresql.copyman.StatementProducer` and
3939`postgresql.copyman.StatementReceiver` instances. Naturally, a Producer is the
40- object that produces the COPY data to be given to the manager's Receivers.
40+ object that produces the COPY data to be given to the Manager's Receivers.
4141
42- Using a CopyManager directly means that there is a need for more control over
42+ Using a Manager directly means that there is a need for more control over
4343the operation. The Manager is both a context manager and an iterator. The
44- context manager interfaces handle initialization and finalization, and the
45- iterator provides an event loop emitting information about the amount of
46- COPY data copied this cycle. Normal usage takes the form::
44+ context manager interfaces handle initialization and finalization of the COPY
45+ state, and the iterator provides an event loop emitting information about the
46+ amount of COPY data transferred this cycle. Normal usage takes the form::
4747
4848 >>> from postgresql import copyman
4949 >>> send_stmt = source.prepare("COPY (SELECT i FROM generate_series(1, 1000000) AS g(i)) TO STDOUT")
@@ -57,15 +57,14 @@ COPY data copied this cycle. Normal usage takes the form::
5757 ... for num_messages, num_bytes in copy:
5858 ... update_rate(num_bytes)
5959
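The examples in this section call ``update_rate`` without defining it; it is a placeholder supplied by the application, not part of `postgresql.copyman`. A minimal sketch of such a callback, assuming only that it receives the ``num_bytes`` value of each transfer cycle (the ``RateTracker`` name and its attributes are hypothetical):

```python
import time


class RateTracker:
    """Accumulate byte counts and report an average transfer rate."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._started = clock()
        self.total_bytes = 0
        self.cycles = 0

    def update(self, num_bytes):
        # Called once per transfer cycle with the bytes moved in that cycle.
        self.total_bytes += num_bytes
        self.cycles += 1

    def bytes_per_second(self):
        elapsed = self._clock() - self._started
        # Avoid dividing by zero when no measurable time has passed.
        return self.total_bytes / elapsed if elapsed > 0 else 0.0


tracker = RateTracker()
update_rate = tracker.update  # the name used by the examples in this section
```

The clock is injectable so that the rate calculation can be exercised without waiting on real time.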
60- The use of the context manager is necessary for ensuring that connection state
61- is properly restored at the end of the COPY.
62-
6360As an alternative to a for-loop inside a with-statement block, the `run` method
6461can be called to perform the operation::
6562
6663 >>> with source.xact(), destination.xact():
6764 ... copyman.CopyManager(producer, receiver).run()
6865
66+ However, this offers little benefit over using the high-level
67+ `postgresql.copyman.transfer` function.
6968
7069Manager Interface Points
7170------------------------
@@ -82,30 +81,54 @@ an iterator for controlling the COPY operation.
8281
8382 ``CopyManager.__exit__(typ, val, tb)``
8483 Finish the COPY operation. Fails in the case of an incomplete
85- COPY, or an untrapped exception.
84+ COPY, or an untrapped exception. Either returns `None` or raises the generalized
85+ exception, `postgresql.copyman.CopyFail`.
8686
8787 ``CopyManager.__iter__()``
8888 Returns the CopyManager instance.
8989
9090 ``CopyManager.__next__()``
9191 Transfer the next chunk of COPY data to the receivers. Yields a tuple
92- consisting of the number of messages and bytes transferred. Raises
93- `StopIteration` when complete.
92+ consisting of the number of messages and bytes transferred,
93+ ``(num_messages, num_bytes)``. Raises `StopIteration` when complete.
94+
95+ Raises `postgresql.copyman.ReceiverFault` when a Receiver raises an
96+ exception.
97+ Raises `postgresql.copyman.ProducerFault` when the Producer raises an
98+ exception. The original exception is available via the exception's
99+ ``__context__`` attribute.
94100
95101 ``CopyManager.reconcile(faulted_receiver)``
96102 Reconcile a faulted receiver. When a receiver faults, it will no longer
97- be in the receiver set. This method is used to signal to the manager that the
98- problem has been cleared up, and the receiver is again ready to receive.
103+ be in the set of Receivers. This method is used to signal to the manager that the
104+ problem has been corrected, and the receiver is again ready to receive.
105+
106+ ``CopyManager.receivers``
107+ The `builtins.set` of Receivers involved in the COPY operation.
108+
109+ ``CopyManager.producer``
110+ The Producer emitting the data to be given to the Receivers.
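The shape of the interface points listed above can be illustrated with a toy class. This is a sketch of the protocol only, not the real `postgresql.copyman.CopyManager`: ``ToyCopyManager``, its constructor arguments, and the use of ``RuntimeError`` are all assumptions made for illustration, and plain callables stand in for Receivers.

```python
class ToyCopyManager:
    """Toy stand-in that mimics the documented protocol; NOT the real CopyManager."""

    def __init__(self, chunks, receivers):
        self._chunks = iter(chunks)      # stands in for the Producer
        self.receivers = set(receivers)  # callables standing in for Receivers
        self._exhausted = False

    def __enter__(self):
        return self

    def __exit__(self, typ, val, tb):
        # Mirror the documented rule: an incomplete COPY is a failure.
        if typ is None and not self._exhausted:
            raise RuntimeError("incomplete COPY")
        return None

    def __iter__(self):
        return self

    def __next__(self):
        try:
            chunk = next(self._chunks)
        except StopIteration:
            self._exhausted = True
            raise
        for receive in self.receivers:
            receive(chunk)
        # (num_messages, num_bytes) for this transfer cycle.
        return (len(chunk), sum(len(line) for line in chunk))


received = []


def collect(chunk):
    received.extend(chunk)


totals = [0, 0]
with ToyCopyManager([[b"1\n", b"2\n"], [b"3\n"]], [collect]) as copy:
    for num_messages, num_bytes in copy:
        totals[0] += num_messages
        totals[1] += num_bytes
```

Leaving the ``with`` block before the iterator is exhausted makes the toy's ``__exit__`` raise, which is the behavior the ``__exit__`` entry describes for an incomplete COPY.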
99111
100112
101113Faults
102114======
103115
104- The CopyManager generalizes some exceptions that occur during transfer. While
116+ The CopyManager generalizes any exceptions that occur during transfer. While
105117inside the context manager, `postgresql.copyman.Fault` may be raised if a
106- Receiver raises an exception. The Manager assumes the Fault is fatal to a
107- Receiver, and immediately removes it from the set of target receivers.
108- Additionally, if the Fault goes untrapped, the copy will fail.
118+ Receiver or a Producer raises an exception: a `postgresql.copyman.ProducerFault`
119+ in the case of the Producer, and a `postgresql.copyman.ReceiverFault` in the case
120+ of a Receiver.
121+
122+ .. note::
123+ Faults are only raised by `postgresql.copyman.CopyManager.__next__`. The
124+ ``run()`` method will always raise `postgresql.copyman.CopyFail`.
125+
126+ Receiver Faults
127+ ---------------
128+
129+ The Manager assumes the Fault is fatal to a Receiver, and immediately removes
130+ it from the set of target receivers. Additionally, if the Fault goes untrapped,
131+ the copy will ultimately fail.
109132
110133The Fault exception references the Manager that raised the exception, and the
111134actual exceptions that occurred, associated with the Receiver that caused them::
@@ -123,36 +146,38 @@ actual exceptions that occurred, associated with the Receiver that caused them::
123146 ... try:
124147 ... for num_messages, num_bytes in copy:
125148 ... update_rate(num_bytes)
126- ... except copyman.Fault as cf:
149+ ... except copyman.ReceiverFault as cf:
150+ ... # Access the original exception using the receiver as the key.
127151 ... original_exception = cf.faults[receiver]
128152 ... if unknown_failure(original_exception):
129153 ... ...
130154 ... raise
131155
132156
133- Fault Properties
134- ----------------
157+ ReceiverFault Properties
158+ ~~~~~~~~~~~~~~~~~~~~~~~~
135159
136- The following attributes exist on `postgresql.copyman.Fault` instances:
160+ The following attributes exist on `postgresql.copyman.ReceiverFault` instances:
137161
138- ``Fault.manager``
139- The `postgresql.copyman.CopyManager` instance that raised the exception; the
140- same manager that caught the fault.
162+ ``ReceiverFault.manager``
163+ The subject `postgresql.copyman.CopyManager` instance.
141164
142- ``Fault.faults``
143- A dictionary mapping the Receiver to the exception that occurred. The Manager
144- will give processing to every Receiver, so only one Fault will occur per
145- transfer cycle.
165+ ``ReceiverFault.faults``
166+ A dictionary mapping the Receiver to the exception raised by that Receiver.
167+ The Manager will give processing time to every Receiver, so only *one* Fault will
168+ occur per transfer cycle, that is, per iteration.
146169
147- Reconciliation
148- --------------
149170
150- When a Fault occurs, it is possible that it was not fatal. In such cases the
151- `postgresql.copyman.CopyManager.reconcile` method can be used to reintroduce the
152- Receiver to the Manager's set. That is, when a Fault occurs, the Manager
153- immediately removes the Receiver so that the COPY operation can continue.
171+ Reconciliation
172+ ~~~~~~~~~~~~~~
154173
155- Faults should be trapped from within the Manager's context::
174+ When a `postgresql.copyman.ReceiverFault` is raised, the Manager immediately
175+ removes the Receiver so that the COPY operation can continue. Continuation of
176+ the COPY can occur by trapping the exception and continuing the iteration of the
177+ Manager. However, if the fault is recoverable, the
178+ `postgresql.copyman.CopyManager.reconcile` method must be used to reintroduce the
179+ Receiver into the Manager's set. Faults should be trapped from within the
180+ Manager's context::
156181
157182 >>> import socket
158183 >>> from postgresql import copyman
@@ -168,7 +193,7 @@ Faults should be trapped from within the Manager's context::
168193 ... try:
169194 ... for num_messages, num_bytes in copy:
170195 ... update_rate(num_bytes)
171- ... except copyman.Fault as cf:
196+ ... except copyman.ReceiverFault as cf:
172197 ... if isinstance(cf.faults[receiver], socket.timeout):
173198 ... copy.reconcile(receiver)
174199 ... else:
@@ -179,6 +204,82 @@ so, often, it's best to avoid conditions in which reconcilable Faults may
179204occur.
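The remove-then-reconcile cycle described in this section can be sketched without a database. Everything below is a toy: ``ToyReceiverFault``, ``ToyManager``, and ``FlakyReceiver`` are hypothetical stand-ins that only imitate the documented behavior (a faulting Receiver is dropped from the set, and ``reconcile`` puts it back).

```python
class ToyReceiverFault(Exception):
    """Hypothetical stand-in for a receiver fault; carries a `faults` mapping."""

    def __init__(self, faults):
        super().__init__("receiver fault")
        self.faults = faults  # {receiver: exception}


class ToyManager:
    def __init__(self, chunks, receivers):
        self._chunks = iter(chunks)
        self.receivers = set(receivers)

    def reconcile(self, receiver):
        # Put a previously faulted receiver back into the active set.
        self.receivers.add(receiver)

    def __iter__(self):
        return self

    def __next__(self):
        chunk = next(self._chunks)
        faults = {}
        for receiver in list(self.receivers):
            try:
                receiver(chunk)
            except Exception as exc:
                # A faulting receiver is dropped before the fault is reported.
                self.receivers.discard(receiver)
                faults[receiver] = exc
        if faults:
            raise ToyReceiverFault(faults)
        return len(chunk)


class FlakyReceiver:
    """Receiver that raises a simulated timeout on its second call."""

    def __init__(self):
        self.lines = []
        self.calls = 0

    def __call__(self, chunk):
        self.calls += 1
        if self.calls == 2:
            raise TimeoutError("simulated timeout")
        self.lines.extend(chunk)


receiver = FlakyReceiver()
copy = ToyManager([[b"1\n"], [b"2\n"], [b"3\n"]], [receiver])
while True:
    try:
        next(copy)
    except StopIteration:
        break
    except ToyReceiverFault as cf:
        if isinstance(cf.faults[receiver], TimeoutError):
            copy.reconcile(receiver)  # recoverable: rejoin the set
        else:
            raise
```

In this toy, the chunk that was in flight when the fault occurred is simply skipped for that receiver; how the real Manager treats that data is not modeled here.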
180205
181206
207+ Producer Faults
208+ ---------------
209+
210+ Producer faults are normally fatal to the COPY operation and should rarely be
211+ trapped. However, the Manager makes no state changes when a Producer faults,
212+ so, unlike Receiver Faults, no reconciliation process is necessary; rather,
213+ if it's safe to continue, the Manager's iterator should continue to be
214+ processed.
215+
216+ ProducerFault Properties
217+ ~~~~~~~~~~~~~~~~~~~~~~~~
218+
219+ The following attributes exist on `postgresql.copyman.ProducerFault` instances:
220+
221+ ``ProducerFault.manager``
222+ The subject `postgresql.copyman.CopyManager`.
223+
224+ ``ProducerFault.__context__``
225+ The original exception raised by the Producer.
226+
227+
228+ Failures
229+ ========
230+
231+ When a COPY operation is aborted, either by an exception or by the iterator
232+ being broken, a `postgresql.copyman.CopyFail` exception will be raised,
233+ generalizing the failure. When a failure occurs, the Manager will *attempt* to
234+ recover and realign the Producer and the Receivers. Regardless of the success of
235+ the recovery process, a `postgresql.copyman.CopyFail` exception will be raised.
236+
237+ A `postgresql.copyman.CopyFail` exception records any exceptions that occur
238+ during the exit of the context manager.
239+
240+
241+ CopyFail Properties
242+ -------------------
243+
244+ The following properties exist on `postgresql.copyman.CopyFail` exceptions:
245+
246+ ``CopyFail.manager``
247+ The Manager whose COPY operation failed.
248+
249+ ``CopyFail.receiver_faults``
250+ A dictionary mapping a `postgresql.copyman.Receiver` to the exception raised
251+ by that Receiver's ``__exit__``. `None` if no exceptions were raised by the
252+ Receivers.
253+
254+ ``CopyFail.producer_fault``
255+ The exception raised by the `postgresql.copyman.Producer`. `None` if none.
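One way to use these properties is to fold them into a single log-friendly summary. The sketch below relies only on the three attribute names listed above; ``describe_copy_failure`` is a hypothetical helper, and a ``SimpleNamespace`` stands in for a real `postgresql.copyman.CopyFail` instance so that the example runs without a connection.

```python
from types import SimpleNamespace


def describe_copy_failure(failure):
    """Summarize an object exposing the CopyFail attributes listed above."""
    parts = []
    if failure.producer_fault is not None:
        parts.append("producer: %r" % (failure.producer_fault,))
    # receiver_faults is documented as None when no Receiver raised.
    for receiver, exc in (failure.receiver_faults or {}).items():
        parts.append("receiver %r: %r" % (receiver, exc))
    return "; ".join(parts) or "no recorded faults"


# SimpleNamespace stands in for a real CopyFail instance in this sketch.
example = SimpleNamespace(
    manager=None,
    producer_fault=None,
    receiver_faults={"r1": ValueError("bad line")},
)
summary = describe_copy_failure(example)
```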
256+
257+
258+ Producers
259+ =========
260+
261+ The following Producers are available:
262+
263+ ``postgresql.copyman.StatementProducer(postgresql.api.Statement)``
264+ Given a Statement producing COPY data, construct a Producer.
265+
266+ ``postgresql.copyman.IteratorProducer(collections.Iterator)``
267+ Given an Iterator producing *chunks* of COPY lines, construct a Producer to
268+ manage the data coming from the iterator.
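An iterator of the kind described for ``IteratorProducer`` yields *chunks*, each chunk being a list of COPY lines. A minimal sketch of building such chunks from Python rows follows; it assumes text-format COPY lines encoded as ``bytes``, performs no escaping of tabs, backslashes, or NULLs, and ``copy_chunks`` is a hypothetical helper rather than part of `postgresql.copyman`.

```python
from itertools import islice


def copy_chunks(rows, chunk_size=2):
    """Yield lists of COPY text-format lines, `chunk_size` lines per list."""
    lines = (
        "\t".join(str(field) for field in row).encode("utf-8") + b"\n"
        for row in rows
    )
    while True:
        chunk = list(islice(lines, chunk_size))
        if not chunk:
            return
        yield chunk


chunks = list(copy_chunks([(1, "a"), (2, "b"), (3, "c")]))
# chunks == [[b"1\ta\n", b"2\tb\n"], [b"3\tc\n"]]
```

Grouping lines into chunks keeps the number of transfer cycles small relative to the number of rows.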
269+
270+
271+ Receivers
272+ =========
273+
274+ ``postgresql.copyman.StatementReceiver(postgresql.api.Statement)``
275+ Given a Statement consuming COPY data, construct a Receiver.
276+
277+ ``postgresql.copyman.CallReceiver(callable)``
278+ Given a callable, construct a Receiver that will transmit COPY data in chunks
279+ of lines. That is, the callable will be given a list of COPY lines for each
280+ transfer cycle.
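Since the callable is handed one list of COPY lines per transfer cycle, a small callable object is enough to collect or parse the data. The sketch below assumes text-format lines delivered as ``bytes``; ``LineCollector`` is a hypothetical name, and the calls at the end simulate two transfer cycles by hand instead of going through a Manager.

```python
class LineCollector:
    """Callable that accepts one list of COPY lines per transfer cycle."""

    def __init__(self):
        self.rows = []
        self.cycles = 0

    def __call__(self, lines):
        self.cycles += 1
        for line in lines:
            # Split a text-format COPY line into its tab-separated fields.
            self.rows.append(line.rstrip(b"\n").split(b"\t"))


collector = LineCollector()
collector([b"1\ta\n", b"2\tb\n"])  # first simulated cycle
collector([b"3\tc\n"])             # second simulated cycle
```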
281+
282+
182283Terminology
183284===========
184285
@@ -207,13 +308,13 @@ processes of the `postgresql.copyman` module:
207308 necessary steps for a Receiver's reintroduction into the COPY operation after
208309 a Fault.
209310
210- Realignment
211- The process of providing compensating data to the receivers so that the
212- connection will be on a message boundary. Occurs when the COPY operation
213- fails.
214-
215311 Failed Copy
216312 A failed copy is an aborted COPY operation. This occurs in situations of
217313 untrapped exceptions or an incomplete COPY. Specifically, the COPY will be
218314 noted as failed in cases where the Manager's iterator is *not* run until
219315 exhaustion.
316+
317+ Realignment
318+ The process of providing compensating data to the receivers so that the
319+ connection will be on a message boundary. Occurs when the COPY operation
320+ fails.