Commit c57765b

Steve Canny committed: docs: document breaks analysis
1 parent d490871, commit c57765b

3 files changed

Lines changed: 442 additions & 0 deletions

Lines changed: 269 additions & 0 deletions

Breaks
======

Word supports a variety of breaks that interrupt the flow of text in the
document:

* line break
* page break
* column break
* section break (new page, even page, odd page)

In addition, a page break can be forced by formatting a paragraph with the
"page break before" setting.

This analysis is limited to line, page, and column breaks. A section break is
implemented using a completely different set of elements and is covered
separately.


Candidate protocol -- run.add_break()
-------------------------------------

The following interactive session demonstrates the protocol for adding a
break to a run. A line break is added by default; other break types are
requested by passing a member of the enumeration::

    >>> run = p.add_run()
    >>> run.breaks
    []

    >>> run.add_break()  # by default adds WD_BREAK.LINE
    >>> run.breaks
    [<docx.text.Break object at 0x10a7c4f50>]
    >>> run.breaks[0].type.__name__
    WD_BREAK.LINE

    >>> run.add_break(WD_BREAK.LINE)
    >>> run.breaks
    [<docx.text.Break object at 0x10a7c4f50>, <docx.text.Break object at 0x10a7c4f58>]

    >>> run.add_break(WD_BREAK.PAGE)
    >>> run.add_break(WD_BREAK.COLUMN)
    >>> run.add_break(WD_BREAK.LINE_CLEAR_LEFT)
    >>> run.add_break(WD_BREAK.LINE_CLEAR_RIGHT)
    >>> run.add_break(WD_BREAK.TEXT_WRAPPING)
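
As a rough sketch of what such a protocol might do underneath, the following
maps each break type onto the ``type`` and ``clear`` attributes of a
``<w:br>`` element using only the standard library. The ``WD_BREAK`` class,
``qn()`` and ``add_break()`` shown here are hypothetical stand-ins, not the
library's implementation, and writing ``w:type="textWrapping"`` for the
clearing variants is an assumption:

```python
import enum
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def qn(tag):
    """Turn a prefixed name like 'w:br' into ElementTree's Clark notation."""
    prefix, local = tag.split(":")
    if prefix != "w":
        raise ValueError("only the w: prefix is handled in this sketch")
    return "{%s}%s" % (W_NS, local)


class WD_BREAK(enum.Enum):
    # Hypothetical stand-in: each value is a (w:type, w:clear) pair,
    # where None means "omit the attribute".
    LINE = (None, None)
    LINE_CLEAR_LEFT = ("textWrapping", "left")    # assumed mapping
    LINE_CLEAR_RIGHT = ("textWrapping", "right")  # assumed mapping
    TEXT_WRAPPING = ("textWrapping", "all")       # assumed mapping
    PAGE = ("page", None)
    COLUMN = ("column", None)


def add_break(r, break_type=WD_BREAK.LINE):
    """Append a <w:br> child to run element `r` and return it."""
    type_, clear = break_type.value
    br = ET.SubElement(r, qn("w:br"))
    if type_ is not None:
        br.set(qn("w:type"), type_)
    if clear is not None:
        br.set(qn("w:clear"), clear)
    return br


r = ET.Element(qn("w:r"))
add_break(r)                            # plain line break, no attributes
add_break(r, WD_BREAK.PAGE)
add_break(r, WD_BREAK.LINE_CLEAR_LEFT)
print(ET.tostring(r, encoding="unicode"))
```

Section breaks are left out of the mapping because, as noted earlier, they
are implemented with a different set of elements.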


Enumeration -- WD_BREAK_TYPE
----------------------------

* WD_BREAK.LINE
* WD_BREAK.LINE_CLEAR_LEFT
* WD_BREAK.LINE_CLEAR_RIGHT
* WD_BREAK.TEXT_WRAPPING (i.e. LINE_CLEAR_ALL)

* WD_BREAK.PAGE

* WD_BREAK.COLUMN

* WD_BREAK.SECTION_NEXT_PAGE
* WD_BREAK.SECTION_CONTINUOUS
* WD_BREAK.SECTION_EVEN_PAGE
* WD_BREAK.SECTION_ODD_PAGE


Specimen XML
------------

.. highlight:: xml


Line break
~~~~~~~~~~

This XML is produced by Word after inserting a line break with Shift-Enter::

    <w:p>
      <w:r>
        <w:t>Text before</w:t>
      </w:r>
      <w:r>
        <w:br/>
        <w:t>and after line break</w:t>
      </w:r>
    </w:p>

Word also loads the more straightforward form below just fine, although it
changes it back to the form above on the next save. I'm not sure what
advantage there is in starting a fresh run so that the ``<w:br/>`` element is
its first child::

    <w:p>
      <w:r>
        <w:t>Text before</w:t>
        <w:br/>
        <w:t>and after line break</w:t>
      </w:r>
    </w:p>
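
For illustration, the single-run form could be generated along these lines
with the standard library's ``xml.etree.ElementTree``. The helper name
``paragraph_with_line_break()`` is made up for this sketch and is not part of
any proposed API:

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)
W = "{%s}" % W_NS  # Clark-notation prefix for element names


def paragraph_with_line_break(before, after):
    """Build <w:p> holding one run: text, a line break, then more text."""
    p = ET.Element(W + "p")
    r = ET.SubElement(p, W + "r")
    ET.SubElement(r, W + "t").text = before
    ET.SubElement(r, W + "br")  # no attributes means a plain line break
    ET.SubElement(r, W + "t").text = after
    return p


p = paragraph_with_line_break("Text before", "and after line break")
print(ET.tostring(p, encoding="unicode"))
```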


Page break
~~~~~~~~~~

Starting with this XML ... ::

    <w:p>
      <w:r>
        <w:t>Before inserting a page break, the cursor was here }</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>This was the following paragraph, the last in the document</w:t>
      </w:r>
    </w:p>


... this XML is produced by Word on inserting a hard page break::

    <w:p>
      <w:r>
        <w:t>Before inserting a page break, the cursor was here }</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:br w:type="page"/>
      </w:r>
    </w:p>
    <w:p>
      <w:bookmarkStart w:id="0" w:name="_GoBack"/>
      <w:bookmarkEnd w:id="0"/>
    </w:p>
    <w:p>
      <w:r>
        <w:t>This was the following paragraph, the last in the document</w:t>
      </w:r>
    </w:p>

Word loads the following simplified form fine ... ::

    <w:p>
      <w:r>
        <w:t>Text before an intra-run page break</w:t>
        <w:br w:type="page"/>
        <w:t>Text after an intra-run page break</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>following paragraph</w:t>
      </w:r>
    </w:p>

... although on saving it converts it to this::

    <w:p>
      <w:r>
        <w:t>Text before an intra-run page break</w:t>
      </w:r>
      <w:r>
        <w:br w:type="page"/>
      </w:r>
      <w:r>
        <w:lastRenderedPageBreak/>
        <w:t>Text after an intra-run page break</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>following paragraph</w:t>
      </w:r>
    </w:p>
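
The rewrite Word performs on save can be approximated as follows. This is
only a sketch of the run-splitting step under simplifying assumptions: the
function name ``split_runs_at_page_breaks()`` is hypothetical, run properties
and attributes are not copied to the new runs, and no
``<w:lastRenderedPageBreak/>`` is added, since that element records where
Word last rendered the break:

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)
W = "{%s}" % W_NS  # Clark-notation prefix for element names


def is_page_break(elm):
    """True for <w:br w:type="page"/>."""
    return elm.tag == W + "br" and elm.get(W + "type") == "page"


def split_runs_at_page_breaks(p):
    """Rewrite paragraph `p` so each page break sits alone in its own run."""
    new_children = []
    for child in list(p):
        has_break = any(is_page_break(item) for item in child)
        if child.tag != W + "r" or not has_break:
            new_children.append(child)  # keep untouched
            continue
        current = None  # run currently collecting non-break content
        for item in list(child):
            if is_page_break(item):
                run = ET.Element(W + "r")
                run.append(item)
                new_children.append(run)
                current = None  # content after the break starts a new run
            else:
                if current is None:
                    current = ET.Element(W + "r")
                    new_children.append(current)
                current.append(item)
    for child in list(p):
        p.remove(child)
    p.extend(new_children)
    return p


SIMPLIFIED = (
    '<w:p xmlns:w="%s"><w:r>'
    '<w:t>Text before an intra-run page break</w:t>'
    '<w:br w:type="page"/>'
    '<w:t>Text after an intra-run page break</w:t>'
    '</w:r></w:p>' % W_NS
)
p = split_runs_at_page_breaks(ET.fromstring(SIMPLIFIED))
print(ET.tostring(p, encoding="unicode"))
```

Runs that contain only line breaks are left as they are, so this does not
reproduce the separate-run form Word writes for a plain ``<w:br/>``.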


Schema excerpt
--------------

.. highlight:: xml

::

    <xsd:complexType name="CT_R">
      <xsd:sequence>
        <xsd:group ref="EG_RPr" minOccurs="0"/>
        <xsd:group ref="EG_RunInnerContent" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
      <xsd:attribute name="rsidRPr" type="ST_LongHexNumber"/>
      <xsd:attribute name="rsidDel" type="ST_LongHexNumber"/>
      <xsd:attribute name="rsidR" type="ST_LongHexNumber"/>
    </xsd:complexType>

    <xsd:group name="EG_RunInnerContent">
      <xsd:choice>
        <xsd:element name="br" type="CT_Br"/>
        <xsd:element name="t" type="CT_Text"/>
        <xsd:element name="contentPart" type="CT_Rel"/>
        <xsd:element name="delText" type="CT_Text"/>
        <xsd:element name="instrText" type="CT_Text"/>
        <xsd:element name="delInstrText" type="CT_Text"/>
        <xsd:element name="noBreakHyphen" type="CT_Empty"/>
        <xsd:element name="softHyphen" type="CT_Empty"/>
        <xsd:element name="dayShort" type="CT_Empty"/>
        <xsd:element name="monthShort" type="CT_Empty"/>
        <xsd:element name="yearShort" type="CT_Empty"/>
        <xsd:element name="dayLong" type="CT_Empty"/>
        <xsd:element name="monthLong" type="CT_Empty"/>
        <xsd:element name="yearLong" type="CT_Empty"/>
        <xsd:element name="annotationRef" type="CT_Empty"/>
        <xsd:element name="footnoteRef" type="CT_Empty"/>
        <xsd:element name="endnoteRef" type="CT_Empty"/>
        <xsd:element name="separator" type="CT_Empty"/>
        <xsd:element name="continuationSeparator" type="CT_Empty"/>
        <xsd:element name="sym" type="CT_Sym"/>
        <xsd:element name="pgNum" type="CT_Empty"/>
        <xsd:element name="cr" type="CT_Empty"/>
        <xsd:element name="tab" type="CT_Empty"/>
        <xsd:element name="object" type="CT_Object"/>
        <xsd:element name="pict" type="CT_Picture"/>
        <xsd:element name="fldChar" type="CT_FldChar"/>
        <xsd:element name="ruby" type="CT_Ruby"/>
        <xsd:element name="footnoteReference" type="CT_FtnEdnRef"/>
        <xsd:element name="endnoteReference" type="CT_FtnEdnRef"/>
        <xsd:element name="commentReference" type="CT_Markup"/>
        <xsd:element name="drawing" type="CT_Drawing"/>
        <xsd:element name="ptab" type="CT_PTab"/>
        <xsd:element name="lastRenderedPageBreak" type="CT_Empty"/>
      </xsd:choice>
    </xsd:group>

    <xsd:complexType name="CT_Br">
      <xsd:attribute name="type" type="ST_BrType"/>
      <xsd:attribute name="clear" type="ST_BrClear"/>
    </xsd:complexType>

    <xsd:simpleType name="ST_BrType">
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="page"/>
        <xsd:enumeration value="column"/>
        <xsd:enumeration value="textWrapping"/>
      </xsd:restriction>
    </xsd:simpleType>

    <xsd:simpleType name="ST_BrClear">
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="none"/>
        <xsd:enumeration value="left"/>
        <xsd:enumeration value="right"/>
        <xsd:enumeration value="all"/>
      </xsd:restriction>
    </xsd:simpleType>
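
The two simple types above are small enough to check by hand. A minimal
sketch of such a check follows; ``check_br_attributes()`` is a hypothetical
helper that takes attribute names without the ``w:`` prefix, and it is not a
substitute for real schema validation:

```python
# Allowed values copied from the ST_BrType and ST_BrClear enumerations.
ST_BR_TYPE = ("page", "column", "textWrapping")
ST_BR_CLEAR = ("none", "left", "right", "all")

# CT_Br declares exactly these two attributes; neither is marked required.
ALLOWED = {"type": ST_BR_TYPE, "clear": ST_BR_CLEAR}


def check_br_attributes(attrs):
    """Return a list of problems found in a dict of <w:br> attributes."""
    problems = []
    for name, value in attrs.items():
        if name not in ALLOWED:
            problems.append("unexpected attribute %r" % name)
        elif value not in ALLOWED[name]:
            problems.append("%r is not a valid value for %r" % (value, name))
    return problems


print(check_br_attributes({}))                    # plain line break
print(check_br_attributes({"type": "page"}))
print(check_br_attributes({"type": "line"}))      # not in ST_BrType
```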


Resources
---------

* `WdBreakType Enumeration on MSDN`_
* `Range.InsertBreak Method (Word) on MSDN`_

.. _WdBreakType Enumeration on MSDN:
   http://msdn.microsoft.com/en-us/library/office/ff195905.aspx

.. _Range.InsertBreak Method (Word) on MSDN:
   http://msdn.microsoft.com/en-us/library/office/ff835132.aspx


Relevant sections in the ISO Spec
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* 17.18.3 ST_BrClear (Line Break Text Wrapping Restart Location)
