Commit f5e9e3d

Add revision management capability + tests + docs
1 parent e454546 commit f5e9e3d

19 files changed

Lines changed: 1516 additions & 27 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -8,3 +8,4 @@
 _scratch/
 Session.vim
 /.tox/
+.venv/

docs/api/revisions.rst

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+.. _revisions_api:
+
+Revision-related objects
+========================
+
+.. currentmodule:: docx.revision
+
+
+|TrackedChange| objects
+-----------------------
+
+.. autoclass:: TrackedChange()
+    :members:
+    :inherited-members:
+
+
+|TrackedInsertion| objects
+--------------------------
+
+.. autoclass:: TrackedInsertion()
+    :members:
+    :inherited-members:
+
+
+|TrackedDeletion| objects
+-------------------------
+
+.. autoclass:: TrackedDeletion()
+    :members:
+    :inherited-members:

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -390,4 +390,4 @@
 
 
 # Example configuration for intersphinx: refer to the Python standard library.
-intersphinx_mapping = {"http://docs.python.org/3/": None}
+intersphinx_mapping = {"python": ("https://docs.python.org/3/", None)}

docs/index.rst

Lines changed: 2 additions & 0 deletions
@@ -82,6 +82,7 @@ User Guide
    user/styles-understanding
    user/styles-using
    user/comments
+   user/revisions
    user/shapes
 
 
@@ -98,6 +99,7 @@ API Documentation
    api/table
    api/section
    api/comments
+   api/revisions
    api/shape
    api/dml
    api/shared

docs/user/revisions.rst

Lines changed: 215 additions & 0 deletions
@@ -0,0 +1,215 @@
+.. _revisions:
+
+Working with Tracked Changes (Revisions)
+========================================
+
+Word allows *track changes* (also known as *revisions*) to be enabled on a document.
+This feature records insertions and deletions made to the document, showing who made
+each change and when. This is commonly used for collaborative editing and review
+workflows.
+
+When track changes is enabled:
+
+- Inserted text is marked with the ``<w:ins>`` element
+- Deleted text is marked with the ``<w:del>`` element
+- Each revision records the author, date, and a unique revision ID
+
+.. note::
+
+   *python-docx* supports creating and reading tracked changes, as well as accepting
+   or rejecting individual revisions programmatically.
+
+
+Enabling Track Changes
+----------------------
+
+Track changes mode is controlled via the document settings::
+
+    >>> from docx import Document
+    >>> document = Document()
+    >>> document.settings.track_revisions = True
+    >>> document.settings.track_revisions
+    True
+
+When ``track_revisions`` is ``True``, Word will track any subsequent changes made in
+the Word application. Changes made programmatically via *python-docx* must be
+explicitly marked as tracked using the methods described below.
+
+
+Find and Replace with Track Changes
+-----------------------------------
+
+The most common use case is performing a find-and-replace operation where the changes
+are tracked. The :meth:`.Document.find_and_replace_tracked` method handles this::
+
+    >>> document = Document("contract.docx")
+    >>> document.settings.track_revisions = True
+    >>> count = document.find_and_replace_tracked(
+    ...     search_text="Acme Corp",
+    ...     replace_text="NewCo Inc",
+    ...     author="Legal Team",
+    ...     comment="Company name updated per merger agreement",
+    ... )
+    >>> print(f"Replaced {count} occurrences")
+    Replaced 15 occurrences
+    >>> document.save("contract_revised.docx")
+
+This method:
+
+- Searches all paragraphs in the document body and tables
+- Replaces only the specific text (word-level), preserving surrounding formatting
+- Creates tracked deletions for the old text and tracked insertions for the new text
+- Optionally attaches a comment to each replacement explaining the change
+
+For more control, you can use :meth:`.Paragraph.replace_tracked` on individual
+paragraphs::
+
+    >>> for paragraph in document.paragraphs:
+    ...     if "confidential" in paragraph.text.lower():
+    ...         paragraph.replace_tracked("draft", "final", author="Editor")
+
+
+Adding Tracked Insertions
+-------------------------
+
+To add new text as a tracked insertion::
+
+    >>> paragraph = document.add_paragraph("This is existing text. ")
+    >>> tracked = paragraph.add_run_tracked(
+    ...     text="This was added later.",
+    ...     author="John Smith",
+    ... )
+    >>> tracked
+    <docx.revision.TrackedInsertion object at 0x...>
+    >>> tracked.author
+    'John Smith'
+    >>> tracked.text
+    'This was added later.'
+
+The ``add_run_tracked`` method wraps the new text in a ``<w:ins>`` element, marking
+it as inserted content that will appear in Word's track changes view.
+
+
+Creating Tracked Deletions
+--------------------------
+
+To mark existing text as deleted (without actually removing it)::
+
+    >>> paragraph = document.add_paragraph("Delete this text please.")
+    >>> run = paragraph.runs[0]
+    >>> tracked = run.delete_tracked(author="Editor")
+    >>> tracked
+    <docx.revision.TrackedDeletion object at 0x...>
+    >>> tracked.text
+    'Delete this text please.'
+
+The text remains in the document but is wrapped in a ``<w:del>`` element. In Word,
+this text appears with strikethrough formatting.
+
+
+Iterating Over Revisions
+------------------------
+
+To access tracked changes in a paragraph, use ``iter_inner_content`` with
+``include_revisions=True``::
+
+    >>> from docx.revision import TrackedInsertion, TrackedDeletion
+    >>> for item in paragraph.iter_inner_content(include_revisions=True):
+    ...     if isinstance(item, TrackedInsertion):
+    ...         print(f"INSERTED by {item.author}: {item.text}")
+    ...     elif isinstance(item, TrackedDeletion):
+    ...         print(f"DELETED by {item.author}: {item.text}")
+    ...     else:
+    ...         print(f"Normal text: {item.text}")
+
+
+Accepting and Rejecting Changes
+-------------------------------
+
+Individual revisions can be accepted or rejected programmatically::
+
+    >>> # Accept an insertion (keeps the inserted text)
+    >>> tracked_insertion.accept()
+
+    >>> # Reject an insertion (removes the inserted text)
+    >>> tracked_insertion.reject()
+
+    >>> # Accept a deletion (removes the deleted text)
+    >>> tracked_deletion.accept()
+
+    >>> # Reject a deletion (restores the deleted text)
+    >>> tracked_deletion.reject()
+
+
+TrackedInsertion and TrackedDeletion Properties
+-----------------------------------------------
+
+Both ``TrackedInsertion`` and ``TrackedDeletion`` objects provide these properties:
+
+``author``
+    The name of the author who made the change (read/write).
+
+``date``
+    The date and time of the change as a ``datetime`` object (read-only).
+
+``revision_id``
+    The unique identifier for this revision (read/write).
+
+``text``
+    The text content of the revision (read-only).
+
+``runs``
+    A list of ``Run`` objects contained in the revision.
+
+``is_run_level``
+    ``True`` if the revision contains runs (inline content).
+
+``is_block_level``
+    ``True`` if the revision contains paragraphs or tables (block content).
+
+
+Example: Processing a Document with Track Changes
+-------------------------------------------------
+
+Here's a complete example that processes an existing document with track changes::
+
+    >>> from docx import Document
+    >>> from docx.revision import TrackedInsertion, TrackedDeletion
+
+    >>> document = Document("reviewed_document.docx")
+
+    >>> # Count revisions
+    >>> insertions = 0
+    >>> deletions = 0
+
+    >>> for paragraph in document.paragraphs:
+    ...     for item in paragraph.iter_inner_content(include_revisions=True):
+    ...         if isinstance(item, TrackedInsertion):
+    ...             insertions += 1
+    ...             print(f"[+] {item.author}: {item.text[:50]}...")
+    ...         elif isinstance(item, TrackedDeletion):
+    ...             deletions += 1
+    ...             print(f"[-] {item.author}: {item.text[:50]}...")
+
+    >>> print(f"\nTotal: {insertions} insertions, {deletions} deletions")
+
+
+Example: Bulk Accept All Changes
+--------------------------------
+
+To accept all tracked changes in a document::
+
+    >>> from docx import Document
+    >>> from docx.revision import TrackedInsertion, TrackedDeletion
+
+    >>> document = Document("reviewed_document.docx")
+
+    >>> for paragraph in document.paragraphs:
+    ...     for item in list(paragraph.iter_inner_content(include_revisions=True)):
+    ...         if isinstance(item, (TrackedInsertion, TrackedDeletion)):
+    ...             item.accept()
+
+    >>> document.save("accepted_document.docx")
+
+Note the use of ``list()`` to materialize the iterator before modifying the document,
+as accepting/rejecting changes modifies the underlying XML.
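
Note on the markup described in the guide above: the following is a minimal sketch of the XML a tracked replacement is expected to contain, going only by the guide's statement that deletions use ``<w:del>``, insertions use ``<w:ins>``, and each revision carries an author, a date, and an id. It uses the standard library rather than python-docx. The helper names (`w`, `tracked_replacement`), the fixed ids, and the example date are illustrative assumptions and are not part of this commit.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# WordprocessingML main namespace, which the `w:` prefix refers to.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag: str) -> str:
    """Return the Clark-notation name for a `w:`-prefixed tag or attribute."""
    return f"{{{W_NS}}}{tag}"


def tracked_replacement(old: str, new: str, author: str, when: datetime) -> ET.Element:
    """Build a paragraph holding one tracked deletion followed by one tracked insertion."""
    p = ET.Element(w("p"))
    stamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")
    # Deleted text sits in w:delText inside a run wrapped by w:del;
    # inserted text sits in an ordinary w:t inside a run wrapped by w:ins.
    parts = [("del", "delText", old), ("ins", "t", new)]
    for rev_id, (wrapper, text_tag, text) in enumerate(parts, start=1):
        rev = ET.SubElement(p, w(wrapper))
        rev.set(w("id"), str(rev_id))  # hypothetical ids, chosen only for this sketch
        rev.set(w("author"), author)
        rev.set(w("date"), stamp)
        run = ET.SubElement(rev, w("r"))
        ET.SubElement(run, w(text_tag)).text = text
    return p


paragraph = tracked_replacement(
    "Acme Corp", "NewCo Inc", "Legal Team", datetime(2024, 1, 1, tzinfo=timezone.utc)
)
# Summarise each revision as (kind, author, text), in document order.
summary = [
    (child.tag.split("}")[1], child.get(w("author")), "".join(child.itertext()))
    for child in paragraph
]
print(summary)
```

Reading the paragraph's children in order gives the deletion first and the insertion second, which is the ordering the find-and-replace description in the guide implies.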

pyproject.toml

Lines changed: 3 additions & 1 deletion
@@ -75,6 +75,9 @@ filterwarnings = [
     # -- pytest-xdist plugin may warn about `looponfailroots` deprecation --
     "ignore::DeprecationWarning:xdist",
 
+    # -- pyparsing 3.x deprecated many method names --
+    "ignore::DeprecationWarning:pyparsing",
+
     # -- pytest complains when pytest-xdist is not installed --
     "ignore:Unknown config option. looponfailroots:pytest.PytestConfigWarning",
 ]
@@ -124,4 +127,3 @@ known-local-folder = ["helpers"]
 
 [tool.setuptools.dynamic]
 version = {attr = "docx.__version__"}
-

src/docx/blkcntnr.py

Lines changed: 35 additions & 4 deletions
@@ -12,6 +12,7 @@
 
 from typing_extensions import TypeAlias
 
+from docx.oxml.ns import qn
 from docx.oxml.table import CT_Tbl
 from docx.oxml.text.paragraph import CT_P
 from docx.shared import StoryChild
@@ -23,6 +24,7 @@
     from docx.oxml.document import CT_Body
     from docx.oxml.section import CT_HdrFtr
     from docx.oxml.table import CT_Tc
+    from docx.revision import TrackedDeletion, TrackedInsertion
     from docx.shared import Length
     from docx.styles.style import ParagraphStyle
     from docx.table import Table
@@ -71,12 +73,41 @@ def add_table(self, rows: int, cols: int, width: Length) -> Table:
         self._element._insert_tbl(tbl)  # pyright: ignore[reportPrivateUsage]
         return Table(tbl, self)
 
-    def iter_inner_content(self) -> Iterator[Paragraph | Table]:
-        """Generate each `Paragraph` or `Table` in this container in document order."""
+    def iter_inner_content(
+        self, include_revisions: bool = False
+    ) -> Iterator[Paragraph | Table | TrackedInsertion | TrackedDeletion]:
+        """Generate each `Paragraph` or `Table` in this container in document order.
+
+        Args:
+            include_revisions: If True, also yields `TrackedInsertion` and
+                `TrackedDeletion` objects for block-level tracked changes
+                (`w:ins` and `w:del` elements that wrap paragraphs or tables).
+                Defaults to False for backward compatibility.
+
+        Yields:
+            Paragraph, Table, TrackedInsertion, or TrackedDeletion objects in
+            document order.
+        """
+        from docx.revision import TrackedDeletion, TrackedInsertion
         from docx.table import Table
 
-        for element in self._element.inner_content_elements:
-            yield (Paragraph(element, self) if isinstance(element, CT_P) else Table(element, self))
+        if include_revisions:
+            elements = getattr(self._element, "inner_content_with_revisions", None)
+            if elements is None:
+                elements = self._element.inner_content_elements
+        else:
+            elements = self._element.inner_content_elements
+
+        for element in elements:
+            tag = element.tag  # pyright: ignore[reportUnknownMemberType]
+            if tag == qn("w:p"):
+                yield Paragraph(element, self)
+            elif tag == qn("w:tbl"):
+                yield Table(element, self)
+            elif tag == qn("w:ins"):
+                yield TrackedInsertion(element, self)  # pyright: ignore[reportArgumentType]
+            elif tag == qn("w:del"):
+                yield TrackedDeletion(element, self)  # pyright: ignore[reportArgumentType]
 
     @property
     def paragraphs(self):
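
Note on the dispatch in `iter_inner_content` above: the method now chooses a proxy class by comparing each element's tag against `qn(...)` names instead of using `isinstance`. The sketch below reproduces that pattern with the standard library only, so it can be run without python-docx. Here `qn` is a simplified stand-in for `docx.oxml.ns.qn` limited to the `w:` prefix, `iter_block_kinds` is a hypothetical name, and the assumption that the default path yields only `w:p` and `w:tbl` children follows the docstring rather than code shown in this diff.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def qn(tag: str) -> str:
    """Simplified stand-in for `docx.oxml.ns.qn`; handles the `w:` prefix only."""
    prefix, local = tag.split(":")
    if prefix != "w":
        raise ValueError(f"unsupported prefix: {prefix}")
    return f"{{{W_NS}}}{local}"


# Qualified tag -> name of the proxy class the method would yield for it.
PROXY_FOR_TAG = {
    qn("w:p"): "Paragraph",
    qn("w:tbl"): "Table",
    qn("w:ins"): "TrackedInsertion",
    qn("w:del"): "TrackedDeletion",
}
REVISION_TAGS = {qn("w:ins"), qn("w:del")}


def iter_block_kinds(body, include_revisions=False):
    """Yield proxy-class names for the direct children of `body` in document order."""
    for child in body:
        if child.tag in REVISION_TAGS and not include_revisions:
            continue  # default path: block-level revisions are skipped
        kind = PROXY_FOR_TAG.get(child.tag)
        if kind is not None:  # unrelated children such as w:sectPr are ignored
            yield kind


body = ET.Element(qn("w:body"))
ET.SubElement(body, qn("w:p"))
inserted = ET.SubElement(body, qn("w:ins"))
ET.SubElement(inserted, qn("w:p"))  # paragraph wrapped by a block-level insertion
ET.SubElement(body, qn("w:tbl"))
ET.SubElement(body, qn("w:sectPr"))

print(list(iter_block_kinds(body)))
print(list(iter_block_kinds(body, include_revisions=True)))
```

With the flag off, the wrapped paragraph is not reported at all; with it on, the wrapper is reported as a single revision item rather than as the paragraph it contains.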

src/docx/document.py

Lines changed: 40 additions & 0 deletions
@@ -229,6 +229,46 @@ def tables(self) -> List[Table]:
         """
         return self._body.tables
 
+    def find_and_replace_tracked(
+        self,
+        search_text: str,
+        replace_text: str,
+        author: str = "",
+        comment: str | None = None,
+    ) -> int:
+        """Find and replace all occurrences of `search_text` with `replace_text` using track changes.
+
+        This method searches all paragraphs in the document (including those in tables)
+        and replaces text at the word level, creating tracked deletions and insertions.
+        If `comment` is provided, a comment is attached to each replacement explaining
+        the change.
+
+        Args:
+            search_text: Text to find and replace.
+            replace_text: Text to insert in place of search_text.
+            author: Author name for the revision. Defaults to empty string.
+            comment: Optional comment text to attach to each replacement.
+
+        Returns:
+            The total number of replacements made across the document.
+        """
+        total_count = 0
+
+        for para in self.paragraphs:
+            total_count += para.replace_tracked(
+                search_text, replace_text, author=author, comment=comment
+            )
+
+        for table in self.tables:
+            for row in table.rows:
+                for cell in row.cells:
+                    for para in cell.paragraphs:
+                        total_count += para.replace_tracked(
+                            search_text, replace_text, author=author, comment=comment
+                        )
+
+        return total_count
+
     @property
     def _block_width(self) -> Length:
         """A |Length| object specifying the space between margins in last section."""
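
Note on the table loop in `find_and_replace_tracked` above: in python-docx, `row.cells` can report the same cell more than once when it spans several grid columns, so the nested loop may visit a merged cell's paragraphs repeatedly. Whether that matters depends on `replace_tracked`, which is not shown in this part of the diff. The sketch below illustrates one way to visit each underlying cell once. `FakeParagraph`, `FakeCell`, and `replace_in_cells` are hypothetical stand-ins written for this note, and the counting-only `replace_tracked` is an assumption, not the commit's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class FakeParagraph:
    """Hypothetical stand-in for `Paragraph`; it only counts matches."""

    text: str

    def replace_tracked(self, search_text, replace_text, author="", comment=None):
        # A real implementation would rewrite runs; counting is enough here.
        return self.text.count(search_text)


@dataclass
class FakeCell:
    """Hypothetical stand-in for `_Cell`; `_tc` mimics the underlying `w:tc` element."""

    _tc: object
    paragraphs: list = field(default_factory=list)


def replace_in_cells(cells, search_text, replace_text, author=""):
    """Run `replace_tracked` once per distinct underlying cell element."""
    seen = set()
    total = 0
    for cell in cells:
        key = id(cell._tc)
        if key in seen:
            continue  # same merged cell reported again for another grid column
        seen.add(key)
        for para in cell.paragraphs:
            total += para.replace_tracked(search_text, replace_text, author=author)
    return total


merged_tc = object()
shared_paragraphs = [FakeParagraph("Acme Corp shall pay Acme Corp.")]
# A cell spanning two grid columns shows up twice, as two proxies over one element.
row_cells = [
    FakeCell(merged_tc, shared_paragraphs),
    FakeCell(merged_tc, shared_paragraphs),
    FakeCell(object(), [FakeParagraph("Acme Corp")]),
]

naive_total = sum(
    para.replace_tracked("Acme Corp", "NewCo Inc")
    for cell in row_cells
    for para in cell.paragraphs
)
deduped_total = replace_in_cells(row_cells, "Acme Corp", "NewCo Inc", author="Legal Team")
print(naive_total, deduped_total)
```

The loop in the commit also reaches only top-level tables, so text in nested tables would need a recursive walk along the same lines.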
