Commit fd9ebd4

Clarify concatenation behaviour of immutable strings, and remove explicit
mention of the CPython optimization hack.
1 parent 5a53f36 commit fd9ebd4

2 files changed

Lines changed: 38 additions & 9 deletions

Doc/faq/programming.rst

Lines changed: 26 additions & 0 deletions
@@ -989,6 +989,32 @@ What does 'UnicodeDecodeError' or 'UnicodeEncodeError' error mean?
 See the :ref:`unicode-howto`.


+What is the most efficient way to concatenate many strings together?
+--------------------------------------------------------------------
+
+:class:`str` and :class:`bytes` objects are immutable, therefore concatenating
+many strings together is inefficient as each concatenation creates a new
+object. In the general case, the total runtime cost is quadratic in the
+total string length.
+
+To accumulate many :class:`str` objects, the recommended idiom is to place
+them into a list and call :meth:`str.join` at the end::
+
+   chunks = []
+   for s in my_strings:
+       chunks.append(s)
+   result = ''.join(chunks)
+
+(another reasonably efficient idiom is to use :class:`io.StringIO`)
+
+To accumulate many :class:`bytes` objects, the recommended idiom is to extend
+a :class:`bytearray` object using in-place concatenation (the ``+=`` operator)::
+
+   result = bytearray()
+   for b in my_bytes_objects:
+       result += b
+
+
 Sequences (Tuples/Lists)
 ========================

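Reader's note: the FAQ entry added in this file names three accumulation idioms but only spells out two of them. The sketch below puts the list-plus-``str.join`` idiom, the ``io.StringIO`` alternative, and the ``bytearray`` idiom side by side. The helper names (``join_with_list``, ``join_with_stringio``, ``join_bytes``) and the sample inputs are hypothetical and are not part of the commit.

```python
import io


def join_with_list(strings):
    """Collect the pieces in a list and join them once at the end."""
    chunks = []
    for s in strings:
        chunks.append(s)
    return ''.join(chunks)


def join_with_stringio(strings):
    """Write the pieces into an in-memory text buffer instead."""
    buf = io.StringIO()
    for s in strings:
        buf.write(s)
    return buf.getvalue()


def join_bytes(byte_chunks):
    """Extend a mutable bytearray in place, then convert to bytes once."""
    result = bytearray()
    for b in byte_chunks:
        result += b
    return bytes(result)


# Hypothetical sample inputs, standing in for my_strings / my_bytes_objects.
my_strings = ['spam', ' ', 'eggs']
my_bytes_objects = [b'spam', b' ', b'eggs']

text_a = join_with_list(my_strings)
text_b = join_with_stringio(my_strings)
data = join_bytes(my_bytes_objects)
```

Both text helpers should return the same string. The final ``bytes(result)`` conversion is optional and is only there for callers that need an immutable result.
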
Doc/library/stdtypes.rst

Lines changed: 12 additions & 9 deletions
@@ -964,15 +964,18 @@ Notes:
    If *k* is ``None``, it is treated like ``1``.

 (6)
-   .. impl-detail::
-
-      If *s* and *t* are both strings, some Python implementations such as
-      CPython can usually perform an in-place optimization for assignments of
-      the form ``s = s + t`` or ``s += t``. When applicable, this optimization
-      makes quadratic run-time much less likely. This optimization is both
-      version and implementation dependent. For performance sensitive code, it
-      is preferable to use the :meth:`str.join` method which assures consistent
-      linear concatenation performance across versions and implementations.
+   Concatenating immutable strings always results in a new object. This means
+   that building up a string by repeated concatenation will have a quadratic
+   runtime cost in the total string length. To get a linear runtime cost,
+   you must switch to one of the alternatives below:
+
+   * if concatenating :class:`str` objects, you can build a list and use
+     :meth:`str.join` at the end;
+
+   * if concatenating :class:`bytes` objects, you can similarly use
+     :meth:`bytes.join`, or you can do in-place concatenation with a
+     :class:`bytearray` object. :class:`bytearray` objects are mutable and
+     have an efficient overallocation mechanism.


 .. _string-methods:

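Reader's note: the rewritten note (6) rests on the difference between immutable ``str``/``bytes`` objects and a mutable ``bytearray``. A minimal sketch of that difference, using a second reference to observe whether ``+=`` changed the original object, might look like the following. All variable names and sample values are illustrative and do not come from the commit.

```python
# str is immutable: += builds a new string and rebinds the name.
s = 'abc'
alias = s            # second reference to the original string
s += 'def'
# 'alias' still refers to the original, unchanged 'abc'.

# bytearray is mutable: += extends the existing object in place.
buf = bytearray(b'abc')
buf_alias = buf      # second reference to the same bytearray
buf += b'def'
# 'buf_alias' sees the appended data because it is the same object.

# bytes.join mirrors str.join for a sequence of bytes objects.
parts = [b'abc', b'def', b'ghi']
joined = b''.join(parts)
```

The alias is only there to make the rebinding visible. It plays no part in the recommended idioms themselves.
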