From 27239f276a22f25de0d8a3a8ece6a47f4d4f2828 Mon Sep 17 00:00:00 2001
From: Ying Xu
Date: Fri, 22 May 2026 16:46:52 +0800
Subject: [PATCH] Update `untokenize.rst` documentation to reflect current implementation

---
 Doc/library/tokenize.rst | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/Doc/library/tokenize.rst b/Doc/library/tokenize.rst
index 2eea51734fde03c..77b61e6976d6c62 100644
--- a/Doc/library/tokenize.rst
+++ b/Doc/library/tokenize.rst
@@ -85,17 +85,23 @@ write back the modified script.
 .. function:: untokenize(iterable)
 
    Converts tokens back into Python source code. The *iterable* must return
-   sequences with at least two elements, the token type and the token string.
-   Any additional sequence elements are ignored.
+   sequences with either two or five elements.
 
    The result is guaranteed to tokenize back to match the input so that the
-   conversion is lossless and round-trips are assured. The guarantee applies
-   only to the token type and token string as the spacing between tokens
-   (column positions) may change.
+   conversion is lossless and round-trips are assured.
+
+   If *iterable* returns sequences with two elements (the token type and token
+   string), the result will tokenize back to the same token types and strings as
+   the input, but the spacing between tokens (column positions) may change.
+
+   If *iterable* returns sequences with five elements
+   (``type token string start end line``), the column positions are preserved
+   and the result will tokenize back to match the input exactly.
+
 
    It returns bytes, encoded using the :data:`~token.ENCODING` token, which
    is the first token sequence output by :func:`.tokenize`. If there is no
-   encoding token in the input, it returns a str instead.
+   encoding token in the input, it returns a :class:`str` instead.
 
 
 :func:`.tokenize` needs to detect the encoding of source files it tokenizes. The