Commit 6272490

benjamin.peterson committed
update the struct documentation to refer to bytes
patch from Matt Giuca #3478

git-svn-id: http://svn.python.org/projects/python/branches/py3k@65327 6015fed2-1504-0410-9fe1-9d1591cc4771
1 parent 5171f25 commit 6272490

3 files changed

Lines changed: 48 additions & 42 deletions

Doc/ACKS.txt

Lines changed: 1 addition & 0 deletions
@@ -62,6 +62,7 @@ docs@python.org), and we'll be glad to correct the problem.
6262
* Ben Gertzfield
6363
* Nadim Ghaznavi
6464
* Jonathan Giddy
65+
* Matt Giuca
6566
* Shelley Gooch
6667
* Nathaniel Gray
6768
* Grant Griffin

Doc/library/struct.rst

Lines changed: 37 additions & 32 deletions
@@ -1,19 +1,19 @@
11

2-
:mod:`struct` --- Interpret strings as packed binary data
2+
:mod:`struct` --- Interpret bytes as packed binary data
33
=========================================================
44

55
.. module:: struct
6-
:synopsis: Interpret strings as packed binary data.
6+
:synopsis: Interpret bytes as packed binary data.
77

88
.. index::
99
pair: C; structures
1010
triple: packing; binary; data
1111

1212
This module performs conversions between Python values and C structs represented
13-
as Python strings. It uses :dfn:`format strings` (explained below) as compact
14-
descriptions of the lay-out of the C structs and the intended conversion to/from
15-
Python values. This can be used in handling binary data stored in files or from
16-
network connections, among other sources.
13+
as Python :class:`bytes` objects. It uses :dfn:`format strings` (explained
14+
below) as compact descriptions of the lay-out of the C structs and the
15+
intended conversion to/from Python values. This can be used in handling
16+
binary data stored in files or from network connections, among other sources.
1717

1818
The module defines the following exception and functions:
1919

@@ -26,7 +26,7 @@ The module defines the following exception and functions:
2626

2727
.. function:: pack(fmt, v1, v2, ...)
2828

29-
Return a string containing the values ``v1, v2, ...`` packed according to the
29+
Return a bytes containing the values ``v1, v2, ...`` packed according to the
3030
given format. The arguments must match the values required by the format
3131
exactly.
3232

@@ -38,12 +38,12 @@ The module defines the following exception and functions:
3838
a required argument.
3939

4040

41-
.. function:: unpack(fmt, string)
41+
.. function:: unpack(fmt, bytes)
4242

43-
Unpack the string (presumably packed by ``pack(fmt, ...)``) according to the
43+
Unpack the bytes (presumably packed by ``pack(fmt, ...)``) according to the
4444
given format. The result is a tuple even if it contains exactly one item. The
45-
string must contain exactly the amount of data required by the format
46-
(``len(string)`` must equal ``calcsize(fmt)``).
45+
bytes must contain exactly the amount of data required by the format
46+
(``len(bytes)`` must equal ``calcsize(fmt)``).
4747

4848

4949
.. function:: unpack_from(fmt, buffer[,offset=0])
@@ -56,7 +56,7 @@ The module defines the following exception and functions:
5656

5757
.. function:: calcsize(fmt)
5858

59-
Return the size of the struct (and hence of the string) corresponding to the
59+
Return the size of the struct (and hence of the bytes) corresponding to the
6060
given format.
6161

6262
Format characters have the following meaning; the conversion between C and
@@ -67,13 +67,13 @@ Python values should be obvious given their types:
6767
+========+=========================+====================+=======+
6868
| ``x`` | pad byte | no value | |
6969
+--------+-------------------------+--------------------+-------+
70-
| ``c`` | :ctype:`char` | string of length 1 | |
70+
| ``c`` | :ctype:`char` | bytes of length 1 | |
7171
+--------+-------------------------+--------------------+-------+
72-
| ``b`` | :ctype:`signed char` | integer | |
72+
| ``b`` | :ctype:`signed char` | integer | \(1) |
7373
+--------+-------------------------+--------------------+-------+
7474
| ``B`` | :ctype:`unsigned char` | integer | |
7575
+--------+-------------------------+--------------------+-------+
76-
| ``?`` | :ctype:`_Bool` | bool | \(1) |
76+
| ``?`` | :ctype:`_Bool` | bool | \(2) |
7777
+--------+-------------------------+--------------------+-------+
7878
| ``h`` | :ctype:`short` | integer | |
7979
+--------+-------------------------+--------------------+-------+
@@ -87,30 +87,35 @@ Python values should be obvious given their types:
8787
+--------+-------------------------+--------------------+-------+
8888
| ``L`` | :ctype:`unsigned long` | integer | |
8989
+--------+-------------------------+--------------------+-------+
90-
| ``q`` | :ctype:`long long` | integer | \(2) |
90+
| ``q`` | :ctype:`long long` | integer | \(3) |
9191
+--------+-------------------------+--------------------+-------+
92-
| ``Q`` | :ctype:`unsigned long | integer | \(2) |
92+
| ``Q`` | :ctype:`unsigned long | integer | \(3) |
9393
| | long` | | |
9494
+--------+-------------------------+--------------------+-------+
9595
| ``f`` | :ctype:`float` | float | |
9696
+--------+-------------------------+--------------------+-------+
9797
| ``d`` | :ctype:`double` | float | |
9898
+--------+-------------------------+--------------------+-------+
99-
| ``s`` | :ctype:`char[]` | string | |
99+
| ``s`` | :ctype:`char[]` | bytes | \(1) |
100100
+--------+-------------------------+--------------------+-------+
101-
| ``p`` | :ctype:`char[]` | string | |
101+
| ``p`` | :ctype:`char[]` | bytes | \(1) |
102102
+--------+-------------------------+--------------------+-------+
103103
| ``P`` | :ctype:`void \*` | integer | |
104104
+--------+-------------------------+--------------------+-------+
105105

106106
Notes:
107107

108108
(1)
109+
The ``c``, ``s`` and ``p`` conversion codes operate on :class:`bytes`
110+
objects, but packing with such codes also supports :class:`str` objects,
111+
which are encoded using UTF-8.
112+
113+
(2)
109114
The ``'?'`` conversion code corresponds to the :ctype:`_Bool` type defined by
110115
C99. If this type is not available, it is simulated using a :ctype:`char`. In
111116
standard mode, it is always represented by one byte.
112117

113-
(2)
118+
(3)
114119
The ``'q'`` and ``'Q'`` conversion codes are available in native mode only if
115120
the platform C compiler supports C :ctype:`long long`, or, on Windows,
116121
:ctype:`__int64`. They are always available in standard modes.
@@ -121,11 +126,11 @@ the format string ``'4h'`` means exactly the same as ``'hhhh'``.
121126
Whitespace characters between formats are ignored; a count and its format must
122127
not contain whitespace though.
123128

124-
For the ``'s'`` format character, the count is interpreted as the size of the
125-
string, not a repeat count like for the other format characters; for example,
129+
For the ``'s'`` format character, the count is interpreted as the length of the
130+
bytes, not a repeat count like for the other format characters; for example,
126131
``'10s'`` means a single 10-byte string, while ``'10c'`` means 10 characters.
127132
For packing, the string is truncated or padded with null bytes as appropriate to
128-
make it fit. For unpacking, the resulting string always has exactly the
133+
make it fit. For unpacking, the resulting bytes object always has exactly the
129134
specified number of bytes. As a special case, ``'0s'`` means a single, empty
130135
string (while ``'0c'`` means 0 characters).
131136

@@ -137,7 +142,7 @@ passed in to :func:`pack` is too long (longer than the count minus 1), only the
137142
leading count-1 bytes of the string are stored. If the string is shorter than
138143
count-1, it is padded with null bytes so that exactly count bytes in all are
139144
used. Note that for :func:`unpack`, the ``'p'`` format character consumes count
140-
bytes, but that the string returned can never contain more than 255 characters.
145+
bytes, but that the string returned can never contain more than 255 bytes.
141146

142147

143148

@@ -203,8 +208,8 @@ machine)::
203208

204209
>>> from struct import *
205210
>>> pack('hhl', 1, 2, 3)
206-
'\x00\x01\x00\x02\x00\x00\x00\x03'
207-
>>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03')
211+
b'\x00\x01\x00\x02\x00\x00\x00\x03'
212+
>>> unpack('hhl', b'\x00\x01\x00\x02\x00\x00\x00\x03')
208213
(1, 2, 3)
209214
>>> calcsize('hhl')
210215
8
@@ -219,13 +224,13 @@ enforce any alignment.
219224
Unpacked fields can be named by assigning them to variables or by wrapping
220225
the result in a named tuple::
221226

222-
>>> record = 'raymond   \x32\x12\x08\x01\x08'
227+
>>> record = b'raymond   \x32\x12\x08\x01\x08'
223228
>>> name, serialnum, school, gradelevel = unpack('<10sHHb', record)
224229

225230
>>> from collections import namedtuple
226231
>>> Student = namedtuple('Student', 'name serialnum school gradelevel')
227-
>>> Student._make(unpack('<10sHHb', s))
228-
Student(name='raymond   ', serialnum=4658, school=264, gradelevel=8)
232+
>>> Student._make(unpack('<10sHHb', record))
233+
Student(name=b'raymond   ', serialnum=4658, school=264, gradelevel=8)
229234

230235
.. seealso::
231236

@@ -265,10 +270,10 @@ The :mod:`struct` module also defines the following type:
265270
Identical to the :func:`pack_into` function, using the compiled format.
266271

267272

268-
.. method:: unpack(string)
273+
.. method:: unpack(bytes)
269274

270275
Identical to the :func:`unpack` function, using the compiled format.
271-
(``len(string)`` must equal :attr:`self.size`).
276+
(``len(bytes)`` must equal :attr:`self.size`).
272277

273278

274279
.. method:: unpack_from(buffer[, offset=0])
@@ -283,6 +288,6 @@ The :mod:`struct` module also defines the following type:
283288

284289
.. attribute:: size
285290

286-
The calculated size of the struct (and hence of the string) corresponding
291+
The calculated size of the struct (and hence of the bytes) corresponding
287292
to :attr:`format`.
288293

Modules/_struct.c

Lines changed: 10 additions & 10 deletions
@@ -1,4 +1,4 @@
1-
/* struct module -- pack values into and (out of) strings */
1+
/* struct module -- pack values into and (out of) bytes objects */
22

33
/* New version supporting byte order, alignment and size options,
44
character strings, and unsigned numbers */
@@ -610,7 +610,7 @@ np_char(char *p, PyObject *v, const formatdef *f)
610610
}
611611
if (!PyBytes_Check(v) || PyBytes_Size(v) != 1) {
612612
PyErr_SetString(StructError,
613-
"char format requires string of length 1");
613+
"char format requires bytes or string of length 1");
614614
return -1;
615615
}
616616
*p = *PyBytes_AsString(v);
@@ -1654,7 +1654,7 @@ s_pack_internal(PyStructObject *soself, PyObject *args, int offset, char* buf)
16541654
isstring = PyBytes_Check(v);
16551655
if (!isstring && !PyByteArray_Check(v)) {
16561656
PyErr_SetString(StructError,
1657-
"argument for 's' must be a string");
1657+
"argument for 's' must be a bytes or string");
16581658
return -1;
16591659
}
16601660
if (isstring) {
@@ -1680,7 +1680,7 @@ s_pack_internal(PyStructObject *soself, PyObject *args, int offset, char* buf)
16801680
isstring = PyBytes_Check(v);
16811681
if (!isstring && !PyByteArray_Check(v)) {
16821682
PyErr_SetString(StructError,
1683-
"argument for 'p' must be a string");
1683+
"argument for 'p' must be a bytes or string");
16841684
return -1;
16851685
}
16861686
if (isstring) {
@@ -1714,9 +1714,9 @@ s_pack_internal(PyStructObject *soself, PyObject *args, int offset, char* buf)
17141714

17151715

17161716
PyDoc_STRVAR(s_pack__doc__,
1717-
"S.pack(v1, v2, ...) -> string\n\
1717+
"S.pack(v1, v2, ...) -> bytes\n\
17181718
\n\
1719-
Return a string containing values v1, v2, ... packed according to this\n\
1719+
Return a bytes containing values v1, v2, ... packed according to this\n\
17201720
Struct's format. See struct.__doc__ for more on format strings.");
17211721

17221722
static PyObject *
@@ -1944,7 +1944,7 @@ calcsize(PyObject *self, PyObject *fmt)
19441944
}
19451945

19461946
PyDoc_STRVAR(pack_doc,
1947-
"Return string containing values v1, v2, ... packed according to fmt.");
1947+
"Return bytes containing values v1, v2, ... packed according to fmt.");
19481948

19491949
static PyObject *
19501950
pack(PyObject *self, PyObject *args)
@@ -2003,8 +2003,8 @@ pack_into(PyObject *self, PyObject *args)
20032003
}
20042004

20052005
PyDoc_STRVAR(unpack_doc,
2006-
"Unpack the string containing packed C structure data, according to fmt.\n\
2007-
Requires len(string) == calcsize(fmt).");
2006+
"Unpack the bytes containing packed C structure data, according to fmt.\n\
2007+
Requires len(bytes) == calcsize(fmt).");
20082008

20092009
static PyObject *
20102010
unpack(PyObject *self, PyObject *args)
@@ -2068,7 +2068,7 @@ static struct PyMethodDef module_functions[] = {
20682068

20692069
PyDoc_STRVAR(module_doc,
20702070
"Functions to convert between Python values and C structs.\n\
2071-
Python strings are used to hold the data representing the C struct\n\
2071+
Python bytes objects are used to hold the data representing the C struct\n\
20722072
and also as format strings to describe the layout of data in the C struct.\n\
20732073
\n\
20742074
The optional first format char indicates byte order, size and alignment:\n\
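
The C changes above only reword error messages and docstrings. As a rough illustration, not part of the patch, the argument checks behind those messages can be observed from Python. The sketch assumes a current CPython 3, where the exact message text may differ from the strings in this commit, so it inspects only the exception type. ``try_pack`` is a hypothetical helper.

```python
import struct


def try_pack(fmt, value):
    """Return the packed bytes, or the struct.error raised for a bad argument."""
    try:
        return struct.pack(fmt, value)
    except struct.error as exc:
        return exc


# 'c' accepts a bytes object of length 1.
ok = try_pack('c', b'a')

# A two-byte value fails the length check in np_char().
too_long = try_pack('c', b'ab')

# An integer is rejected by the 's' branch of s_pack_internal().
wrong_type = try_pack('3s', 42)

print(ok, type(too_long).__name__, type(wrong_type).__name__)
```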
