@@ -389,7 +389,8 @@ Legend:
389389* ``malloc``: system allocators from the standard C library, C functions:
390390 :c:func:`malloc`, :c:func:`calloc`, :c:func:`realloc` and :c:func:`free`.
391391* ``pymalloc``: :ref:`pymalloc memory allocator <pymalloc>`.
392- * "+ debug": with debug hooks installed by :c:func:`PyMem_SetupDebugHooks`.
392+ * "+ debug": with :ref:`debug hooks on the Python memory allocators
393+ <pymem-debug-hooks>`.
393394* "Debug build": :ref:`Python build in debug mode <debug-build>`.
394395
395396.. _customize-memory-allocators:
@@ -478,45 +479,113 @@ Customize Memory Allocators
478479
479480.. c:function:: void PyMem_SetupDebugHooks(void)
480481
481- Setup hooks to detect bugs in the Python memory allocator functions.
482+ Set up :ref:`debug hooks in the Python memory allocators <pymem-debug-hooks>`
483+ to detect memory errors.
484+
485+
486+ .. _pymem-debug-hooks:
487+
488+ Debug hooks on the Python memory allocators
489+ ===========================================
490+
491+ When :ref:`Python is built in debug mode <debug-build>`, the
492+ :c:func:`PyMem_SetupDebugHooks` function is called during :ref:`Python
493+ preinitialization <c-preinit>` to set up debug hooks on the Python memory
494+ allocators to detect memory errors.
495+
496+ The :envvar:`PYTHONMALLOC` environment variable can be used to install debug
497+ hooks on a Python compiled in release mode (ex: ``PYTHONMALLOC=debug``).
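
As a minimal sketch of the environment-variable route, the following Python snippet starts a child interpreter with ``PYTHONMALLOC=debug`` set. The helper name ``run_with_debug_hooks`` is hypothetical and only serves this illustration. It assumes a CPython interpreter (3.7 or later, because of ``capture_output``).

```python
import os
import subprocess
import sys


def run_with_debug_hooks(code):
    """Run `code` in a child interpreter with PYTHONMALLOC=debug set.

    Hypothetical helper for illustration; not part of any Python API.
    """
    env = dict(os.environ)
    env["PYTHONMALLOC"] = "debug"  # ask the child to install the debug hooks
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )


result = run_with_debug_hooks("print('ok')")
```

A well-behaved program runs unchanged under the hooks. Only memory errors such as buffer overflows in extension code are expected to change the outcome.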
498+
499+ The :c:func:`PyMem_SetupDebugHooks` function can be used to set debug hooks
500+ after calling :c:func:`PyMem_SetAllocator`.
501+
502+ These debug hooks fill dynamically allocated memory blocks with special,
503+ recognizable bit patterns. Newly allocated memory is filled with the byte
504+ ``0xCD`` (``PYMEM_CLEANBYTE``), freed memory is filled with the byte ``0xDD``
505+ (``PYMEM_DEADBYTE``). Memory blocks are surrounded by "forbidden bytes"
506+ filled with the byte ``0xFD`` (``PYMEM_FORBIDDENBYTE``). Strings of these bytes
507+ are unlikely to be valid addresses, floats, or ASCII strings.
508+
509+ Runtime checks:
510+
511+ - Detect API violations. For example, detect if :c:func:`PyObject_Free` is
512+ called on a memory block allocated by :c:func:`PyMem_Malloc`.
513+ - Detect write before the start of the buffer (buffer underflow).
514+ - Detect write after the end of the buffer (buffer overflow).
515+ - Check that the :term:`GIL <global interpreter lock>` is held when
516+ allocator functions of :c:data:`PYMEM_DOMAIN_OBJ` (ex:
517+ :c:func:`PyObject_Malloc`) and :c:data:`PYMEM_DOMAIN_MEM` (ex:
518+ :c:func:`PyMem_Malloc`) domains are called.
519+
520+ On error, the debug hooks use the :mod:`tracemalloc` module to get the
521+ traceback where a memory block was allocated. The traceback is only displayed
522+ if :mod:`tracemalloc` is tracing Python memory allocations and the memory block
523+ was traced.
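
The debug hooks look up this traceback from C. As a rough Python-level analogue (not the hooks' own code path), the sketch below uses the documented ``tracemalloc.get_object_traceback()`` function to show that a traceback is only available for a block allocated while tracing was active. The ``allocate`` helper is hypothetical.

```python
import tracemalloc


def allocate(size):
    # bytes(size) creates a fresh object at run time, so the allocation
    # happens while tracemalloc is tracing.
    return bytes(size)


tracemalloc.start(5)  # keep up to 5 frames per traced allocation
block = allocate(4096)
traceback = tracemalloc.get_object_traceback(block)
tracemalloc.stop()

# Once tracing is stopped, no traceback is available any more.
untracked = tracemalloc.get_object_traceback(block)
```

When a traceback is available, ``traceback.format()`` returns it as a list of printable lines.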
524+
525+ Let *S* = ``sizeof(size_t)``. ``2*S`` bytes are added at each end of each block
526+ of *N* bytes requested. The memory layout is like so, where *p* represents the
527+ address returned by a malloc-like or realloc-like function (``p[i:j]`` means
528+ the slice of bytes from ``*(p+i)`` inclusive up to ``*(p+j)`` exclusive; note
529+ that the treatment of negative indices differs from a Python slice):
530+
531+ ``p[-2*S:-S]``
532+ Number of bytes originally asked for. This is a ``size_t``, big-endian
533+ (easier to read in a memory dump).
534+ ``p[-S]``
535+ API identifier (ASCII character):
536+
537+ * ``'r'`` for :c:data:`PYMEM_DOMAIN_RAW`.
538+ * ``'m'`` for :c:data:`PYMEM_DOMAIN_MEM`.
539+ * ``'o'`` for :c:data:`PYMEM_DOMAIN_OBJ`.
540+
541+ ``p[-S+1:0]``
542+ Copies of PYMEM_FORBIDDENBYTE. Used to catch underwrites and underreads.
543+
544+ ``p[0:N]``
545+ The requested memory, filled with copies of PYMEM_CLEANBYTE, used to catch
546+ references to uninitialized memory. When a realloc-like function is called
547+ requesting a larger memory block, the new excess bytes are also filled with
548+ PYMEM_CLEANBYTE. When a free-like function is called, these are
549+ overwritten with PYMEM_DEADBYTE, to catch references to freed memory. When
550+ a realloc-like function is called requesting a smaller memory block, the
551+ excess old bytes are also filled with PYMEM_DEADBYTE.
552+
553+ ``p[N:N+S]``
554+ Copies of PYMEM_FORBIDDENBYTE. Used to catch overwrites and overreads.
555+
556+ ``p[N+S:N+2*S]``
557+ Only used if the ``PYMEM_DEBUG_SERIALNO`` macro is defined (not defined by
558+ default).
559+
560+ A serial number, incremented by 1 on each call to a malloc-like or
561+ realloc-like function. Big-endian ``size_t``. If "bad memory" is detected
562+ later, the serial number gives an excellent way to set a breakpoint on the
563+ next run, to capture the instant at which this block was passed out. The
564+ static function bumpserialno() in obmalloc.c is the only place the serial
565+ number is incremented, and exists so you can set such a breakpoint easily.
566+
567+ A realloc-like or free-like function first checks that the PYMEM_FORBIDDENBYTE
568+ bytes at each end are intact. If they've been altered, diagnostic output is
569+ written to stderr, and the program is aborted via Py_FatalError(). The other
570+ main failure mode is provoking a memory error when a program reads one of
571+ the special bit patterns and tries to use it as an address. If you then
572+ inspect the object in a debugger, you're likely to see that it's entirely
573+ filled with PYMEM_DEADBYTE (meaning freed memory is getting used) or
574+ PYMEM_CLEANBYTE (meaning uninitialized memory is getting used).
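
The layout and the guard-byte check described above can be modelled in a few lines of Python. This is only an illustrative sketch of the documented layout, not the implementation in ``obmalloc.c``. The names ``make_block`` and ``check_block`` are hypothetical, and the serial number field is always included here even though the real hooks only store it when ``PYMEM_DEBUG_SERIALNO`` is defined.

```python
import struct

S = struct.calcsize("N")  # sizeof(size_t) on this platform

CLEANBYTE = 0xCD      # PYMEM_CLEANBYTE
DEADBYTE = 0xDD       # PYMEM_DEADBYTE
FORBIDDENBYTE = 0xFD  # PYMEM_FORBIDDENBYTE


def make_block(nbytes, api_id=b"m", serialno=0):
    """Build a simulated debug block; return (buffer, index of p)."""
    header = (
        nbytes.to_bytes(S, "big")            # p[-2*S:-S]: requested size
        + api_id                             # p[-S]: API identifier
        + bytes([FORBIDDENBYTE]) * (S - 1)   # p[-S+1:0]: leading guard
    )
    data = bytes([CLEANBYTE]) * nbytes       # p[0:N]: requested memory
    trailer = (
        bytes([FORBIDDENBYTE]) * S           # p[N:N+S]: trailing guard
        + serialno.to_bytes(S, "big")        # p[N+S:N+2*S]: serial number
    )
    return bytearray(header + data + trailer), 2 * S


def check_block(buf, p):
    """Return the list of guard violations detected around index p."""
    nbytes = int.from_bytes(buf[p - 2 * S:p - S], "big")
    problems = []
    if any(b != FORBIDDENBYTE for b in buf[p - S + 1:p]):
        problems.append("underflow")
    if any(b != FORBIDDENBYTE for b in buf[p + nbytes:p + nbytes + S]):
        problems.append("overflow")
    return problems


buf, p = make_block(16)
clean = check_block(buf, p)
buf[p + 16] = 0x00  # write one byte past the end of the block
after_overflow = check_block(buf, p)
buf[p - 1] = 0x00   # write one byte before the start of the block
after_both = check_block(buf, p)
```

In this model an intact block yields an empty list, and each stray write shows up as a damaged guard region. The real hooks perform the equivalent check in realloc-like and free-like functions.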
482575
483- Newly allocated memory is filled with the byte ``0xCD`` (``CLEANBYTE``),
484- freed memory is filled with the byte ``0xDD`` (``DEADBYTE``). Memory blocks
485- are surrounded by "forbidden bytes" (``FORBIDDENBYTE``: byte ``0xFD``).
486-
487- Runtime checks:
488-
489- - Detect API violations, ex: :c:func:`PyObject_Free` called on a buffer
490- allocated by :c:func:`PyMem_Malloc`
491- - Detect write before the start of the buffer (buffer underflow)
492- - Detect write after the end of the buffer (buffer overflow)
493- - Check that the :term:`GIL <global interpreter lock>` is held when
494- allocator functions of :c:data:`PYMEM_DOMAIN_OBJ` (ex:
495- :c:func:`PyObject_Malloc`) and :c:data:`PYMEM_DOMAIN_MEM` (ex:
496- :c:func:`PyMem_Malloc`) domains are called
497-
498- On error, the debug hooks use the :mod:`tracemalloc` module to get the
499- traceback where a memory block was allocated. The traceback is only
500- displayed if :mod:`tracemalloc` is tracing Python memory allocations and the
501- memory block was traced.
502-
503- These hooks are :ref:`installed by default <default-memory-allocators>` if
504- :ref:`Python is built in debug mode <debug-build>`.
505- The :envvar:`PYTHONMALLOC` environment variable can be used to install
506- debug hooks on a Python compiled in release mode.
507-
508- .. versionchanged:: 3.6
509- This function now also works on Python compiled in release mode.
510- On error, the debug hooks now use :mod:`tracemalloc` to get the traceback
511- where a memory block was allocated. The debug hooks now also check
512- if the GIL is held when functions of :c:data:`PYMEM_DOMAIN_OBJ` and
513- :c:data:`PYMEM_DOMAIN_MEM` domains are called.
576+ .. versionchanged:: 3.6
577+ The :c:func:`PyMem_SetupDebugHooks` function now also works on Python
578+ compiled in release mode. On error, the debug hooks now use
579+ :mod:`tracemalloc` to get the traceback where a memory block was allocated.
580+ The debug hooks now also check if the GIL is held when functions of
581+ :c:data:`PYMEM_DOMAIN_OBJ` and :c:data:`PYMEM_DOMAIN_MEM` domains are
582+ called.
514583
515- .. versionchanged:: 3.8
516- Byte patterns ``0xCB`` (``CLEANBYTE``), ``0xDB`` (``DEADBYTE``) and
517- ``0xFB`` (``FORBIDDENBYTE``) have been replaced with ``0xCD``, ``0xDD``
518- and ``0xFD`` to use the same values than Windows CRT debug ``malloc()``
519- and ``free()``.
584+ .. versionchanged:: 3.8
585+ Byte patterns ``0xCB`` (``PYMEM_CLEANBYTE``), ``0xDB`` (``PYMEM_DEADBYTE``)
586+ and ``0xFB`` (``PYMEM_FORBIDDENBYTE``) have been replaced with ``0xCD``,
587+ ``0xDD`` and ``0xFD`` to use the same values as Windows CRT debug
588+ ``malloc()`` and ``free()``.
520589
521590
522591.. _pymalloc:
@@ -539,6 +608,10 @@ The arena allocator uses the following functions:
539608* :c:func:`mmap` and :c:func:`munmap` if available,
540609* :c:func:`malloc` and :c:func:`free` otherwise.
541610
611+ This allocator is disabled if Python is configured with the
612+ :option:`--without-pymalloc` option. It can also be disabled at runtime using
613+ the :envvar:`PYTHONMALLOC` environment variable (ex: ``PYTHONMALLOC=malloc``).
614+
542615Customize pymalloc Arena Allocator
543616----------------------------------
544617