23 commits
ccd0bbc
draft: impl lazy input consumption in mp.Pool.imap(_unordered)
Jul 20, 2025
002ef46
Use semaphore to synchronize threads
Jul 20, 2025
6e0bc58
Update buffersize behavior to match concurrent.futures.Executor behavior
Jul 21, 2025
62b2b6a
Release all `buffersize_lock` obj from the parent thread when terminate
Jul 21, 2025
0b6ba41
Add 2 basic `ThreadPool.imap()` tests w/ and w/o buffersize
Jul 21, 2025
aade15e
Fix accidental swap in imports
Jul 21, 2025
fb38a72
clear Pool._taskqueue_buffersize_semaphores safely
Jul 21, 2025
6ef488b
Slightly optimize Pool._taskqueue_buffersize_semaphores terminate
Jul 21, 2025
1716725
Rename `Pool.imap()` buffersize-related tests
Jul 21, 2025
9b43cd0
Fix typo in `IMapIterator.__init__()`
Jul 22, 2025
2d89341
Add tests for buffersize combinations with other kwargs
Jul 22, 2025
9ab2705
Remove if-branch in `_terminate_pool`
Jul 27, 2025
a955003
Add more edge-case tests for `imap` and `imap_unordered`
Jul 27, 2025
80efd6e
Split inf iterable test for `imap` and `imap_unordered`
Jul 27, 2025
83d6930
Add doc for `buffersize` argument of `imap` and `imap_unordered`
Jul 27, 2025
995ad8c
add *versionadded* for `imap_unordered`
Jul 28, 2025
3b6ad65
Remove ambiguity in `buffersize` description.
Jul 28, 2025
c941c16
Set *versionadded* as next in docs
Jul 28, 2025
d09e891
Add whatsnew entry
Jul 28, 2025
9c6d89d
Fix agreed comments on code formatting/minor refactoring
Jul 28, 2025
4550a01
Remove `imap` and `imap_unordered` body code duplication
Jul 28, 2025
77bde4d
Merge branch 'main' into feature/add-buffersize-to-multiprocessing
obaltian Aug 31, 2025
aec39fc
Merge branch 'main' into feature/add-buffersize-to-multiprocessing
obaltian Sep 3, 2025
Add doc for buffersize argument of imap and imap_unordered
Oleksandr Baltian authored and obaltian committed Aug 14, 2025
commit 83d69306d1b6d0828c783e8072f4d4baf1ab4cd1
15 changes: 13 additions & 2 deletions Doc/library/multiprocessing.rst
@@ -2448,7 +2448,7 @@ with the :class:`Pool` class.
Callbacks should complete immediately since otherwise the thread which
handles the results will get blocked.

.. method:: imap(func, iterable[, chunksize])
.. method:: imap(func, iterable[, chunksize[, buffersize]])

A lazier version of :meth:`.map`.

@@ -2462,7 +2462,18 @@ with the :class:`Pool` class.
``next(timeout)`` will raise :exc:`multiprocessing.TimeoutError` if the
result cannot be returned within *timeout* seconds.

.. method:: imap_unordered(func, iterable[, chunksize])
The *iterable* is collected immediately rather than lazily, unless a
*buffersize* is specified to limit the number of submitted tasks whose
results have not yet been yielded. If the buffer is full, iteration over
the *iterable* pauses until a result is yielded from the buffer.
To fully utilize the pool's capacity, set *buffersize* to the number of
processes in the pool (to consume *iterable* as you go) or even higher
(to prefetch *buffersize - processes* arguments).
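
On a Python version without *buffersize*, the throttling described above can be approximated by hand. What follows is a minimal sketch, not the implementation in this PR: ``bounded_imap`` and ``demo`` are hypothetical names, the sketch only covers ``ThreadPool`` with the default chunksize, and it assumes a semaphore-based design like the one the commit titles mention.

```python
import threading
from multiprocessing.pool import ThreadPool


def bounded_imap(pool, func, iterable, buffersize):
    """Hypothetical helper: lazily feed ``iterable`` to ``pool.imap`` so that
    at most ``buffersize`` submitted tasks have results not yet yielded."""
    slots = threading.Semaphore(buffersize)
    stopped = threading.Event()

    def throttled():
        it = iter(iterable)
        while True:
            slots.acquire()            # wait for a free buffer slot
            if stopped.is_set():       # consumer went away, stop feeding
                return
            try:
                item = next(it)
            except StopIteration:
                return
            yield item

    try:
        for result in pool.imap(func, throttled()):
            slots.release()            # this result leaves the buffer
            yield result
    finally:
        stopped.set()
        slots.release()                # wake the feeding thread if it waits


def demo(n_items=20, workers=2, buffersize=3):
    pulled = 0      # items taken from the source so far
    received = 0    # results the consumer has fully received
    max_lead = 0    # largest observed (pulled - received)

    def source():
        nonlocal pulled, max_lead
        for i in range(n_items):
            pulled += 1
            max_lead = max(max_lead, pulled - received)
            yield i

    results = []
    with ThreadPool(workers) as pool:
        for value in bounded_imap(pool, lambda x: x * x, source(), buffersize):
            results.append(value)
            received += 1
    return results, max_lead


results, max_lead = demo()
```

In this sketch a slot is released just before the result is handed to the consumer, so the source can run at most ``buffersize + 1`` items ahead of the ``received`` counter. The source is never drained up front.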

@obaltian obaltian Jul 28, 2025


I was wondering whether we should also describe how the usefulness of buffersize differs between multiprocessing.Pool and multiprocessing.pool.ThreadPool. I would be glad to hear an opinion on that; what do you think?

This feature is more useful with the multiprocessing.pool.ThreadPool class, where the user can pass a generator as the iterable. multiprocessing.Pool with processes currently can't accept generators because they aren't picklable, so the user still needs to pass the iterable as, for example, a list, which takes O(n) memory. However, there is another big benefit to using it: tasks are also submitted lazily (as the user iterates over results), so results that aren't needed yet won't pile up in memory. So I think the feature is useful for any kind of pool, and the docs shouldn't recommend it specifically for threads.


.. versionadded:: 3.15
obaltian marked this conversation as resolved.
Added the *buffersize* parameter.

.. method:: imap_unordered(func, iterable[, chunksize[, buffersize]])

The same as :meth:`imap` except that the ordering of the results from the
returned iterator should be considered arbitrary. (Only when there is
@@ -0,0 +1,8 @@
Add the optional ``buffersize`` parameter to
:meth:`multiprocessing.pool.Pool.imap` and
:meth:`multiprocessing.pool.Pool.imap_unordered` to limit the number of
submitted tasks whose results have not yet been yielded. If the buffer is
full, iteration over the *iterable* pauses until a result is yielded from
the buffer. To fully utilize the pool's capacity, set *buffersize* to the
number of processes in the pool (to consume *iterable* as you go) or even
higher (to prefetch *buffersize - processes* arguments).
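
The sizing rule in this entry can be illustrated with a little arithmetic. The helper below is hypothetical (``split_buffer`` is not part of ``multiprocessing``) and gives upper bounds only, because results that are finished but not yet yielded also occupy buffer slots.

```python
def split_buffer(buffersize, processes):
    """Hypothetical helper: upper bounds on how many buffered tasks can run
    at once and how many arguments wait as prefetched work."""
    running = min(buffersize, processes)
    prefetched = max(buffersize - processes, 0)
    return running, prefetched


for size in (2, 4, 6):
    running, prefetched = split_buffer(size, processes=4)
    print(f"buffersize={size}: up to {running} running, {prefetched} prefetched")
```

With four processes, a *buffersize* below 4 leaves workers idle, 4 keeps them all busy with nothing queued ahead, and 6 adds up to two prefetched arguments.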