Skip to content

gh-132372: Speed up logging.config existing logger handling#150242

Open
esadomer wants to merge 4 commits into
python:mainfrom
esadomer:codex/find-stronger-cpython-speedup
Open

gh-132372: Speed up logging.config existing logger handling#150242
esadomer wants to merge 4 commits into
python:mainfrom
esadomer:codex/find-stronger-cpython-speedup

Conversation

@esadomer
Copy link
Copy Markdown

@esadomer esadomer commented May 22, 2026

gh-132372

This speeds up logging.config.fileConfig() and logging.config.dictConfig() when they need to handle many existing loggers while disable_existing_loggers is active.

The old code kept the existing logger names in a sorted list, but still used repeated list membership, index(), tail scans, and remove() calls for every configured logger. This keeps the sorted list for child logger ordering, uses a set for membership/removal tracking, and uses bisect_left() to jump directly to the dotted child logger range.

Validation:

  • PCbuild\amd64\python_d.exe -m test test_logging -j0
  • PCbuild\amd64\python_d.exe Tools\patchcheck\patchcheck.py
  • git diff --check

Benchmark:

Loaded upstream/main:Lib/logging/config.py and the patched Lib/logging/config.py into the same Python 3.12.8 release interpreter, then timed only the dictConfig() call while resetting identical existing logger state outside the timer.

Interpreter details:

3.12.8 (tags/v3.12.8:2dc476b, Dec  3 2024, 19:30:04) [MSC v.1942 64 bit (AMD64)]
executable: C:\Users\user\AppData\Local\Programs\Python\Python312\python.exe
Py_DEBUG: None
has gettotalrefcount: False

Small n results are noisy and effectively neutral:

old n=1    mean=    24.11 us median=    20.70 us stdev=   42.46 us
new n=1    mean=    21.29 us median=    20.30 us stdev=    5.53 us
speedup n=1: 1.13x

old n=2    mean=    31.05 us median=    29.30 us stdev=    9.12 us
new n=2    mean=    32.84 us median=    29.80 us stdev=   10.27 us
speedup n=2: 0.95x

old n=3    mean=    42.36 us median=    38.70 us stdev=   11.44 us
new n=3    mean=    41.17 us median=    38.20 us stdev=   10.27 us
speedup n=3: 1.03x

Larger logger sets benefit more clearly:

old n=100  mean=  5784.59 us median=  5724.35 us stdev=  264.00 us
new n=100  mean=  4001.09 us median=  3936.50 us stdev=  255.22 us
speedup n=100: 1.45x

old n=500  mean=134692.43 us median=133170.10 us stdev= 9087.73 us
new n=500  mean= 84598.86 us median= 84082.45 us stdev= 2047.04 us
speedup n=500: 1.59x

old n=1000 mean=541349.97 us median=537541.85 us stdev=13930.13 us
new n=1000 mean=338080.61 us median=333050.55 us stdev=14305.36 us
speedup n=1000: 1.60x

@eendebakpt
Copy link
Copy Markdown
Contributor

The performance increase is moderate, but the PR reduces code duplication so I am +1.

@esadomer On your benchmark I could get a much larger speedud with this small commit 2b9ce22 (on top of your work). I have not fully tested this, but the speedup is > 5x. If you are interested you could test it further and pull into your branch.

@esadomer esadomer force-pushed the codex/find-stronger-cpython-speedup branch from d599b2e to c1a3122 Compare May 23, 2026 06:21
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
@esadomer esadomer force-pushed the codex/find-stronger-cpython-speedup branch from c1a3122 to 02cf98f Compare May 23, 2026 06:24
@picnixz
Copy link
Copy Markdown
Member

picnixz commented May 23, 2026

I would appreciate the numbers for n=1,2,3. It's not realistic to assume that there are 500 loggers. I have honestly never seen that in production. But the bisect solution is something I would have used as well so I'm also fine with it.

Comment thread Lib/logging/config.py Outdated
if not child.startswith(prefixed):
break
if child in existing_set:
child_loggers.add(child)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a dict here instaed of a set to retain the insertion time.

Comment thread Lib/logging/config.py Outdated
prefixed = name + "."
i = bisect_left(existing, prefixed)
num_existing = len(existing)
while i < num_existing:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative here would be to use islice(existing, i) which would avoid using i+=1 but this adds an import to itertools.

Comment thread Lib/logging/config.py Outdated

def _discard_existing_logger(name, existing, existing_set, child_loggers):
"""Discard a configured logger and record its existing children."""
if name in existing_set:
Copy link
Copy Markdown
Member

@picnixz picnixz May 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function is named discard_existing_logger but we have a check "if name in existing_set", which does not make sense semantically.

It also updates child_loggers so it does 2 things. I don't have a better name though but I would prefer that you remove the "if name in existing_set" check and make it the responsibility of the caller instead. It will slightly reduce the diff and the indentation of the helper (and you save 1 function call when name is not existing)

@esadomer
Copy link
Copy Markdown
Author

Thanks, I updated the branch to address the inline comments:

  • child_loggers now uses a dict so insertion order is retained.
  • The membership check moved to the callers, so _discard_existing_logger() now only handles existing logger names.
  • I left the islice() alternative out for now since it would add another import for a cosmetic loop change.

I also added the small n numbers to the PR description. They look effectively neutral/noisy for n=1..3:

n=1: 0.98x
n=2: 1.07x
n=3: 1.00x

Validation after the change:

  • PCbuild\amd64\python_d.exe -m test test_logging -j0
  • PCbuild\amd64\python_d.exe Tools\patchcheck\patchcheck.py
  • git diff --check

@picnixz
Copy link
Copy Markdown
Member

picnixz commented May 23, 2026

Can you try a non-debug build as well? it looks like you're using a DEBUG build for the benchmarks. Having a release (PGO+LTO) build may help in removing noise in general (and show clear improvements)

Comment thread Lib/logging/config.py Outdated
if not isinstance(logger, logging.PlaceHolder):
logger.setLevel(logging.NOTSET)
# Equivalent to setLevel(NOTSET), but clear the cache once.
logger.level = logging.NOTSET
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change really necessary? while setLevel would clear the cache multiple times, I would prefer that we pass through logger.setLevel public API instead of changing .level directly, just in case someone has a logger subclass that does something else with setLevel.

Comment thread Lib/logging/config.py Outdated
if cache_clear_needed:
root.manager._clear_cache()

def _discard_existing_logger(name, existing, existing_set, child_loggers):
Copy link
Copy Markdown
Member

@picnixz picnixz May 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _discard_existing_logger(name, existing, existing_set, child_loggers):
def _forget_existing_logger(name, existing, existing_set, child_loggers):

Maybe something like that instead (since we also update child_loggers)

@esadomer
Copy link
Copy Markdown
Author

Thanks, good catch.

I pushed e8e15d6 to keep preserved child loggers on the public Logger.setLevel(NOTSET) path instead of assigning .level directly, and renamed the helper to _forget_existing_logger().

I also updated the PR body with fresh numbers after that change. To clarify the benchmark setup: the python_d.exe commands were validation only; the benchmark was run with a non-debug Python 3.12.8 release interpreter (Py_DEBUG: None, no sys.gettotalrefcount). Small n is still noisy/neutral, while the larger cases are now around 1.45x to 1.60x.

I tried to produce a local CPython Windows Release/PGO build too, but this machine fails before creating PCbuild\amd64\python.exe with an unrelated MSVC C1001 in zlib-ng's arch\x86\chunkset_avx2.c. So I don't want to claim these are PGO+LTO numbers from this box.

@esadomer
Copy link
Copy Markdown
Author

CI note: two jobs failed, both look unrelated to this logging.config change.

  • Emscripten / build and test failed during checkout before tests ran: fatal: could not read Username for 'https://github.com': terminal prompts disabled.
  • Sanitizers / TSan (free-threading) failed in test_ssl with a ThreadSanitizer report inside OpenSSL/libcrypto (ASN1_STRING_cmp), not in logging code.

I tried to rerun the failed jobs, but I do not have Actions rerun permission on python/cpython (admin rights / Resource not accessible by integration).

@picnixz
Copy link
Copy Markdown
Member

picnixz commented May 23, 2026

Ok, release build is still enough. Someone else can do the benchmarks on Linux PGO+LTO later

@picnixz
Copy link
Copy Markdown
Member

picnixz commented May 23, 2026

As for GH failures, it's known (I think GH is unstable currently so don't worry)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants