Speed up SimpleHTTPRequestHandler.list_directory() by using os.scandir() #151788

@mjbommar

SimpleHTTPRequestHandler.list_directory() calls os.listdir() and then, for every entry, os.path.isdir() (a stat) and os.path.islink() (an lstat): two stat-family syscalls per entry. This is wasted work on any filesystem and dominates listing time for large directories. On network filesystems such as NFS, where each call is a round-trip, the cost becomes severe.

os.scandir() returns the entry type from the directory read itself (POSIX d_type / NFS READDIRPLUS), eliminating the per-entry stats in the common case. CPython already did this migration for os.walk(), glob, and pathlib.Path.iterdir() (gh-117727); http.server was missed.
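A minimal sketch of the two patterns, for illustration only. The function names `names_via_listdir`, `names_via_scandir`, and `demo` are hypothetical, and the code mirrors the display-name logic of list_directory() (a trailing "/" for directories, "@" for symlinks) in simplified form. It is not the proposed patch, and it assumes a platform where os.symlink() is permitted.

```python
import os
import tempfile


def names_via_listdir(path):
    """Existing pattern: os.listdir() plus isdir()/islink() on every entry."""
    result = []
    for name in sorted(os.listdir(path), key=str.lower):
        fullname = os.path.join(path, name)
        displayname = name
        if os.path.isdir(fullname):    # stat-based check
            displayname = name + "/"
        if os.path.islink(fullname):   # lstat-based check
            displayname = name + "@"
        result.append(displayname)
    return result


def names_via_scandir(path):
    """Sketch of the proposed pattern: ask each DirEntry for its type."""
    with os.scandir(path) as it:
        entries = sorted(it, key=lambda e: e.name.lower())
    result = []
    for entry in entries:
        displayname = entry.name
        if entry.is_dir():             # may be answered from the directory read
            displayname = entry.name + "/"
        if entry.is_symlink():
            displayname = entry.name + "@"
        result.append(displayname)
    return result


def demo():
    """Build a tiny directory and return both listings for comparison."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        with open(os.path.join(root, "a.txt"), "w"):
            pass
        os.symlink("sub", os.path.join(root, "link"))
        return names_via_listdir(root), names_via_scandir(root)


if __name__ == "__main__":
    old, new = demo()
    print(old)
    print(new)
```

Both functions should produce the same display names; the difference is only in how many filesystem calls are needed to get there.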

Benchmark

Directory with 1000 files + 1000 dirs (plus a few symlinks):

  • stat-family syscalls (strace): 4088 → 88 (the 88 is constant interpreter startup; the per-entry loop drops from ~2 syscalls to ~0)
  • local filesystem wall-clock: ~10× faster
  • emulating NFS by injecting per-stat latency: the listing goes from seconds to ~2 ms

Worst case: a mount that returns DT_UNKNOWN falls back to one cached lstat per entry, which is still fewer calls than the two per entry made today, so the change is never a regression.

The change is behavior-preserving: DirEntry.is_dir()/is_symlink() match os.path.isdir/os.path.islink semantics (follow-symlinks behavior and return-False-on-error), verified across real dirs/files, symlink-to-dir, symlink-to-file, and broken symlinks. The existing test_httpservers suite passes unchanged.
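The comparison described above could be reproduced with a small script along these lines. This is a sketch under assumptions, not the verification actually used for this issue: the helper names (`build_fixture`, `compare_flags`, `check_equivalence`) and file names are hypothetical, it needs a platform where os.symlink() is permitted, and it covers only the five listed cases on a local filesystem, not error paths such as permission failures.

```python
import os
import tempfile


def build_fixture(root):
    """Create one entry for each case: dir, file, two live links, one broken."""
    os.mkdir(os.path.join(root, "real_dir"))
    with open(os.path.join(root, "real_file"), "w"):
        pass
    # Relative targets are resolved against the directory holding the link.
    os.symlink("real_dir", os.path.join(root, "link_to_dir"))
    os.symlink("real_file", os.path.join(root, "link_to_file"))
    os.symlink("missing_target", os.path.join(root, "broken_link"))


def compare_flags(root):
    """Map each name to ((isdir, islink), (is_dir, is_symlink))."""
    flags = {}
    with os.scandir(root) as entries:
        for entry in entries:
            old = (os.path.isdir(entry.path), os.path.islink(entry.path))
            new = (entry.is_dir(), entry.is_symlink())
            flags[entry.name] = (old, new)
    return flags


def check_equivalence():
    """Build the fixture in a temporary directory and compare both APIs."""
    with tempfile.TemporaryDirectory() as root:
        build_fixture(root)
        return compare_flags(root)


if __name__ == "__main__":
    for name, (old, new) in sorted(check_equivalence().items()):
        status = "ok" if old == new else "MISMATCH"
        print(f"{name}: os.path={old} DirEntry={new} {status}")
```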

I have a patch ready and will open a PR.


This issue was prepared with AI assistance (Claude Code); the analysis and benchmarks were reviewed by me.

Labels

  • performance: Performance or resource usage
  • stdlib: Standard Library Python modules in the Lib/ directory
  • triaged: The issue has been accepted as valid by a triager.
  • type-feature: A feature request or enhancement
