perf: Replace dict-based request ID management with list-indexed structure #808

@mykaul

Summary

Per-connection request ID management uses deque(range(N)) for the free ID pool and dict{int → (cb, decoder, result_metadata)} for in-flight request tracking. Replacing both with list-based structures saves about 25 ns (35%) per full request cycle and about 76% of the memory for 300 IDs in the benchmark below.

Current Architecture (connection.py)

| Structure | Type | Purpose |
|---|---|---|
| request_ids | deque(range(300)) | Pool of available stream IDs |
| _requests | dict{int → tuple} | Maps in-flight stream IDs to callbacks |
| orphaned_request_ids | set() | Timed-out stream IDs awaiting late responses |

Proposed Change

  • request_ids: deque → list (used as a stack with pop()/append())
  • _requests: dict → list[tuple|None] (indexed by stream ID, None = free)
  • orphaned_request_ids: keep as set() (rarely used)
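
A minimal sketch of the proposed layout, to make the two list roles concrete. The class name StreamIdPool and the acquire/release methods are hypothetical and do not exist in the driver; only request_ids and _requests are names from the current code, and locking is assumed to be handled by the caller.

```python
class StreamIdPool:
    """Hypothetical stand-in for the per-connection state (not driver code)."""

    def __init__(self, size=300):
        # Free IDs kept as a stack; reversed so that pop() hands out 0 first.
        self.request_ids = list(range(size - 1, -1, -1))
        # One slot per stream ID; None marks a free slot.
        self._requests = [None] * size

    def acquire(self, cb, decoder, result_metadata):
        # Raises IndexError when no free ID is left.
        stream_id = self.request_ids.pop()
        self._requests[stream_id] = (cb, decoder, result_metadata)
        return stream_id

    def release(self, stream_id):
        entry = self._requests[stream_id]
        if entry is None:
            # Same signal a dict-based pop() would give for an unknown ID.
            raise KeyError(stream_id)
        self._requests[stream_id] = None
        self.request_ids.append(stream_id)
        return entry
```

One design point worth checking: a stack reuses the most recently freed ID first, whereas a deque consumed with popleft() cycles through the whole pool before reusing an ID. If the current code relies on that delay between reuses, the ordering change needs its own review.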

Benchmark Results (CPython 3.14)

Full request cycle (get ID + store request + retrieve request + return ID):

| Approach | ns/op | Memory (300 IDs) |
|---|---|---|
| Current (deque + dict) | 72.3 ns | ~20.6 KB |
| Proposed (list + list) | 47.1 ns | ~4.9 KB |
| Saving | 25.2 ns (35%) | ~15.7 KB (76%) |

Per-operation breakdown (dict vs list for _requests):

| Operation | dict | list | Saving |
|---|---|---|---|
| Store request | 43.7 ns | 13.1 ns | 30.1 ns |
| Retrieve request | 43.2 ns | 13.1 ns | 30.1 ns |
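
For anyone who wants to re-run a comparison of this kind, here is a rough timeit sketch. It is not the script that produced the numbers above; the function names, the fixed stream ID 7, and the iteration counts are all assumptions, and absolute results will vary by machine and interpreter.

```python
import timeit

N = 300
entry = (print, None, None)   # placeholder for (cb, decoder, result_metadata)
requests_dict = {}
requests_list = [None] * N


def dict_cycle(stream_id=7):
    # Store and retrieve through a dict, as the current code does.
    requests_dict[stream_id] = entry
    return requests_dict.pop(stream_id)


def list_cycle(stream_id=7):
    # Store and retrieve through a pre-sized list; None marks a free slot.
    requests_list[stream_id] = entry
    found = requests_list[stream_id]
    requests_list[stream_id] = None
    return found


def ns_per_op(fn, number=200_000, repeat=5):
    # Best-of-N wall time, converted to nanoseconds per call.
    return min(timeit.repeat(fn, number=number, repeat=repeat)) / number * 1e9


if __name__ == "__main__":
    print(f"dict: {ns_per_op(dict_cycle):.1f} ns/op")
    print(f"list: {ns_per_op(list_cycle):.1f} ns/op")
```

Function-call overhead is included in both measurements, so the interesting figure is the difference between the two lines rather than either absolute value.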

Key Implementation Concerns

  1. error_all_requests() (connection.py:1143): Currently does requests = self._requests; self._requests = {} (atomic swap). With a list, this becomes a swap plus allocation of a fresh [None] * size list, or an iterate-and-clear pass.
  2. _requests.pop(stream_id) with KeyError (connection.py:1407, cluster.py:4509): Needs conversion to if lst[stream_id] is None check.
  3. requests.popitem()[1] (connection.py:1162): Needs to find a non-None entry — requires iteration or count tracking.
  4. not self._requests truthiness check (asyncorereactor.py:458): Needs a separate _requests_count tracker or any() call.
  5. Dynamic sizing: Start at 300, grow list when highest_request_id exceeds current size (matching existing deque growth pattern).
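
The concerns above can be sketched in one small wrapper. Everything here is hypothetical: RequestSlots, store, pop, and drain are illustrative names rather than driver APIs, the growth policy is just "extend to fit", and the caller is assumed to hold the connection lock, since the swap in drain() touches two attributes and is not atomic on its own.

```python
class RequestSlots:
    """Hypothetical list-backed replacement for the _requests dict."""

    def __init__(self, size=300):
        self._slots = [None] * size
        self._count = 0  # number of occupied slots

    def __len__(self):
        # Defining __len__ keeps `not self._requests` working (concern 4).
        return self._count

    def store(self, stream_id, entry):
        if stream_id < 0:
            raise ValueError("negative stream IDs are not stored here")
        if stream_id >= len(self._slots):
            # Grow on demand (concern 5).
            self._slots.extend([None] * (stream_id + 1 - len(self._slots)))
        if self._slots[stream_id] is None:
            self._count += 1
        self._slots[stream_id] = entry

    def pop(self, stream_id):
        # Mirrors dict.pop(key): KeyError for a free or unknown ID (concern 2).
        if not 0 <= stream_id < len(self._slots) or self._slots[stream_id] is None:
            raise KeyError(stream_id)
        entry = self._slots[stream_id]
        self._slots[stream_id] = None
        self._count -= 1
        return entry

    def drain(self):
        # Swap in a fresh list and return the in-flight entries (concerns 1 and 3).
        old, self._slots = self._slots, [None] * len(self._slots)
        self._count = 0
        return [(i, e) for i, e in enumerate(old) if e is not None]
```

Returning the occupied entries from drain() sidesteps the popitem() question: the error path can iterate over the returned pairs instead of searching for a non-None slot.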

Files to Modify

  • cassandra/connection.py (~15 lines)
  • cassandra/cluster.py (~3 lines)
  • cassandra/io/asyncorereactor.py (~1 line)
  • Tests that inspect _requests as a dict

Impact

Related: #536
