feat(pubsub): implement streaming keep-alive logic #34653

Draft
torreypayne wants to merge 1 commit into main from pubsub-streaming-keepalive

Overview

Implements proactive streaming keep-alive logic and connection health monitoring in Google::Cloud::PubSub::MessageListener::Stream, mirroring the design implemented in the .NET Pub/Sub client (dotnet#15649).

Long-running bi-directional gRPC streaming pull connections (StreamingPull) can experience silent TCP drops, intermediary network timeouts, or read deadlocks during periods of low message volume. This change introduces background timer tasks to push regular keep-alive requests and actively monitor server Pong timestamps.

Key Changes

  • Protocol Version Initialization: Explicitly initializes protocol_version = 1 on the initial StreamingPullRequest protobuf to enable bi-directional stream keep-alive support.
  • Unconditional Keep-Alive Pings: Configures a background timer task (@stream_keepalive_task) to dispatch empty StreamingPullRequest pings at regular intervals (default 30 seconds), regardless of current lease inventory volume.
  • Pong Monitoring & Automatic Reconnection: Introduces @pong_monitor_task to inspect timestamps (@last_ping_at, @last_pong_at). If a keep-alive response is overdue by more than pong_deadline seconds (default 15 seconds), the monitor raises RestartStream to safely recycle the connection and back off.
  • Concurrency Timestamp Guard: Guards ping timestamp updates (@last_ping_at = now if @last_pong_at >= @last_ping_at) so that consecutive unanswered pings cannot overwrite the timestamp of a request that is already overdue. The deadline is therefore always measured from the oldest unanswered ping.
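
The timestamp guard and deadline check described above can be sketched in plain Ruby. This is a minimal illustration, not the code in this PR: the KeepaliveTracker class, its method names, the injected `now` argument, and the local RestartStream stand-in are all hypothetical, and the exact overdue condition used by the pong monitor is an assumption.

```ruby
# Hypothetical sketch of the ping/pong bookkeeping; not the PR's implementation.
class KeepaliveTracker
  # Local stand-in for the listener's restart signal (assumed name).
  RestartStream = Class.new(StandardError)

  attr_reader :last_ping_at, :last_pong_at

  # Times are plain numbers (seconds) so the logic can be driven by a fake clock.
  def initialize(pong_deadline: 15, now: 0.0)
    @pong_deadline = pong_deadline
    @last_ping_at = now
    @last_pong_at = now
  end

  # Record a ping. The timestamp only advances when the previous ping has
  # been answered, so an outstanding ping keeps its original send time.
  def record_ping(now)
    @last_ping_at = now if @last_pong_at >= @last_ping_at
  end

  # Record a keep-alive response from the server.
  def record_pong(now)
    @last_pong_at = now
  end

  # Assumed condition: a ping is outstanding and has waited longer than
  # pong_deadline seconds.
  def pong_overdue?(now)
    @last_pong_at < @last_ping_at && (now - @last_ping_at) > @pong_deadline
  end

  # Raise the restart signal when the pong is overdue.
  def check!(now)
    raise RestartStream, "keep-alive pong overdue" if pong_overdue?(now)
  end
end

tracker = KeepaliveTracker.new(pong_deadline: 15, now: 0.0)
tracker.record_ping(30.0)
tracker.record_ping(60.0)   # first ping is still unanswered, so this is ignored
tracker.last_ping_at        # => 30.0
tracker.pong_overdue?(46.0) # => true (16 seconds after the oldest unanswered ping)
```

Without the guard, the second ping would reset the timestamp to 60.0 and the monitor would keep extending the deadline for as long as pings continued to go out on a dead connection.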

Testing & Validation

  • Unit Test Suite (keepalive_test.rb): Added targeted unit test coverage asserting protocol version flags, timer intervals, deadline timeouts, and non-disruptive Pong handling.
  • Resiliency & Robustness Suite: Validated against live GCP test instances (helical-zone-771) across simulated TCP socket hangs, sub-millisecond deadline starvation, and post-recovery downstream message delivery.

@torreypayne force-pushed the pubsub-streaming-keepalive branch from f495005 to cf5df9b on June 22, 2026 23:49