RedisSentinel: native multi-host failover (refs #2819) #2832

Draft
nikamodis wants to merge 4 commits into phpredis:develop from nikamodis:feature/sentinel-multihost

@nikamodis

Summary

Adds native multi-host support to RedisSentinel via a new hosts constructor option, eliminating the need for userland workarounds such as namoshek/laravel-redis-sentinel for the Sentinel discovery portion of the problem.

Partially addresses #2819. Related prior discussion: #2132.

API

$sentinel = new RedisSentinel([
    'hosts' => [
        ['host' => '10.0.0.1', 'port' => 26379],
        ['host' => '10.0.0.2', 'port' => 26379],
        ['host' => '10.0.0.3', 'port' => 26379],
    ],
    'connectTimeout' => 0.1,
    'auth' => 'secret',
]);

// Auto-falls-back to the next Sentinel on network failure.
$master = $sentinel->getMasterAddrByName('mymaster');

Semantics

  • Zero BC risk. When hosts is not provided, RedisSentinel behaves exactly as before.
  • Sticky connection. The first reachable host is used for all subsequent calls. sentinel_current_host_idx advances only on failure.
  • One command-level retry per call. On a network failure, sentinel_try_next_host runs a bounded linear scan over the remaining hosts, and the command is re-executed at most once.
  • Skipped hosts stay skipped for the instance lifetime.
  • Network error detection inspects RedisSock state (status, stream) — no exception message parsing.
  • Validation errors (empty hosts, missing host, wrong types, >1024 entries) throw RedisException, consistent with the rest of phpredis.
  • Exhaustion throws RedisException mentioning the host count and last attempted endpoint.
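
The sticky and bounded-scan semantics above can be modelled in a few lines of plain C. This is only an illustrative sketch, not code from this PR: host_entry, host_list, try_next_host and run_command are hypothetical names, and the reachable flag stands in for a real connect attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one Sentinel endpoint; not the real phpredis struct. */
typedef struct {
    const char *host;
    int port;
    bool reachable;   /* stand-in for the outcome of a real connect attempt */
} host_entry;

typedef struct {
    host_entry *hosts;
    size_t count;
    size_t current;   /* sticky index; it only ever moves forward */
} host_list;

/* Bounded linear scan: step past the failed host and return the next
 * reachable one, or NULL once the list is exhausted. Hosts behind
 * `current` are never revisited. */
static host_entry *try_next_host(host_list *l) {
    assert(l != NULL && l->hosts != NULL);
    while (l->current + 1 < l->count) {
        l->current++;
        if (l->hosts[l->current].reachable) {
            return &l->hosts[l->current];
        }
    }
    l->current = l->count;   /* mark the list as exhausted */
    return NULL;
}

/* Serve one call: stay on the sticky host while it works; otherwise fail
 * over once. Returns the index of the host that served the call, or -1
 * when every remaining host is down (the PR throws RedisException there). */
static int run_command(host_list *l) {
    if (l->current < l->count && l->hosts[l->current].reachable) {
        return (int)l->current;
    }
    return try_next_host(l) ? (int)l->current : -1;
}
```

In this model, a host that comes back after being skipped is still ignored, which matches the "skipped hosts stay skipped" rule; a rehydration policy would have to reset current explicitly.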

Not in this PR (deferred)

To keep scope reviewable, the following are deferred to a follow-up:

  • Auto-reconnect of Redis (not RedisSentinel) connections on master failover
  • Configurable retry policy (retryAttempts, retryDelay)
  • Command-level retry on the Redis class driven by Sentinel re-query

The supporting infrastructure introduced here (RedisSock->sentinel_hosts, sentinel_try_next_host) is reusable for that follow-up.

Implementation notes

  • Host list is stored on RedisSock as three new fields (sentinel_hosts, sentinel_hosts_count, sentinel_current_host_idx). sentinel_host_entry is forward-declared in common.h with the full struct in sentinel_library.h. This avoids touching shared layout in redis_object / Redis / RedisCluster.
  • All 11 RedisSentinel methods wrap their REDIS_PROCESS_KW_CMD call in a SENTINEL_METHOD macro that implements the retry. No-op when hosts == NULL (single-host usage is identical to today).
  • EG(exception) is cleared between per-host connection attempts so that a failing intermediate host does not leak exception state into a successful retry.
  • Built and tested against PHP 8.1–8.4; follows the existing REDIS_THROW_EXCEPTION / zend_throw_exception_ex(redis_exception_ce, …) convention rather than zend_value_error / zend_type_error (which are PHP 8+ only).
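
For reviewers who want a mental model of the retry wrapper, here is a rough standalone sketch. It is not the actual SENTINEL_METHOD macro: sock_model, WITH_ONE_RETRY, try_next and fake_cmd are hypothetical, the pending_exception flag merely stands in for EG(exception), and the next host is assumed to connect on the first attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { SOCK_CONNECTED, SOCK_FAILED } sock_status;

/* Hypothetical, heavily reduced stand-in for RedisSock. */
typedef struct {
    sock_status status;
    void *stream;             /* NULL once the stream has been torn down */
    size_t hosts_count;       /* 0 models hosts == NULL (single-host mode) */
    size_t current_host_idx;
    bool pending_exception;   /* stand-in for EG(exception) */
} sock_model;

/* Network failure is inferred from socket state, not from message text. */
static bool is_network_error(const sock_model *s) {
    return s->status != SOCK_CONNECTED || s->stream == NULL;
}

/* Move to the next host if one remains; assumed to connect successfully. */
static bool try_next(sock_model *s) {
    assert(s != NULL);
    if (s->current_host_idx + 1 >= s->hosts_count) {
        return false;
    }
    s->current_host_idx++;
    s->status = SOCK_CONNECTED;
    s->stream = s;            /* any non-NULL value marks a live stream */
    return true;
}

static int calls;             /* counts command executions for the demo */

/* Fake command that breaks the connection when run against `bad_host`. */
static void fake_cmd(sock_model *s, size_t bad_host) {
    calls++;
    if (s->current_host_idx == bad_host) {
        s->status = SOCK_FAILED;
        s->stream = NULL;
        s->pending_exception = true;
    }
}

/* Run `call`; on a network error in multi-host mode, fail over, clear the
 * stale exception, and re-run `call` exactly once. No-op wrapper otherwise. */
#define WITH_ONE_RETRY(sock, call)                                   \
    do {                                                             \
        (call);                                                      \
        if ((sock)->hosts_count > 0 && is_network_error(sock) &&     \
            try_next(sock)) {                                        \
            (sock)->pending_exception = false;                       \
            (call);                                                  \
        }                                                            \
    } while (0)
```

With hosts_count set to 0 the wrapper executes the command once and leaves any exception in place, which is the single-host behaviour the bullet above describes.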

Testing

  • 17 integration tests in tests/RedisSentinelMultiHostTest.php (covers construction, fallback, validation, sticky behavior, bounded retries, DoS guard).
  • Docker-compose cluster in tests/sentinel-multihost/ (1 master + 2 replicas + 3 Sentinels). See the README.md in that directory for local usage.
  • New CI job sentinel-multihost in .github/workflows/ci.yml runs the integration tests across PHP 8.1–8.4.
  • Tests mark themselves skipped when the Sentinel env is unreachable, so existing local runs are unaffected.
  • Valgrind verified: zero new leaks across 200 iterations × all paths (baseline leak count unchanged).

Open questions

Happy to adjust any of these based on maintainer preference:

  1. Option naming: hosts as a top-level option vs. allowing host to accept string|array. I went with a separate option to avoid type ambiguity for static analyzers; the alternative is a few-line change if preferred.
  2. Retry configurability — currently "exhaust remaining list, at most one retry per command call". Should retryAttempts / retryDelay be added in this PR, or is that more appropriate for the Phase 2 follow-up?
  3. Host rehydration — skipped hosts are permanently skipped for the instance lifetime. Periodic revisit (or reset-on-demand) could be added if real usage shows demand.
  4. Phase 2 shape — should Redis auto-reconnect via Sentinel live on the existing Redis class or in a new wrapper (e.g. RedisSentinelPool)? Not in scope here, but worth flagging early for discussion.

Opened as Draft to invite design discussion before marking ready for review.

modestas_sienauskas added 4 commits April 22, 2026 11:37
Adds a 'hosts' array option to RedisSentinel::__construct so the client
transparently falls back to the next Sentinel endpoint on network failure,
eliminating the need for userland workarounds in HA deployments.

  $sentinel = new RedisSentinel(['hosts' => [
      ['host' => '10.0.0.1', 'port' => 26379],
      ['host' => '10.0.0.2', 'port' => 26379],
      ['host' => '10.0.0.3', 'port' => 26379],
  ]]);

Semantics:
- Sticky connection: the first reachable host is used until it fails.
- One command-level retry per call; a bounded linear scan inside
  sentinel_try_next_host iterates remaining hosts on a failed attempt.
- Skipped hosts are not revisited for the instance lifetime.
- Network error detection inspects RedisSock state (status, stream),
  not exception message strings.
- Zero BC risk: when 'hosts' is absent, behavior is identical to today.

Host list is stored on RedisSock as three new fields; sentinel_host_entry
is forward-declared in common.h with the full struct in sentinel_library.h
so redis_object / Redis / RedisCluster layouts are untouched.

All 11 RedisSentinel methods wrap their REDIS_PROCESS_KW_CMD call in a
new SENTINEL_METHOD macro that implements the retry. No-op on single-host.

Refs phpredis#2819

- Stub phpdoc on __construct describes the 'hosts' option alongside
  the existing single-host parameters.
- sentinel.md gains a 'Multi-host support' section covering API,
  semantics (sticky, bounded retry, no rehydration), and error
  handling (RedisException on validation and exhaustion).
- Regenerated arginfo reflects only the new stub hash; method
  signatures are unchanged.

Refs phpredis#2819

Adds 17 integration tests (tests/RedisSentinelMultiHostTest.php) covering:
- Construction with 'hosts' array
- Connect-time fallback through dead hosts
- Exhaustion throwing RedisException with host count in message
- Validation errors (empty, missing 'host' key, wrong types, oversized)
- DoS guard (>1024 hosts rejected)
- Default port (26379) when omitted
- Single-host BC path unchanged
- Auth propagation (skipped unless SENTINEL_AUTH_PASS is set)
- Sticky behavior verified via call-time comparison
- Bounded retry elapsed-time check
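
As a rough illustration of the validation cases listed above, the checks could look like the following in plain C. This is a hedged sketch and not the PR's code: host_spec, hosts_status and validate_hosts are hypothetical names, a port of 0 is used here to mean "omitted", and the 1..65535 port range check is an assumption standing in for the PHP-level type checks.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOSTS 1024       /* limit quoted in this PR */
#define DEFAULT_PORT 26379   /* default quoted in this PR */

/* Hypothetical parsed form of one 'hosts' entry; port 0 means "omitted". */
typedef struct {
    const char *host;
    int port;
} host_spec;

typedef enum {
    HOSTS_OK,
    ERR_EMPTY,
    ERR_TOO_MANY,
    ERR_MISSING_HOST,
    ERR_BAD_PORT
} hosts_status;

/* Validate the list and fill in default ports. The real extension would
 * throw RedisException instead of returning an error code. */
static hosts_status validate_hosts(host_spec *hosts, size_t n) {
    assert(hosts != NULL || n == 0);   /* a NULL list with n > 0 is a caller bug */
    if (n == 0) {
        return ERR_EMPTY;
    }
    if (n > MAX_HOSTS) {
        return ERR_TOO_MANY;           /* DoS guard */
    }
    for (size_t i = 0; i < n; i++) {
        if (hosts[i].host == NULL || hosts[i].host[0] == '\0') {
            return ERR_MISSING_HOST;
        }
        if (hosts[i].port == 0) {
            hosts[i].port = DEFAULT_PORT;
        } else if (hosts[i].port < 1 || hosts[i].port > 65535) {
            return ERR_BAD_PORT;       /* assumed range check */
        }
    }
    return HOSTS_OK;
}
```

Checking the count before walking the entries keeps the oversized case cheap, which is the point of a guard like this.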

Tests mark themselves skipped when the local Sentinel env isn't
reachable, so existing local test runs are unaffected.

The env itself (tests/sentinel-multihost/) is a minimal docker-compose
cluster with 1 master + 2 replicas + 3 Sentinels on ports 26379, 26380, and 26381.

Refs phpredis#2819

Runs the RedisSentinelMultiHostTest integration tests across PHP 8.1-8.4
against the docker-compose cluster in tests/sentinel-multihost/.

- docker compose up, wait for Sentinels via 'nc -z'
- phpize + configure --enable-redis + make
- php tests/TestRedis.php --class redissentinelmultihost
- Dumps docker logs on failure for triage
- Tears down the cluster in all outcomes

Refs phpredis#2819