fix: prevent replicas from restoring old-timeline WAL segments #10394

IgorOhrimenko wants to merge 1 commit into cloudnative-pg:main
Conversation
❗ By default, the pull request is configured to backport to all release branches.
After a switchover or failover, the WAL archive still contains segments from previous timelines. When a replica restarts with existing PVC data, its `restore_command` can fetch these old-timeline WAL segments from the archive, causing the replica's timeline history to diverge from the current primary. This results in either:

- CrashLoopBackOff: "requested timeline N is not a child of this server's history"
- Replica stuck in "Standby (file based)" mode, unable to stream

The existing `validateTimelineHistoryFile` only checks `.history` files. This commit adds `validateWALSegmentTimeline`, which also rejects regular WAL segments whose timeline is older than the cluster's current timeline, for replicas in established clusters. The check is skipped when `CurrentPrimary` is not set (during bootstrap or PITR recovery) to allow fetching WAL from any timeline.

Closes: cloudnative-pg#4990
Signed-off-by: Igor Ohrimenko <igor.ohrimenko@travelata.ru>
Hi @IgorOhrimenko. What if I request a particular timeline for the recovery? I believe that we should let Postgres handle that process, not CNPG. What are your thoughts?
Hi @gbartolini, thanks for raising this important architectural concern.
The reproduction script in this PR shows the issue occurs consistently. We've already patched our production clusters with this fix (building custom binaries/Docker images), and it resolves the problem immediately. Should we open a PostgreSQL issue to track this?
After a switchover or failover, the WAL archive (S3/object storage) still contains segments from previous timelines. When a replica restarts with existing PVC data (e.g. after a rolling restart triggered by a parameter change), its `restore_command` can fetch old-timeline WAL segments from the archive before streaming replication reconnects to the new primary. This causes the replica's timeline history to diverge from the current primary, resulting in either:

- CrashLoopBackOff: `requested timeline N is not a child of this server's history`
- Stuck in "Standby (file based)" mode
Root cause
`validateTimelineHistoryFile()` in `walrestore/cmd.go` validates `.history` files but does not check regular WAL segments. A replica can successfully download `000000010000000000000048` (timeline 1) from the archive when the primary is already on timeline 3.

Fix
Added `validateWALSegmentTimeline()`, which rejects WAL segments whose timeline is older than the cluster's current timeline, for replica instances in established clusters. This forces PostgreSQL to fall back to streaming replication from the current primary instead of consuming stale WAL from the archive.

The check is skipped when `CurrentPrimary` is not set (during bootstrap or PITR recovery) to allow fetching WAL from any timeline.

Reproduction
Minimal reproducer using kind + MinIO: reproduce.sh

- Modify `shared_buffers` to trigger a rolling restart with switchover
- `restore_command` fetches old-timeline WAL from MinIO

In the provided test environment (kind, single-node, local MinIO, synchronous replication), the bug reproduces consistently. In multi-node production clusters the issue is intermittent, depending on the timing between switchover completion and replica reconnection.
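When reproducing, it helps to confirm which timeline a fetched segment belongs to. Regular WAL segment names are 24 hex digits, and the first 8 encode the timeline ID, so the timeline can be decoded directly in the shell (the segment name below is the one quoted in this report):

```shell
# Decode the timeline ID from a WAL segment name.
# Names are 24 hex digits: TTTTTTTT LLLLLLLL SSSSSSSS,
# where the first 8 digits are the timeline ID.
seg=000000010000000000000048
tli=$((16#${seg:0:8}))
echo "segment $seg belongs to timeline $tli"   # timeline 1
```

Comparing this value against the primary's current timeline (e.g. as printed by `pg_controldata` under "Latest checkpoint's TimeLineID") shows whether a segment in the archive predates the switchover.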
Without the fix: the replica enters CrashLoopBackOff or "Standby (file based)".

With the fix: the old-timeline WAL is rejected with a warning log, and the replica falls back to streaming replication.
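The rejection rule described above can be sketched as a standalone program. This is an illustrative sketch, not the actual CNPG code: the helper names `segmentTimeline` and `isStaleSegment` are hypothetical, and only the WAL naming convention (first 8 hex digits of a segment name encode the timeline ID) is taken as given.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// walSegmentRe matches a regular WAL segment name: 24 hex characters,
// the first 8 of which encode the timeline ID. History files
// (e.g. "00000002.history") do not match.
var walSegmentRe = regexp.MustCompile(`^([0-9A-F]{8})[0-9A-F]{16}$`)

// segmentTimeline extracts the timeline ID from a WAL segment name.
// The second return value is false when the name is not a regular
// WAL segment.
func segmentTimeline(name string) (uint32, bool) {
	m := walSegmentRe.FindStringSubmatch(name)
	if m == nil {
		return 0, false
	}
	tli, err := strconv.ParseUint(m[1], 16, 32)
	if err != nil {
		return 0, false
	}
	return uint32(tli), true
}

// isStaleSegment sketches the rejection rule: a replica in an
// established cluster refuses segments from timelines older than the
// cluster's current timeline; when no current primary is known
// (bootstrap or PITR recovery), any timeline is allowed.
func isStaleSegment(name string, currentTimeline uint32, hasCurrentPrimary bool) bool {
	if !hasCurrentPrimary {
		return false // bootstrap or PITR: allow any timeline
	}
	tli, ok := segmentTimeline(name)
	if !ok {
		return false // not a regular WAL segment; handled elsewhere
	}
	return tli < currentTimeline
}

func main() {
	fmt.Println(isStaleSegment("000000010000000000000048", 3, true)) // true: timeline 1 < 3
	fmt.Println(isStaleSegment("000000030000000000000048", 3, true)) // false: current timeline
	fmt.Println(isStaleSegment("00000002.history", 3, true))         // false: not a segment
}
```

Note that the comparison is strict (`<`), so segments on the current timeline are still served from the archive; only older-timeline segments are refused, which is what pushes PostgreSQL toward streaming from the new primary.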
Testing
- Unit tests for `validateWALSegmentTimeline` covering all branches

Closes #4990
Related: #4188, #3344