refreshDataParts catch block missing scheduleAfter causes permanent task death on transient error

_Found via ClickGap automated review. Please close or comment if this is incorrect or needs adjustment._

_Retrospective finding from a historical scan of [PR #76467](https://github.com/ClickHouse/ClickHouse/pull/76467) (merged 2025-04-11). Confirmed on current codebase — close with a note if already fixed._

### Describe what's wrong

If any exception is thrown during `MergeTreeData::refreshDataParts` (e.g., temporary disk unavailability or a network timeout), the background refresh task stops permanently and never runs again, leaving the readonly table stale until the server is restarted.

**Root cause:** `MergeTreeData.cpp:2717-2720`: the catch block is missing `refresh_parts_task->scheduleAfter(interval_milliseconds)`. Compare with `refreshStatistics` at lines 2759-2766, which reschedules correctly.

**Why we believe this is a bug:** `MergeTreeData::refreshDataParts` (`MergeTreeData.cpp:2621`) wraps its whole body in a function-level try/catch. The `scheduleAfter` call is at line 2715, inside the try block, so it is skipped whenever an earlier statement throws. The catch block (lines 2717-2720) only logs the error and does not reschedule the task, unlike `refreshStatistics` (lines 2759-2766), which reschedules in its catch block.
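To make the failure mode concrete, here is a minimal, self-contained model of a self-rescheduling task. It is a sketch only: `ToyTask`, `refreshOnce`, and `drive` are hypothetical names invented for illustration and are not ClickHouse classes; the only thing taken from the report is the shape (reschedule inside the try block, log-only catch block).

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical miniature of a background task that must re-queue itself.
struct ToyTask
{
    bool pending = true; // true while a future run is queued
    int runs = 0;        // how many times the task body has executed
    void scheduleAfter(unsigned /*interval_ms*/) { pending = true; }
};

// Same shape as the reported code: the reschedule lives inside the try block.
void refreshOnce(ToyTask & task, bool inject_failure)
try
{
    ++task.runs;
    if (inject_failure)
        throw std::runtime_error("transient error");
    task.scheduleAfter(1000); // skipped whenever the line above throws
}
catch (const std::exception &)
{
    // Log-only handler: nothing re-queues the task here.
}

// Toy scheduler loop: runs the task while a run is queued.
// A failure is injected on tick `fail_at` (use -1 for no failure).
int drive(ToyTask & task, int max_ticks, int fail_at)
{
    for (int tick = 0; tick < max_ticks && task.pending; ++tick)
    {
        task.pending = false; // the scheduler consumes the queued run
        refreshOnce(task, tick == fail_at);
    }
    return task.runs;
}
```

In this model, one injected failure ends the chain: the run that throws is the last one, even though the loop had ticks left.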

**Affected locations:**
- `src/Storages/MergeTree/MergeTreeData.cpp:2717` — catch block of refreshDataParts missing scheduleAfter

**Impact:** A single transient error (network hiccup, temporary disk unavailability) permanently disables background data refresh for readonly MergeTree tables. The table becomes stale until server restart.

### Does it reproduce on most recent release?

Likely yes — see testability note in additional context.

### How to reproduce

_This is a code-level bug identified through source analysis. See root cause and affected locations above for the specific code paths involved._

### Expected behavior

When `refreshDataParts` throws, the task should log the error and still reschedule itself, so the next refresh runs after the configured interval.

### Error message and/or stacktrace

_See root cause description above._

### Additional context

**Open risks:**
- Any code path that throws in `refreshDataParts` triggers this: disk iteration, part loading, and part commit can all throw

**Suggested fix:** Add `refresh_parts_task->scheduleAfter(interval_milliseconds)` to the catch block, matching the pattern used in `refreshStatistics` at lines 2762-2763.
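A sketch of the suggested shape, written against a mock rather than the real code: `MockTask`, `refreshBefore`, and `refreshAfter` are hypothetical stand-ins, and the logging call is replaced by `std::cerr`. Only the placement of `scheduleAfter` reflects the proposal above.

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>

// Hypothetical stand-in for the background task handle.
struct MockTask
{
    int scheduled = 0; // counts calls to scheduleAfter
    void scheduleAfter(unsigned /*interval_ms*/) { ++scheduled; }
};

// Reported shape: reschedule only on the success path.
void refreshBefore(MockTask & task, bool fail, unsigned interval_ms)
try
{
    if (fail)
        throw std::runtime_error("temporary disk unavailability");
    task.scheduleAfter(interval_ms);
}
catch (const std::exception & e)
{
    std::cerr << "refresh failed: " << e.what() << '\n';
    // Missing: task.scheduleAfter(interval_ms);
}

// Proposed shape: the catch block also reschedules.
void refreshAfter(MockTask & task, bool fail, unsigned interval_ms)
try
{
    if (fail)
        throw std::runtime_error("temporary disk unavailability");
    task.scheduleAfter(interval_ms);
}
catch (const std::exception & e)
{
    std::cerr << "refresh failed: " << e.what() << '\n';
    task.scheduleAfter(interval_ms); // keep the task alive after an error
}
```

With this placement the task is rescheduled exactly once per run on both the success and the failure path, so a transient error costs one missed refresh instead of all future ones.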

**Analysis details:** Confidence HIGH | Severity P1 | Testability: `THEORETICAL`

Found during automated review of [PR #76467](https://github.com/ClickHouse/ClickHouse/pull/76467).

---
_ClickGapAI · Confidence: HIGH · Severity: P1 · Finding: `h_pr76467_001`_


refreshDataParts catch block missing scheduleAfter causes permanent task death on transient error #102045
