Found via ClickGap automated review. Please close or comment if this is incorrect or needs adjustment.
Retrospective finding from a historical scan of PR #76467 (merged 2025-04-11). Confirmed on current codebase — close with a note if already fixed.
Describe what's wrong
If any exception is thrown during MergeTreeData::refreshDataParts (e.g., temporary disk unavailability or a network timeout), the background refresh task is never rescheduled. It stops permanently, and the readonly table stays stale until the server is restarted.
Root cause: in MergeTreeData.cpp:2717-2720, the catch block is missing a call to refresh_parts_task->scheduleAfter(interval_milliseconds). Compare with refreshStatistics at lines 2759-2766, which reschedules correctly.
Why we believe this is a bug: MergeTreeData::refreshDataParts (MergeTreeData.cpp:2621) is written as a function-try-block. The scheduleAfter call is at line 2715, inside the try block, so it is skipped whenever an earlier statement throws. The catch block (lines 2717-2720) only logs the error and does NOT reschedule the task, unlike refreshStatistics (lines 2759-2766), which reschedules in its catch block.
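The failure mode described above can be illustrated with a small standalone model. This is only a sketch and not the ClickHouse code: FakeTask, refreshLogOnly, and countRuns are hypothetical names, and the "scheduler" is a plain loop that runs the task while a next run is pending.

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>

// Hypothetical stand-in for the background task: it only remembers whether
// another run has been requested.
struct FakeTask
{
    bool pending = true;  // the first run is scheduled at startup
    void scheduleAfter(int /*interval_ms*/) { pending = true; }
};

// Shape of the reported handler: the reschedule is the last statement of the
// try block, and the catch block only logs.
void refreshLogOnly(FakeTask & task, bool fail)
try
{
    if (fail)
        throw std::runtime_error("disk temporarily unavailable");
    task.scheduleAfter(1000);
}
catch (const std::exception & e)
{
    std::cerr << "refresh failed: " << e.what() << '\n';
    // Nothing reschedules the task on this path.
}

// Drives the task for up to max_ticks; only the run numbered failing_run
// throws (pass 0 for "never fail"). Returns how many runs executed.
int countRuns(int max_ticks, int failing_run)
{
    assert(max_ticks >= 0);
    FakeTask task;
    int runs = 0;
    for (int tick = 0; tick < max_ticks && task.pending; ++tick)
    {
        task.pending = false;  // the scheduler consumes the pending run
        ++runs;
        refreshLogOnly(task, runs == failing_run);
    }
    return runs;
}
```

In this model, a single failure on the third run ends the chain: countRuns(10, 3) executes three runs and then stops, while countRuns(10, 0) executes all ten.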
Affected locations:
src/Storages/MergeTree/MergeTreeData.cpp:2717 — catch block of refreshDataParts missing scheduleAfter
Impact: A single transient error (network hiccup, temporary disk unavailability) permanently disables background data refresh for readonly MergeTree tables. The table becomes stale until server restart.
Does it reproduce on most recent release?
Likely yes — see testability note in additional context.
How to reproduce
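This is a code-level bug identified through source analysis; no runtime reproduction is available. Any exception thrown inside refreshDataParts on a readonly MergeTree table should trigger it. See root cause and affected locations above for the specific code paths involved.
Expected behavior
After an exception in refreshDataParts, the error should be logged and the task rescheduled with scheduleAfter, so that the next refresh still runs and the table recovers from transient errors without a server restart.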
Error message and/or stacktrace
See root cause description above.
Additional context
Open risks:
- Any code path that throws in refreshDataParts triggers this: disk iteration, part loading, and part commit can all throw.
Suggested fix: Add refresh_parts_task->scheduleAfter(interval_milliseconds) to the catch block, matching the pattern used in refreshStatistics at lines 2762-2763.
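A minimal sketch of the suggested shape follows, using a hypothetical FakeTask in place of the real task handle. It is not the actual patch; it only shows that rescheduling in the handler keeps the task alive on both the success and the failure path.

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>

// Hypothetical stand-in for the background task; counts reschedule requests.
struct FakeTask
{
    int reschedules = 0;
    void scheduleAfter(int interval_ms)
    {
        assert(interval_ms > 0);
        ++reschedules;
    }
};

// The handler logs and then reschedules, so a failed run does not end the
// chain of background refreshes.
void refreshWithReschedule(FakeTask & task, bool fail, int interval_ms)
try
{
    if (fail)
        throw std::runtime_error("disk temporarily unavailable");
    task.scheduleAfter(interval_ms);
}
catch (const std::exception & e)
{
    std::cerr << "refresh failed: " << e.what() << '\n';
    task.scheduleAfter(interval_ms);  // keep the task alive after a failure
}
```

With this shape, every call requests exactly one further run regardless of whether the body threw. Whether the real fix should reuse the same interval after a failure or apply a backoff is a design choice this sketch does not address.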
Analysis details: Confidence HIGH | Severity P1 | Testability: THEORETICAL
Found during automated review of PR #76467.
ClickGapAI · Confidence: HIGH · Severity: P1 · Finding: h_pr76467_001