
[spec] Background refresh for st.cache_data and st.cache_resource#14690

Open
lukasmasuch wants to merge 3 commits into develop from lukasmasuch/cache-bg-refresh

Conversation


@lukasmasuch lukasmasuch commented Apr 8, 2026

Describe your changes

Spec renderer: https://issues.streamlit.app/spec_renderer?pr=14690

Adds a product spec for background refresh of st.cache_data and st.cache_resource.

Key proposal: Add a refresh parameter that enables expired cache entries to be refreshed in the background while returning stale data immediately. This eliminates blocking waits for users hitting expired cache entries.

```python
@st.cache_data(ttl="1h", refresh="background")
def fetch_stock_prices(symbol: str):
    return expensive_api_call(symbol)
```

The spec covers:

  • API design with refresh="foreground" (default) vs refresh="background"
  • Lazy background refresh behavior (on-access, not proactive)
  • Alternative solutions considered (eager refresh, scheduled refresh, cache warming)
  • API naming alternatives
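
The proposed stale-while-revalidate-on-access behavior can be sketched roughly as follows. This is a hypothetical illustration of the semantics, not the actual Streamlit internals: the names `get_or_refresh`, `_store`, and `_in_flight` are invented here, and the real implementation would live inside the caching machinery.

```python
import time
import threading

# Hypothetical sketch of lazy, on-access background refresh.
# `_store` maps a cache key to (value, stored_at); `_in_flight`
# tracks keys whose refresh is already running (single-flight).
_store: dict = {}
_in_flight: set = set()
_lock = threading.Lock()


def get_or_refresh(key, compute, ttl_seconds: float):
    with _lock:
        entry = _store.get(key)
        if entry is not None:
            value, stored_at = entry
            if time.monotonic() - stored_at < ttl_seconds:
                return value  # fresh hit
            # Expired: return stale data immediately and kick off
            # at most one background refresh for this key.
            if key not in _in_flight:
                _in_flight.add(key)
                threading.Thread(
                    target=_refresh, args=(key, compute), daemon=True
                ).start()
            return value
    # Cache miss: compute in the foreground (blocking), as today.
    value = compute()
    with _lock:
        _store[key] = (value, time.monotonic())
    return value


def _refresh(key, compute):
    try:
        value = compute()
        with _lock:
            _store[key] = (value, time.monotonic())
    except Exception:
        with _lock:
            _store.pop(key, None)  # per the spec: evict on failure
    finally:
        with _lock:
            _in_flight.discard(key)
```

Note the two distinct paths: a true cache miss still blocks (there is no stale value to serve), while an expired entry never blocks the caller.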

GitHub Issue Link (if applicable)

Testing Plan

  • No tests needed (spec document only)
Agent metrics

| Type | Name | Count |
|----------|--------------|-------|
| skill    | writing-spec | 1     |
| subagent | fixing-pr    | 2     |

Adds a product spec for background refresh of st.cache_data and
st.cache_resource. Proposes a new 'refresh' parameter that enables
expired cache entries to be refreshed in the background while
returning stale data immediately.

Addresses: #5871, #11050

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 8, 2026 16:12
@lukasmasuch lukasmasuch added change:spec Issue contains a product or tech spec ai-review If applied to PR or issue will run AI review workflow labels Apr 8, 2026

snyk-io Bot commented Apr 8, 2026

Snyk checks have passed. No issues have been found so far.

| Scan Engine | Critical | High | Medium | Low | Total |
|----------------------|----------|------|--------|-----|----------|
| Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| Licenses | 0 | 0 | 0 | 0 | 0 issues |



github-actions Bot commented Apr 8, 2026

Spec PR Validation

✅ All checks passed!

@lukasmasuch lukasmasuch changed the title [spec] Background refresh for st.cache_data and st.cache_resource [spec] Background refresh for st.cache_data and st.cache_resource Apr 8, 2026

Copilot AI left a comment


Pull request overview

Adds a new product spec proposing background refresh for expired st.cache_data / st.cache_resource entries via a refresh parameter, aiming to avoid blocking users when TTL has elapsed.

Changes:

  • Introduces a product spec describing refresh="foreground" (default) vs refresh="background" behavior.
  • Documents intended semantics (stale-while-revalidate on access), validation rules, and error-handling expectations.
  • Captures API/solution alternatives and out-of-scope follow-ups.

@github-actions github-actions Bot removed the ai-review If applied to PR or issue will run AI review workflow label Apr 8, 2026

@github-actions github-actions Bot left a comment


Summary

This PR adds a product spec for background refresh of st.cache_data and st.cache_resource. The spec proposes a new refresh parameter ("foreground" | "background") that enables expired cache entries to be refreshed in a background thread while returning stale data immediately, eliminating blocking waits for users hitting expired entries. The change consists of a single markdown file (specs/2026-04-08-cache-background-refresh/product-spec.md) — no code, tests, or security-sensitive changes.

All three reviewers (claude-4.6-opus-high-thinking, gemini-3.1-pro, gpt-5.3-codex-high) agreed that the spec is well-structured, follows established conventions, and clearly documents the problem, proposed API, behavior, validation, examples, and alternatives. Two reviewers approved outright; one requested minor clarifications on implementation-critical details before approval.

Code Quality

N/A — this is a spec document only with no code changes. The markdown is well-formatted, follows the established template and conventions in the specs/ directory, and aligns with Streamlit's API design principles (string literals over booleans, sensible defaults, progressive disclosure).

Test Coverage

No tests are needed for this spec-only PR. All three reviewers agreed on this. For the eventual implementation PR, coverage should include:

  • Python unit tests for TTL expiration, refresh deduplication, failure cleanup, and parameter validation.
  • E2E tests validating user-visible non-blocking behavior after expiration.

Backwards Compatibility

All three reviewers confirmed the design is backwards compatible. The refresh parameter defaults to "foreground", preserving existing behavior. No existing API signatures are modified.

Security & Risk

No security concerns — this is a documentation-only change. All reviewers agreed. The main risk identified (by gpt-5.3-codex-high and claude-4.6-opus-high-thinking) is implementation-time: cache refresh concurrency, resource lifecycle handling for cache_resource, and thread pool management. These are appropriately deferred to the tech spec and implementation phases.

External test recommendation

  • Recommend external_test: No
  • Triggered categories: None
  • Evidence:
    • specs/2026-04-08-cache-background-refresh/product-spec.md: New markdown file only; no code, routing, auth, WebSocket, embedding, asset serving, CORS, SiS runtime, storage, or security header changes.
  • Suggested external_test focus areas: None at this stage. Re-assess on the implementation PR if it touches cache runtime behavior across hosted/embedded environments.
  • Confidence: High (all three reviewers agreed unanimously)
  • Assumptions and gaps: None. This is a spec-only PR.

Accessibility

N/A — no frontend changes. All reviewers agreed.

Recommendations

  1. Clarify cache_resource replacement semantics (raised by 2 of 3 reviewers): The spec should explicitly address the lifecycle of replaced resources when refresh="background" is used with cache_resource. Callers may hold live references to the old resource object at the moment it's swapped. While this is consistent with current foreground TTL eviction behavior, background refresh makes concurrent stale-reference usage more likely. This can be addressed here with a brief note or deferred explicitly to the tech spec.

  2. Document thread pool bounds and backpressure (raised by 2 of 3 reviewers): The spec mentions concurrent.futures but doesn't discuss thread pool sizing. Many distinct cache keys expiring simultaneously could spawn many background threads. The spec should note whether a shared bounded thread pool is planned, and what happens when the pool is exhausted (queue? fall back to foreground?).

  3. Verify st.App lifecycle hooks reference (raised by 1 reviewer): Line 317 references st.App lifecycle hooks (on_app_start) for cache warming, but this may not be part of the current public API. The reference should either link to shipped documentation or be rephrased as a future capability.

  4. Confirm SiS/Snowflake threading compatibility (raised by 1 reviewer): The checklist claims SiS compatibility via concurrent.futures, but some managed runtimes may restrict thread creation. A brief note confirming compatibility or documenting graceful degradation would strengthen the claim.

  5. Consider adding a background failure retry/backoff note (raised by 1 reviewer): On background refresh failure, the expired entry is evicted and the next caller gets a foreground cache miss. Repeated failures (e.g., a flaky API) could cause a "thundering herd" of foreground refreshes. A note about whether any backoff/cooldown mechanism is planned would be valuable.

Items 1-2 had cross-reviewer agreement and are the highest priority. Items 3-5 are single-reviewer suggestions that would improve spec completeness. All are addressable as spec amendments or can be explicitly deferred to the tech spec — none are blocking for a product spec PR.

Verdict

APPROVED: Well-written product spec that follows established conventions, proposes a clean and backwards-compatible API, and thoroughly documents alternatives. Two of three reviewers approved; the third requested minor clarifications that are appropriate for a tech spec follow-up rather than blockers on a product spec. No critical issues were identified by any reviewer.


Consolidated from reviews by claude-4.6-opus-high-thinking, gemini-3.1-pro, and gpt-5.3-codex-high. Consolidation by claude-4.6-opus-high-thinking.

This review also includes 5 inline comment(s) on specific code lines.

lukasmasuch and others added 2 commits April 8, 2026 18:33
- Clarify lazy TTL expiration semantics (entries treated as expired on access)
- Add note on cache_resource replacement semantics and object lifecycle
- Document bounded ThreadPoolExecutor concurrency for background refreshes
- Add deduplication scope clarification (per-process in multi-worker setups)
- Document persist mode incompatibility with background refresh
- Fix st.App lifecycle hooks reference (doesn't exist in current API)
- Update checklist table header to match spec template
- Add SiS/Snowflake threading compatibility note with graceful fallback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clarify that parameter usage is tracked via gather_metrics decorator.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@lukasmasuch lukasmasuch added the ai-review If applied to PR or issue will run AI review workflow label Apr 8, 2026
@github-actions github-actions Bot removed the ai-review If applied to PR or issue will run AI review workflow label Apr 8, 2026

@github-actions github-actions Bot left a comment


Summary

This PR adds a product spec for background refresh of st.cache_data and st.cache_resource (specs/2026-04-08-cache-background-refresh/product-spec.md). It proposes a new refresh parameter with values "foreground" (default, preserving current behavior) and "background" (returns stale data immediately on TTL expiry while refreshing in a background thread). No code, frontend, protobuf, or test changes are included — this is a spec document only.

All three reviewers (claude-4.6-opus-high-thinking, gemini-3.1-pro, gpt-5.3-codex-high) unanimously approved this PR with no blocking issues.

Code Quality

The spec is well-structured and follows the established template in specs/YYYY-MM-DD-template/product-spec.md. All reviewers agreed it aligns with Streamlit's API design principles, particularly #16 (Prefer Enums Over Booleans), #4 (Start Minimal), #18 (Extend Before Inventing), and #26 (Minimize Migration Distance).

Strengths highlighted across reviews:

  • Problem-first approach: Links concrete GitHub issues (#5871, #11050) with real user pain points
  • Progressive examples: Shows basic usage through increasingly complex scenarios
  • API alternatives: Presents three options with clear tradeoffs and a well-reasoned recommendation
  • Alternative solutions: Thoroughly evaluates eager refresh, scheduled refresh, and cache warming
  • Validation rules: Clearly documents valid/invalid combinations including persist interaction

Test Coverage

Not applicable — this is a spec-only PR. All reviewers agreed no tests are needed. For the implementation follow-up, reviewers recommend Python unit tests for cache refresh state transitions, error handling, and concurrency semantics, plus E2E tests for user-facing behavior around expired entries.

Backwards Compatibility

All reviewers confirmed full backwards compatibility: the refresh parameter defaults to "foreground", preserving current behavior. No existing parameters are modified, and existing code without the refresh parameter continues to work identically.

Security & Risk

No security concerns. The PR contains only a markdown document. The proposed feature (background ThreadPoolExecutor for cache refresh) uses stdlib threading with no new network endpoints, authentication changes, or user data handling. One reviewer (gemini-3.1-pro) correctly noted that thread-safety and resource exhaustion should be evaluated during implementation.

External test recommendation

  • Recommend external_test: No
  • Triggered categories: None
  • Evidence:
    • specs/2026-04-08-cache-background-refresh/product-spec.md: New markdown file only, no executable code paths changed
  • Suggested external_test focus areas: N/A for this spec PR. When the feature is implemented, external test coverage should validate cache refresh behavior in externally hosted and embedded contexts (especially TTL expiration under concurrent traffic).
  • Confidence: High
  • Assumptions and gaps: None — all three reviewers independently confirmed this assessment.

Accessibility

Not applicable — no frontend changes.

Recommendations

Non-blocking suggestions consolidated from all reviewers:

  1. show_spinner interaction (claude, gemini agreed implicitly): Document whether the spinner is suppressed when refresh="background" returns stale data immediately. This is an important UX detail for the tech spec and docs.
  2. max_entries interaction (claude): Address what happens when max_entries is set and the cache is full while a background refresh is in progress — could a just-refreshed entry be immediately evicted?
  3. .clear() race condition (claude): The tech spec should address what happens if st.cache_data.clear() or func.clear() is called while a background refresh is in progress.
  4. Failure degradation path (claude): Consider whether keeping the stale entry for a grace period (rather than immediate eviction on failure) would provide better degradation when the underlying service is persistently down.
  5. Cross-worker deduplication (gemini): Explicitly note whether cross-worker deduplication is planned for the future, even if out of scope for MVP.
  6. Implementation test plan (gpt): The implementation PR should explicitly test and document concurrency/failure semantics (single-flight per key, eviction on refresh failure, and cache_resource reference lifecycle).

Verdict

APPROVED: All three reviewers unanimously approved this well-written, comprehensive product spec. It follows established conventions, presents clear API alternatives with sound reasoning, and proposes a backwards-compatible, extensible API design. No blocking issues were identified. The spec is ready for team review and discussion.


This is a consolidated automated AI review by claude-4.6-opus-high-thinking, synthesizing reviews from claude-4.6-opus-high-thinking, gemini-3.1-pro, and gpt-5.3-codex-high.

This review also includes 4 inline comment(s) on specific code lines.


| Item | ✅ or comment |
|----------------------------|----------------------------------------------------------------------|
| Works on SiS, Cloud, etc? | ✅ Uses standard Python threading (`concurrent.futures.ThreadPoolExecutor`). SiS/Snowflake environments support stdlib threading; if thread creation is restricted, refreshes will execute synchronously in the foreground as a graceful fallback. |
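
The "graceful fallback" in that checklist row could be implemented along these lines. This is only a sketch of the idea, not Streamlit code: the helper name `submit_refresh` and the choice of which exceptions signal a thread-creation restriction are assumptions made for illustration.

```python
import concurrent.futures


def submit_refresh(pool, fn, *args):
    """Submit a background refresh to `pool`; if the runtime
    refuses to create threads (a hypothetical restriction in
    some managed environments), run the refresh synchronously
    instead, mimicking refresh="foreground"."""
    try:
        return pool.submit(fn, *args)
    except (RuntimeError, OSError):
        # Graceful fallback: compute in the caller's thread and
        # wrap the result so callers see a uniform Future API.
        fut = concurrent.futures.Future()
        fut.set_result(fn(*args))
        return fut


pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
future = submit_refresh(pool, lambda x: x * 2, 21)
print(future.result())  # 42
```

Returning a `Future` in both paths keeps the calling code identical whether the refresh actually ran in the background or not.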

suggestion: Consider adding show_spinner interaction to the checklist or behavior section. When refresh="background", the stale value is returned immediately — presumably the spinner should not be shown. This is an important UX detail that the tech spec and docs will need to address.

consistent with current foreground TTL behavior where a resource can be evicted
while still in use. The old resource is not explicitly disposed; callers holding
references continue using it until they release it.
- **Failure:** Log warning, evict the expired entry

thought: On failure, the spec says "evict the expired entry." If the underlying service is down for an extended period, users will alternate between stale data (triggering background refresh) and foreground errors (after eviction). Consider whether keeping the stale entry for a grace period or adding retry backoff would provide a better degradation path. Could be out of scope for MVP but worth noting as a future consideration.

Concurrent requests for the same expired key all receive stale data while a single
background refresh runs. Deduplication is per-process; in multi-worker deployments,
each worker independently detects expiration and triggers its own background refresh.
- **Bounded concurrency**: Background refreshes use a shared bounded `ThreadPoolExecutor`

thought: The bounded ThreadPoolExecutor is a good design choice. Worth considering in the tech spec: what happens when the pool is exhausted and many keys expire simultaneously? Should it degrade to foreground refresh when the pool is full (rather than queueing), and should pool size be configurable?
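
One possible answer to the pool-exhaustion question is to degrade to a foreground refresh rather than queue. This sketch is a hypothetical illustration of that policy (the bound `MAX_BACKGROUND`, the semaphore-based slot tracking, and the `refresh` helper are all invented for this example):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_BACKGROUND = 4  # hypothetical pool bound
_pool = ThreadPoolExecutor(max_workers=MAX_BACKGROUND)
_slots = threading.BoundedSemaphore(MAX_BACKGROUND)


def refresh(fn):
    """Run `fn` in the background if a pool slot is free;
    otherwise degrade to a blocking foreground refresh rather
    than queueing behind other in-flight refreshes."""
    if _slots.acquire(blocking=False):
        def run():
            try:
                fn()
            finally:
                _slots.release()
        _pool.submit(run)
        return "background"
    fn()  # pool exhausted: refresh in the caller's thread
    return "foreground"
```

The tradeoff: degrading to foreground re-introduces a blocking wait under heavy expiration bursts, but it avoids an unbounded backlog of queued refreshes computing values that may already be obsolete by the time they run.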

stored on disk and currently do not respect `ttl` for eviction. Using
`refresh="background"` with `persist` mode will raise a `StreamlitAPIException` since
background refresh requires TTL-based expiration. Users needing both persistence and
background refresh should use `persist=False` (the default) with `refresh="background"`.

question: The persist + refresh="background" incompatibility is well-documented, but if disk-persisted caches support TTL-based expiration in the future, this StreamlitAPIException would need to be revisited. Might it be better to document this as a current limitation rather than a hard validation error?
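
For illustration, the validation rule the excerpt describes could look roughly like this. The helper `validate_cache_params` is hypothetical and the local `StreamlitAPIException` class is a stand-in for the real exception in `streamlit.errors`; the requirement that `ttl` be set follows from the excerpt's statement that background refresh requires TTL-based expiration.

```python
class StreamlitAPIException(Exception):
    """Stand-in for streamlit.errors.StreamlitAPIException."""


def validate_cache_params(refresh="foreground", persist=None, ttl=None):
    if refresh not in ("foreground", "background"):
        raise StreamlitAPIException(
            f"refresh must be 'foreground' or 'background', got {refresh!r}"
        )
    if refresh == "background":
        if ttl is None:
            raise StreamlitAPIException(
                "refresh='background' requires ttl, since background "
                "refresh is driven by TTL expiration."
            )
        if persist:
            raise StreamlitAPIException(
                "refresh='background' is incompatible with persist, "
                "because disk-persisted entries do not respect ttl."
            )
```

If persisted caches gain TTL support later, relaxing this check would be a pure loosening of validation, so shipping the hard error now does not paint the API into a corner.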
