fix(providers): correct pricing, deprecations, and capabilities across model catalog by waleedlatif1 · Pull Request #4990 · simstudioai/sim

waleedlatif1 · 2026-06-12T02:19:32Z

Summary

Two-pass validation of every static model entry in models.ts (~160 models, 12 providers + embedding/rerank pricing) against live provider docs, with secondary pricing cross-checks. Round 2 re-verified every field with a second fleet of per-provider agents, each producing a justification log covering every field, its source URL, and what was changed vs deliberately left alone (kept as a local decision log, not committed).

Round 1 — pricing/deprecation sweep

grok-4.20 trio price cut ($2/$6 → $1.25/$2.50), DeepSeek V4 repricing ($0.14/$0.28, 1M ctx), Gemini Deep Research output 6x fix ($2 → $12), Bedrock Mistral Large 3 4x fix ($2/$6 → $0.50/$1.50)
~30 retired/redirected models flagged deprecated (xAI May-15 batch, Gemini 2.0 shutdown, Anthropic 4.0 retirements, Mistral/Cerebras/Groq/Bedrock lifecycle)
Context-window and capability corrections (sonnet-4-5/4-0 1M → 200k, invalid minimal/xhigh effort values, dead nativeStructuredOutputs on opus-4-1, renamed shut-down preview ids)
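To make the multipliers above concrete, here is a minimal sketch of how per-million-token rates translate into an estimated cost. The `Pricing` shape and the `estimateCost` helper are hypothetical illustrations, not the catalog's actual types; the rates are the old and corrected Bedrock Mistral Large 3 values quoted in the list.

```typescript
// Hypothetical pricing shape: USD per 1M tokens. Not the real models.ts type.
interface Pricing {
  input: number
  output: number
}

// Estimated cost in USD for a request with the given token counts.
function estimateCost(pricing: Pricing, inputTokens: number, outputTokens: number): number {
  return (inputTokens * pricing.input + outputTokens * pricing.output) / 1_000_000
}

// Bedrock Mistral Large 3: old vs corrected rates quoted in this PR.
const mistralLarge3Old: Pricing = { input: 2, output: 6 }
const mistralLarge3New: Pricing = { input: 0.5, output: 1.5 }

// 1M input tokens plus 1M output tokens under each price table.
const oldCost = estimateCost(mistralLarge3Old, 1_000_000, 1_000_000)
const newCost = estimateCost(mistralLarge3New, 1_000_000, 1_000_000)

// How far the stale entry overstated the cost of the same request.
const overestimate = oldCost / newCost
```

Under these assumptions the stale entry reports four times the corrected cost for the same token counts, which is the "4x fix" named in the list.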

Round 2 — full re-verification + remaining fixes

Bedrock geo-profile bug fixed in code: getBedrockInferenceProfileId no longer prefixes the 11 models whose AWS cards say geo inference is not supported (Mistral text models, Cohere, Titan) — these previously produced invalid model IDs at runtime. Unit tests added.
Bedrock pricing verified via the AWS Pricing API (marketing page is unfetchable): Nova 2 Pro/Lite, Llama 3.x/4 family, Mistral Large 2407, Ministral 3 corrections; cache-read rates added for Claude 4.x and Nova; maxOutputTokens from model cards (opus-4-1 32000 per card)
azure/gpt-5.1-codex un-deprecated: the round-1 stopgap was based on a wrong premise; the azure provider defaults to the Responses API, so the model works. Azure 'none' effort dropped from the 5.4 family (Azure docs enumerate the models that support 'none' exhaustively, and the 5.4 family is not listed); azure/gpt-4o deprecated (retires 2026-10-01)
gpt-5.5-pro: undocumented verbosity removed and effort values aligned to documented pro-tier set (medium/high/xhigh); o3/o3-mini/o1/gpt-4.1-nano deprecated per the OpenAI deprecations page (shutdowns 2026-10-23)
claude-fable-5 nativeStructuredOutputs: true — documented GA; previously fell back to prompt-injected schemas
Temperature ranges corrected against API references: xAI 0–2 (was 0–1), Mistral 0–1.5 (chat-completions max per OpenAPI spec; new 1.5 slider variant in the agent block + parameterized getModelsWithTemperatureRange), Together 0–1 (their docs), Groq provider-level 0–2, deepseek-chat/Cerebras temperature exposed
xAI retired slugs repriced to their redirect targets' billed rates (retirement page states redirect billing; legacy values overestimated live cost up to 6x)
maxOutputTokens filled in for Groq (verified via Groq's live models API), Cerebras, Vertex (was silently falling back to 4096); recommended/speedOptimized flags normalized across providers; ollama-cloud no longer advertises unsupported tool_choice

Justification logs (one per provider, maintained locally): per-model field tables with the verifying source URL and verdict, changes applied, changes deliberately not made (with reasons), and unverifiable items. Conflicting agent findings were re-verified directly against the primary source before deciding (e.g., gemini-3.1-flash-lite minimal thinking support — confirmed supported and default).

Known follow-ups (documented, intentionally not in this PR)

Wire reasoning_effort for xAI/Cerebras/Magistral and prompt_cache_key for Mistral, then add the corresponding capability flags
Add new SKUs: deepseek-v4-flash/pro (aliases retire 2026-07-24), grok-build-0.1, mistral-medium-3-5, newer Azure/Foundry models, Gemini deep-research 04-2026 SKUs
Vertex 2.5 family retires 2026-10-16 — revisit deprecation + vertex defaultModel ~Sept 2026
Cohere-on-Bedrock prices unverifiable via the Pricing API (match list prices; both models Legacy/deprecated)

Type of Change

Bug fix

Testing

All provider, block, and landing-catalog tests pass (151 + 83 in affected files; assertions updated only where they encoded the old incorrect values). New unit tests for the bedrock geo-profile logic. Typecheck and lint clean. Pre-existing providers/ test failures on staging are unrelated (verified by stash-run on clean tree).

Checklist

Code follows project style guidelines
Self-reviewed my changes
Tests added/updated and passing
No new warnings introduced
I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

vercel · 2026-06-12T02:19:39Z

The latest updates on your projects.

Project	Deployment	Actions	Updated (UTC)
docs	Ready	Preview, Comment	Jun 12, 2026 3:11am

cursor · 2026-06-12T02:19:42Z

PR Summary

Medium Risk
Large metadata changes affect model selection, sliders, and cost display; the Bedrock ID fix is correctness-critical for those models but low blast radius elsewhere.

Overview
This PR re-validates the static model catalog in models.ts and fixes a Bedrock runtime bug where geo inference profile prefixes were applied to models that only accept bare in-region IDs (Mistral, Cohere, Titan).

Catalog & capabilities: Pricing, updatedAt, context windows, maxOutputTokens, deprecation flags, and defaults are corrected across OpenAI, Anthropic, Azure, Gemini/Vertex, xAI, DeepSeek, Groq, Cerebras, Mistral, and Bedrock. Notable capability tweaks include Mistral temperature 0–1.5 (new agent-block slider), xAI/Groq 0–2, Together 0–1, claude-fable-5 native structured outputs, gpt-5.5-pro effort/verbosity aligned to docs, Azure 5.4 reasoning effort trimmed, and ollama-cloud no longer advertising unsupported tool choice.

Provider plumbing: getModelsWithTempRange01/02 is replaced by getModelsWithTemperatureRange(max) and MODELS_TEMP_RANGE_0_15; tests and landing-catalog assertions are updated to match.
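The parameterization described above can be sketched roughly as follows. The catalog shape, the example model ids, and the `MODELS_TEMP_RANGE_0_1` name are assumptions made for illustration; only `getModelsWithTemperatureRange(max)`, `MODELS_TEMP_RANGE_0_15`, and `MODELS_TEMP_RANGE_0_2` are named in this PR, and the real implementation in models.ts may differ.

```typescript
// Hypothetical, trimmed-down catalog shape; the real definitions live in models.ts.
interface ModelEntry {
  id: string
  capabilities: { temperature?: { min: number; max: number } }
}

// Placeholder models, one per slider range plus one without temperature support.
const MODELS: ModelEntry[] = [
  { id: 'example-model-a', capabilities: { temperature: { min: 0, max: 1 } } },
  { id: 'example-model-b', capabilities: { temperature: { min: 0, max: 1.5 } } },
  { id: 'example-model-c', capabilities: { temperature: { min: 0, max: 2 } } },
  { id: 'example-model-d', capabilities: {} },
]

// One parameterized helper instead of one function per range.
function getModelsWithTemperatureRange(max: number): string[] {
  return MODELS.filter((m) => m.capabilities.temperature?.max === max).map((m) => m.id)
}

// Each slider variant in the agent block can then be driven by its own list.
const MODELS_TEMP_RANGE_0_1 = getModelsWithTemperatureRange(1)
const MODELS_TEMP_RANGE_0_15 = getModelsWithTemperatureRange(1.5)
const MODELS_TEMP_RANGE_0_2 = getModelsWithTemperatureRange(2)
```

Adding another range then only needs a new constant, not a new function.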

Reviewed by Cursor Bugbot for commit ddadc30.

greptile-apps · 2026-06-12T02:26:43Z

Greptile Summary

This PR performs a two-pass validation of all ~160 models across 12 providers, correcting pricing, deprecation flags, context windows, temperature ranges, and capability fields against live provider documentation. It also ships a real runtime bug fix in getBedrockInferenceProfileId, which previously prefixed 11 Bedrock models (Mistral text, Cohere, Titan) with geo-inference profile IDs that those models don't support.

Bedrock geo-profile fix: GEO_PROFILE_UNSUPPORTED_MODEL_IDS set added; affected models now return bare in-region IDs. New unit tests cover the three branches (prefix required, already prefixed, unsupported).
Temperature range parameterization: getModelsWithTempRange01/02 replaced by getModelsWithTemperatureRange(max); a third slider entry (0–1.5) added in agent.ts for Mistral models; MODELS_TEMP_RANGE_0_15 exported from utils.ts.
Data corrections: ~30 models flagged deprecated; pricing updated for DeepSeek V4, grok-4.20 trio, Gemini Deep Research, Bedrock Mistral Large 3, Nova, and many others; context windows, maxOutputTokens, and reasoning-effort value sets corrected across Azure, Anthropic, xAI, Groq, and Vertex.

Confidence Score: 5/5

Safe to merge — the only runtime change is the Bedrock geo-profile fix, which is well-tested and corrects real invalid model IDs that were being sent to AWS.

The Bedrock bug fix is correct and covered by new unit tests. All data changes (pricing, deprecations, context windows) are documented in the PR with source verification. The rest of the changes are refactors and test updates that don't affect production paths.

apps/sim/providers/models.ts — the static temperature-range list helpers don't inherit provider-level capabilities, leaving an inconsistency with the per-model query functions, though this has no current production effect.
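The inconsistency noted above can be reproduced in a few lines. This is a sketch under assumed shapes: the `Provider` and `Capabilities` types, the helper names, and the example model id are hypothetical, and only the merge expression `{ ...provider.capabilities, ...model.capabilities }` is taken from the review.

```typescript
// Hypothetical shapes for illustration; not the real types from models.ts.
interface Capabilities {
  temperature?: { min: number; max: number }
}

interface Provider {
  capabilities?: Capabilities
  models: { id: string; capabilities: Capabilities }[]
}

// A provider that declares temperature once at the provider level,
// as the review describes for Groq.
const provider: Provider = {
  capabilities: { temperature: { min: 0, max: 2 } },
  models: [{ id: 'example/model', capabilities: {} }],
}

// Per-model query path: provider-level values are inherited, model-level values win.
function getMergedCapabilities(p: Provider, modelId: string): Capabilities | undefined {
  const model = p.models.find((m) => m.id === modelId)
  return model ? { ...p.capabilities, ...model.capabilities } : undefined
}

// Static list path: only the model's own capabilities are inspected.
function listModelsByOwnTemperatureMax(p: Provider, max: number): string[] {
  return p.models.filter((m) => m.capabilities.temperature?.max === max).map((m) => m.id)
}

const mergedMax = getMergedCapabilities(provider, 'example/model')?.temperature?.max
const staticList = listModelsByOwnTemperatureMax(provider, 2)
```

Here the merged view reports a maximum of 2 for the model, while the static list built from model-level capabilities alone leaves it out.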

Important Files Changed

Filename	Overview
apps/sim/providers/models.ts	Large data-correction sweep: pricing, deprecation flags, context window, temperature ranges, and capabilities updated for ~160 models across 12 providers; two functions replaced by a single parameterized getModelsWithTemperatureRange(max).
apps/sim/providers/bedrock/utils.ts	Bug fix: adds GEO_PROFILE_UNSUPPORTED_MODEL_IDS set to prevent getBedrockInferenceProfileId from prepending geo-inference prefixes to the 11 models that only accept bare model IDs.
apps/sim/blocks/blocks/agent.ts	Adds a third temperature slider entry (min 0, max 1.5) for Mistral models, matching the pre-existing pattern of two sliders with mutually exclusive model-filtered conditions.
apps/sim/providers/utils.ts	Switches from two named range exports to three parameterized ones, adding MODELS_TEMP_RANGE_0_15; imports updated to match the renamed function.
apps/sim/providers/bedrock/utils.test.ts	New test file: covers geo-inference profile prefixing, already-prefixed IDs, and the 11 models that must return bare model IDs.
apps/sim/providers/utils.test.ts	Tests updated to reflect all temperature-range, deprecation, effort-value, and max-token changes; new MODELS_TEMP_RANGE_0_15 range assertions added.
apps/sim/app/(landing)/models/utils.test.ts	Single test line updated: replaces deprecated grok-4-latest with mistral-medium-latest in the best-for copy differentiation test.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Model ID lookup] --> B{getModelCapabilities}
    B --> C["Merge: { ...provider.capabilities, ...model.capabilities }"]
    C --> D[supportsTemperature / getMaxTemperature]
    D --> E[Agent slider condition — CORRECT for all models incl. Groq]

    F[Static list builders] --> G{getModelsWithTemperatureRange / getModelsWithTemperatureSupport}
    G --> H[Only model.capabilities.temperature checked]
    H --> I[MODELS_TEMP_RANGE_0_2 / MODELS_WITH_TEMPERATURE_SUPPORT — MISSES Groq provider-level caps]

    J[Bedrock model ID] --> K{getBedrockInferenceProfileId}
    K --> L{Already prefixed?}
    L -- yes --> M[Return as-is]
    L -- no --> N{In GEO_PROFILE_UNSUPPORTED_MODEL_IDS?}
    N -- yes --> O[Return bare model ID]
    N -- no --> P[Add region prefix e.g. us. eu. ap.]
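
The Bedrock branch of the flowchart can be sketched as below. This is not the code from bedrock/utils.ts: the function signature, the prefix list, and the placeholder model ids are assumptions, and the real `GEO_PROFILE_UNSUPPORTED_MODEL_IDS` set holds the 11 Mistral text, Cohere, and Titan ids rather than the stand-in used here.

```typescript
// Assumption: simplified prefix list taken from the flowchart example (us. eu. ap.).
const GEO_PREFIXES = ['us.', 'eu.', 'ap.']

// Placeholder entry; the real set lists the models whose AWS cards
// say geo inference is not supported.
const GEO_PROFILE_UNSUPPORTED_MODEL_IDS = new Set<string>(['example.unsupported-model-v1:0'])

// Hypothetical signature: the real helper may derive the prefix from a region name.
function getBedrockInferenceProfileId(modelId: string, regionPrefix: string): string {
  // Already prefixed: return as-is.
  if (GEO_PREFIXES.some((prefix) => modelId.startsWith(prefix))) return modelId
  // Geo inference not supported: return the bare in-region model ID.
  if (GEO_PROFILE_UNSUPPORTED_MODEL_IDS.has(modelId)) return modelId
  // Otherwise prepend the geo prefix to form the inference profile ID.
  return `${regionPrefix}.${modelId}`
}

const prefixed = getBedrockInferenceProfileId('example.supported-model-v1:0', 'us')
const untouched = getBedrockInferenceProfileId('eu.example.supported-model-v1:0', 'us')
const bare = getBedrockInferenceProfileId('example.unsupported-model-v1:0', 'us')
```

The three calls correspond to the three branches the new unit tests are described as covering: prefix required, already prefixed, and unsupported.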

Reviews (5): Last reviewed commit: "fix(providers): default azure-openai to ..."

waleedlatif1 · 2026-06-12T03:04:38Z

Re: the two items flagged in the Greptile summary (posted as outside-diff comments, so answering here):

1. DeepSeek cachedInput $0.0028 (50x discount) — verified, not a decimal error. Re-fetched https://api-docs.deepseek.com/quick_start/pricing directly: deepseek-v4-flash is $0.0028/1M on cache hit vs $0.14/1M on cache miss, $0.28/1M output (and v4-pro is $0.003625 hit vs $0.435 miss — an even steeper ~120x). DeepSeek moved away from the old 10x ratio with the V4 generation. The deepseek-chat/deepseek-reasoner aliases bill at v4-flash rates until their 2026-07-24 retirement.

2. Renamed IDs (gemini-3.1-flash-lite-preview → gemini-3.1-flash-lite, azure/gpt-5-chat-latest → azure/gpt-5-chat) — runtime routing does not fail for persisted workflows. getProviderFromModel falls back to provider modelPatterns (/^gemini/, /^vertex\//, /^azure\//) when an id isn't in the catalog, and the Azure deployment name is derived from the stored model string (request.model.replace('azure/', '')), so a workflow that stored an old id routes and executes exactly as before — it only loses catalog pricing/capability metadata. Aliases weren't kept because each old id is dead or dying upstream: the google preview was shut down server-side 2026-05-25 (calls fail regardless of our catalog), the vertex preview alias is discontinued by Google on 2026-07-09, and gpt-5-chat-latest never matched an Azure model name. Keeping deprecated alias entries would add permanent catalog noise for at most a few weeks of metadata continuity — judged not worth it, but happy to add them if you disagree.
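The routing argument in item 2 can be illustrated with a small sketch. The catalog table, the provider names, and the `getAzureDeploymentName` helper are assumptions for illustration; the patterns and the `replace('azure/', '')` derivation are the ones quoted above, and the real `getProviderFromModel` may be structured differently.

```typescript
// Hypothetical, simplified catalog: only the renamed ids are present.
const CATALOG: Record<string, string> = {
  'gemini-3.1-flash-lite': 'google',
  'azure/gpt-5-chat': 'azure-openai',
}

// Provider-level fallback patterns, as quoted in the comment above.
const MODEL_PATTERNS: { provider: string; patterns: RegExp[] }[] = [
  { provider: 'google', patterns: [/^gemini/] },
  { provider: 'vertex', patterns: [/^vertex\//] },
  { provider: 'azure-openai', patterns: [/^azure\//] },
]

// Exact catalog match first, then fall back to the provider patterns.
function getProviderFromModel(model: string): string | undefined {
  const exact = CATALOG[model]
  if (exact) return exact
  return MODEL_PATTERNS.find((entry) => entry.patterns.some((re) => re.test(model)))?.provider
}

// The Azure deployment name is derived from the stored model string.
function getAzureDeploymentName(model: string): string {
  return model.replace('azure/', '')
}

// Ids persisted by older workflows that are no longer in the catalog.
const oldGemini = getProviderFromModel('gemini-3.1-flash-lite-preview')
const oldAzure = getProviderFromModel('azure/gpt-5-chat-latest')
const deployment = getAzureDeploymentName('azure/gpt-5-chat-latest')
```

In this sketch the old ids still resolve to a provider through the patterns; what they lose is the catalog entry that carries pricing and capability metadata.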

waleedlatif1 · 2026-06-12T03:32:33Z

@greptile

waleedlatif1 · 2026-06-12T03:32:34Z

@cursor review

cursor

✅ Bugbot reviewed your changes and found no new issues!

fix(providers): correct pricing, deprecations, and capabilities across model catalog

768ef94

greptile-apps Bot reviewed Jun 12, 2026

Comment thread apps/sim/providers/models.ts

fix(providers): apply full re-validation pass across model catalog with per-provider justification docs

47e8baf

cursor Bot reviewed Jun 12, 2026

Comment thread apps/sim/providers/models.ts

vercel Bot deployed to Preview June 12, 2026 03:00

chore(providers): keep model validation logs local, not in the repo

025f84b

cursor Bot reviewed Jun 12, 2026

Comment thread apps/sim/providers/models.ts

fix(providers): default azure-openai to gpt-5.4 instead of deprecated gpt-4o

ddadc30

vercel Bot temporarily deployed to Preview June 12, 2026 03:10 (Inactive)

cursor Bot reviewed Jun 12, 2026

Comment thread apps/sim/providers/models.ts

cursor Bot reviewed Jun 12, 2026

waleedlatif1 mentioned this pull request Jun 12, 2026

v0.7.4: round-robin byok support, table block fix, db read replica routing, trigger.dev, temporal, latex, quartr, brex, convex integrations #4978

Merged

waleedlatif1 merged commit e1af2bf into staging Jun 12, 2026
15 checks passed

waleedlatif1 deleted the fix/model-validation-sweep branch June 12, 2026 03:46

waleedlatif1 merged 4 commits into staging from fix/model-validation-sweep