feat(providers): add Sakana AI provider with Fugu models #5169
PR Summary
Medium Risk
Overview: The new …
Separately, …
Reviewed by Cursor Bugbot for commit ee72a69.
Greptile Summary
This PR integrates Sakana AI as a new BYOK-only LLM provider, wiring up the …
Confidence Score: 5/5
Safe to merge: all registration points are consistent, the executor follows the established OpenAI-compatible pattern without deviation, and the BYOK-only constraint is correctly enforced.
The change is a self-contained provider addition that mirrors the deepseek/groq/cerebras integration pattern. Every required wiring point (ProviderId, registry, model definitions, tokenization, attachment adapter, metadata map, icon) has been updated in lockstep with no gaps. The tool-execution loop, streaming paths, forced-tool-choice tracking, and cost accumulation are structurally identical to peer providers. No regressions were identified in the existing test suite or the newly added provider tests.
No files require special attention.
Important Files Changed
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant Client
participant SakanaProvider
participant OpenAI_SDK as OpenAI SDK (baseURL: api.sakana.ai/v1)
participant Tools
Client->>SakanaProvider: executeRequest(request)
alt No tools / streaming only
SakanaProvider->>OpenAI_SDK: "chat.completions.create({stream:true})"
OpenAI_SDK-->>SakanaProvider: AsyncIterable ChatCompletionChunk
SakanaProvider-->>Client: StreamingExecution
else Tools active (non-streaming tool loop)
SakanaProvider->>OpenAI_SDK: chat.completions.create(payload)
OpenAI_SDK-->>SakanaProvider: response (tool_calls)
loop up to MAX_TOOL_ITERATIONS
SakanaProvider->>Tools: executeTool(name, params)
Tools-->>SakanaProvider: result
SakanaProvider->>OpenAI_SDK: chat.completions.create(nextPayload + tool messages)
OpenAI_SDK-->>SakanaProvider: response
end
alt request.stream
SakanaProvider->>OpenAI_SDK: "chat.completions.create({tool_choice:none, stream:true})"
OpenAI_SDK-->>SakanaProvider: AsyncIterable ChatCompletionChunk
SakanaProvider-->>Client: StreamingExecution (accumulated cost)
else deferResponseFormat
SakanaProvider->>OpenAI_SDK: "chat.completions.create({response_format, tool_choice:none})"
OpenAI_SDK-->>SakanaProvider: structured JSON response
SakanaProvider-->>Client: ProviderResponse
else plain
SakanaProvider-->>Client: ProviderResponse
end
end
Reviews (3): Last reviewed commit: "test(session): de-flake SessionProvider ..."
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit de13c78.
OpenAI-compatible provider at https://api.sakana.ai/v1 (bearer auth). Registers fugu (fast default) and fugu-ultra (reasoning flagship), both 1M context. BYOK-only, never hosted/auto-billed. Streaming, tool loop, and response_format supported; attachments mirror deepseek (unsupported in the current adapter).
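As a rough illustration of what "OpenAI-compatible with bearer auth" means on the wire, the sketch below builds a chat-completions request by hand. The base URL and model ids come from the commit message above. The `/chat/completions` path follows the usual OpenAI convention and is assumed here, and `buildChatRequest` is a hypothetical helper, not code from this PR (per the sequence diagram, the provider itself goes through the OpenAI SDK with a custom baseURL).

```typescript
// Hypothetical helper for illustration only.
const SAKANA_BASE_URL = 'https://api.sakana.ai/v1'

type SakanaModel = 'fugu' | 'fugu-ultra'
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string }

function buildChatRequest(apiKey: string, model: SakanaModel, messages: ChatMessage[]) {
  if (!apiKey) {
    // BYOK-only: there is no hosted key to fall back to.
    throw new Error('A user-supplied Sakana API key is required')
  }
  return {
    // Assumed path, following the OpenAI chat-completions convention.
    url: `${SAKANA_BASE_URL}/chat/completions`,
    init: {
      method: 'POST' as const,
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ model, messages }),
    },
  }
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`.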
OpenAI-compatible backends reject a request carrying both response_format and active tools/tool_choice. Mirror the LiteLLM pattern: withhold the JSON schema while tools are active and apply it on a final tool-free call (tool_choice: none) for both streaming and non-streaming paths.
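The deferral described above can be sketched as two small payload transforms. The field names follow the OpenAI chat-completions request shape; `toolPhasePayload` and `finalPhasePayload` are hypothetical names, and keeping `tools` in the final call alongside `tool_choice: 'none'` is an assumption rather than something this PR confirms.

```typescript
type ChatPayload = {
  model: string
  messages: unknown[]
  tools?: unknown[]
  tool_choice?: unknown
  response_format?: unknown
}

// Tool phase: withhold the JSON schema whenever tools are attached.
function toolPhasePayload(payload: ChatPayload): ChatPayload {
  const hasTools = (payload.tools?.length ?? 0) > 0
  if (!hasTools) return payload
  const copy: ChatPayload = { ...payload }
  delete copy.response_format
  return copy
}

// Final phase: send the original payload (schema included) with the
// accumulated messages, on a call that can no longer invoke tools.
function finalPhasePayload(payload: ChatPayload, messages: unknown[]): ChatPayload {
  return { ...payload, messages, tool_choice: 'none' }
}
```

Neither helper mutates its input, so the original request can be reused for the final structured call after the tool loop finishes.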
- Rethrow tool-loop failures instead of swallowing them, so a failed run surfaces as a ProviderError rather than a partial success (matches LiteLLM).
- Force tool_choice: 'none' on the post-tool streaming pass so the model cannot emit fresh tool calls that the text-only stream adapter would drop.
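A minimal sketch of both behaviors, under stated assumptions: the `ProviderError` class below is a stand-in for the repo's own class, and `runToolLoop`, `step`, and `postToolStreamPayload` are hypothetical names chosen for illustration.

```typescript
// Stand-in for the repo's ProviderError; the real class may carry more fields.
class ProviderError extends Error {
  readonly original: unknown
  constructor(message: string, original: unknown) {
    super(message)
    this.name = 'ProviderError'
    this.original = original
  }
}

// `step` performs one model call plus tool execution and resolves to true
// while the model keeps requesting tools (hypothetical shape).
async function runToolLoop(step: () => Promise<boolean>, maxIterations: number): Promise<number> {
  let iterations = 0
  try {
    while (iterations < maxIterations && (await step())) {
      iterations++
    }
  } catch (error) {
    // Rethrow instead of returning whatever partial output exists.
    const reason = error instanceof Error ? error.message : String(error)
    throw new ProviderError(`Tool loop failed: ${reason}`, error)
  }
  return iterations
}

// Post-tool streaming pass: text only, so new tool calls are ruled out.
function postToolStreamPayload<T extends object>(payload: T) {
  return { ...payload, stream: true as const, tool_choice: 'none' as const }
}
```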
- Pass stream_options: { include_usage: true } on both streaming calls so
token/cost data is captured (the shared OpenAI-compatible stream helper
only fills usage from chunk usage, which the API omits without the flag).
- Include !hasActiveTools in the early-stream guard so requests whose tools
are all filtered out (e.g. usageControl 'none') still take the fast
streaming path instead of the tool-loop path. Mirrors LiteLLM.
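The two fixes above can be sketched as follows. `stream_options: { include_usage: true }` is the standard OpenAI chat-completions flag named in the commit message. The request and tool shapes are assumptions for illustration: only the `usageControl: 'none'` value comes from the text, and the helper names are hypothetical.

```typescript
// 'none' comes from the commit message; the other usageControl values are assumed.
type ToolConfig = { name: string; usageControl?: 'auto' | 'force' | 'none' }
type StreamRequest = { stream?: boolean; tools?: ToolConfig[] }

// A tool counts as active unless it has been filtered out via usageControl 'none'.
function hasActiveTools(request: StreamRequest): boolean {
  return (request.tools ?? []).some((tool) => tool.usageControl !== 'none')
}

// Early-stream guard: streaming was requested and no tool can actually run.
function takesEarlyStreamPath(request: StreamRequest): boolean {
  return request.stream === true && !hasActiveTools(request)
}

// Parameters for every streaming call, so the final chunk carries token usage.
function streamingParams() {
  return { stream: true as const, stream_options: { include_usage: true } }
}
```

With this guard, a request whose only tools are marked `'none'` is treated the same as a request with no tools at all.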
… valid
An assistant message lists all tool_calls, so a call for an unconfigured tool must still get a matching `tool` response or the next request violates the OpenAI message contract. Emit an error tool-result for unknown tools instead of dropping them.
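A minimal sketch of that contract: every `tool_calls` entry yields exactly one `tool` message, whether or not a handler exists. The message shapes follow the OpenAI chat format. `toolMessagesFor`, the synchronous handler map, and the error wording are assumptions for illustration (real tool execution in the provider is asynchronous).

```typescript
type ToolCall = { id: string; function: { name: string; arguments: string } }
type ToolMessage = { role: 'tool'; tool_call_id: string; content: string }
type ToolHandler = (args: unknown) => unknown

function toolMessagesFor(calls: ToolCall[], handlers: Map<string, ToolHandler>): ToolMessage[] {
  return calls.map((call): ToolMessage => {
    const handler = handlers.get(call.function.name)
    const result = handler
      ? handler(JSON.parse(call.function.arguments))
      : // Unknown tool: answer with an error result instead of dropping the call.
        { error: `Tool "${call.function.name}" is not configured` }
    return { role: 'tool', tool_call_id: call.id, content: JSON.stringify(result) }
  })
}
```

Because the output always has the same length as `calls`, the follow-up request keeps one `tool` response per `tool_call_id`.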
flush() only drained microtasks, so the query->render update occasionally lost the race and ctx.data was still null after the flush budget. Yield one macrotask tick per flush so React Query's notifyManager and deferred renders settle deterministically. Verified across repeated local runs.
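The difference between draining microtasks and yielding a macrotask can be sketched like this. The `flush` below is a hypothetical stand-in, not the test helper from this PR, and its `ticks` parameter is an assumption.

```typescript
// Hypothetical test helper: each tick awaits a zero-delay timer, so work that
// was already queued (pending microtasks and earlier zero-delay timers) gets a
// chance to run before the caller continues.
async function flush(ticks = 1): Promise<void> {
  for (let i = 0; i < ticks; i++) {
    await new Promise<void>((resolve) => setTimeout(resolve, 0))
  }
}
```

By contrast, `await Promise.resolve()` only drains microtasks, so a callback that was scheduled with a timer has not yet run when the assertion executes.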
de13c78 to ee72a69
Rebased onto latest
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit ee72a69.

Summary
- OpenAI-compatible provider (https://api.sakana.ai/v1, bearer auth)
- Registers fugu (fast default) and fugu-ultra (reasoning flagship), both 1M context window
- Wired into ProviderId, registry, PROVIDER_DEFINITIONS, providers metadata, SakanaIcon, tokenization config, attachment adapter
- Supports streaming, the tool loop, and response_format; mirrors the deepseek OpenAI-compatible pattern
- BYOK-only: excluded from getHostedModels(), so it is never served with Sim's hosted key or auto-billed (matches groq/cerebras/deepseek/xai)

Notes on pricing
- fugu-ultra pricing ($5 input / $30 output / $0.50 cached per 1M tokens) is taken directly from Sakana's pricing page (standard <272K tier)
- The fugu router is billed at "a single rate based on the top-tier model involved", so fugu is priced at the documented fugu-ultra ceiling so that cost tracking never under-reports

Testing
- bun run lint, typecheck, and bun run check:api-validation pass
- Provider tests (models.test.ts): model ids, 1M context, pricing, and fugu/fugu-ultra → sakana routing; 32 provider tests pass
- Endpoint https://api.sakana.ai/v1: live /v1/models returns fugu/fugu-ultra; a full chat completion is pending an account with active credit
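To make the pricing notes concrete, here is a rough cost estimate using the quoted per-1M-token rates. The rates are the ones stated in the description. `estimateCostUsd`, the `Usage` shape, and the assumption that cached tokens are reported separately from uncached input tokens are illustrative and not taken from this PR.

```typescript
// USD per 1M tokens, as quoted in the pricing notes (standard <272K tier).
// The description prices fugu at this same ceiling.
const FUGU_ULTRA_RATES = { input: 5, cachedInput: 0.5, output: 30 }

type Usage = { inputTokens: number; cachedInputTokens: number; outputTokens: number }

// Assumes cachedInputTokens are NOT also counted inside inputTokens.
function estimateCostUsd(usage: Usage, rates = FUGU_ULTRA_RATES): number {
  const weighted =
    usage.inputTokens * rates.input +
    usage.cachedInputTokens * rates.cachedInput +
    usage.outputTokens * rates.output
  return weighted / 1_000_000
}
```

For example, 100,000 uncached input tokens plus 10,000 output tokens come to about $0.80 under these rates.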