| Field | Type | Required | Description | Example |
|---|---|---|---|---|
| max_tokens | OptionalNullable[int] | ➖ | The maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length. | |
| stream | Optional[bool] | ➖ | Whether to stream back partial progress. If set, tokens are sent as data-only server-sent events as they become available, and the stream is terminated by a data: [DONE] message. Otherwise, the server holds the request open until the timeout or until completion, and the response contains the full result as JSON. | |
| stop | OptionalNullable[models.AgentsCompletionRequestStop] | ➖ | Stop generation if this token is detected or, when an array is provided, if any one of its tokens is detected. | |
| random_seed | OptionalNullable[int] | ➖ | The seed to use for random sampling. If set, repeated calls with the same seed will generate deterministic results. | |
| metadata | Dict[str, Any] | ➖ | N/A | |
| messages | List[models.AgentsCompletionRequestMessage] | ✔️ | The prompt(s) to generate completions for, encoded as a list of dicts, each with role and content. | [ { "role": "user", "content": "Who is the best French painter? Answer in one short sentence." } ] |
| response_format | Optional[models.ResponseFormat] | ➖ | Specify the format that the model must output. The default is { "type": "text" }. Setting it to { "type": "json_object" } enables JSON mode, which guarantees that the message the model generates is in JSON. When using JSON mode you MUST also instruct the model to produce JSON yourself with a system or a user message. Setting it to { "type": "json_schema" } enables JSON schema mode, which guarantees that the message the model generates is in JSON and follows the schema you provide. | Example 1: { "type": "text" } Example 2: { "type": "json_object" } Example 3: { "type": "json_schema", "json_schema": { "schema": { "properties": { "name": { "title": "Name", "type": "string" }, "authors": { "items": { "type": "string" }, "title": "Authors", "type": "array" } }, "required": [ "name", "authors" ], "title": "Book", "type": "object", "additionalProperties": false }, "name": "book", "strict": true } } |
| tools | List[models.AgentsCompletionRequestTool] | ➖ | N/A | |
| tool_choice | Optional[models.AgentsCompletionRequestToolChoice] | ➖ | N/A | |
| presence_penalty | OptionalNullable[float] | ➖ | The presence_penalty determines how much the model penalizes the repetition of words or phrases. A higher presence penalty encourages the model to use a wider variety of words and phrases, making the output more diverse and creative. | |
| frequency_penalty | OptionalNullable[float] | ➖ | The frequency_penalty penalizes the repetition of words based on their frequency in the generated text. A higher frequency penalty discourages the model from repeating words that have already appeared frequently in the output, promoting diversity and reducing repetition. | |
| n | OptionalNullable[int] | ➖ | Number of completions to return for each request. Input tokens are billed only once. | |
| prediction | Optional[models.Prediction] | ➖ | Lets users specify an expected completion, optimizing response times by leveraging known or predictable content. | |
| parallel_tool_calls | Optional[bool] | ➖ | N/A | |
| reasoning_effort | OptionalNullable[models.ReasoningEffort] | ➖ | N/A | |
| prompt_mode | OptionalNullable[models.MistralPromptMode] | ➖ | Allows toggling between the reasoning mode and no system prompt. When set to reasoning, the system prompt for reasoning models will be used. | |
| guardrails | List[models.GuardrailConfig] | ➖ | N/A | |
| prompt_cache_key | OptionalNullable[str] | ➖ | N/A | |
| agent_id | str | ✔️ | The ID of the agent to use for this completion. | |
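
To make the required/optional split in the table concrete, here is a minimal sketch that assembles a request body as a plain dict. The helper name `build_agents_completion_request`, its validation rules, and the placeholder agent ID are assumptions made for illustration; the helper does not call any API and is not part of the SDK. Because `response_format` is set to JSON mode, the user message also asks for JSON, as the table requires.

```python
from typing import Any, Dict, List


def build_agents_completion_request(
    agent_id: str,
    messages: List[Dict[str, Any]],
    **optional: Any,
) -> Dict[str, Any]:
    """Assemble a request body from the fields listed in the table.

    Hypothetical helper: it only builds a dict and performs no network call.
    """
    # agent_id and messages are the two fields marked as required.
    if not agent_id:
        raise ValueError("agent_id is required")
    if not messages:
        raise ValueError("messages is required")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")

    body: Dict[str, Any] = {"agent_id": agent_id, "messages": messages}
    # Assumption of this sketch: optional fields passed as None are dropped
    # rather than sent as explicit nulls.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


request = build_agents_completion_request(
    agent_id="<your-agent-id>",  # placeholder, not a real ID
    messages=[
        {
            "role": "user",
            "content": "Who is the best French painter? "
            'Reply as a JSON object with a single key "answer".',
        }
    ],
    max_tokens=64,
    random_seed=42,
    stop=None,  # omitted from the body because it is None
    response_format={"type": "json_object"},
)
```

Dropping `None` values keeps the body limited to the fields the caller actually set. Several fields are typed as nullable in the table, so sending an explicit null may also be acceptable; that behavior is not covered here.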