Categories: Table functions (Cortex Agents)
GET_AI_EVALUATION_DATA (SNOWFLAKE.LOCAL)
Retrieves the evaluation data for an evaluation run of a Cortex Agent or an External Agent application (see External Agent commands).
Call this function to inspect all recorded traces for an evaluation run. For more information on Cortex Agent evaluations, see Cortex Agent evaluations. For AI Observability applications, see Observability data.
See also: EXECUTE_AI_EVALUATION, GET_AI_RECORD_TRACE (SNOWFLAKE.LOCAL), GET_AI_OBSERVABILITY_LOGS (SNOWFLAKE.LOCAL), GET_AI_OBSERVABILITY_EVENTS (SNOWFLAKE.LOCAL)
Syntax
Arguments
- database: Name of the database containing the agent.
- schema: Name of the schema containing the agent.
- agent_name: Name of the agent to retrieve a record for.
- agent_type: The agent type string. Use CORTEX AGENT for a Cortex Agent or EXTERNAL AGENT for an External Agent object. This value is case-insensitive.
- run_name: Name of the run to retrieve full evaluation data for.
Returns
A table containing information for the specified evaluation, with the following columns:
| Column | Data type | Description |
|---|---|---|
| RECORD_ID | VARCHAR | The unique identifier assigned by Snowflake for this evaluation record. |
| INPUT_ID | VARCHAR | The unique identifier assigned by Snowflake for this evaluation input. |
| REQUEST_ID | VARCHAR | The unique identifier assigned by Snowflake for this request. |
| TIMESTAMP | TIMESTAMP_TZ | The time (in UTC) at which the request was made. |
| DURATION_MS | INT | The amount of time, in milliseconds, that it took for the agent to return a response. |
| INPUT | VARCHAR | The query string used as input for this evaluation record. |
| OUTPUT | VARCHAR | The response returned by the Cortex Agent for this evaluation record. |
| ERROR | VARCHAR | Information about any errors that occurred during the request. |
| GROUND_TRUTH | VARCHAR | The ground truth information used to evaluate this record’s Cortex Agent output. This column holds the JSON from your dataset’s ground truth column, serialized as a string. For how |
| METRIC_NAME | VARCHAR | The name of the metric evaluated for this record. |
| EVAL_AGG_SCORE | NUMBER | The evaluation score assigned for this record. |
| METRIC_TYPE | VARCHAR | The type of metric being evaluated. For built-in metrics, the value is |
| METRIC_STATUS | VARIANT | A map containing information about the agent’s HTTP response for this record, with the following keys: |
| METRIC_CALLS | ARRAY | An array of VARIANT values that contain information about the computed metric. Each array entry contains the metric’s criteria, an explanation of the metric score, and metadata. The keys of each entry are: |
| TOTAL_INPUT_TOKENS | INT | The total number of tokens used to process the input query. |
| TOTAL_OUTPUT_TOKENS | INT | The total number of output tokens produced by the Cortex Agent. |
| LLM_CALL_COUNT | INT | Counts the number of times any LLM was called, either by the agent or an evaluation judge. |
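Because the function returns one row per record and metric, the columns above can be summarized with ordinary SQL. The following is a minimal sketch, not an official example: it reuses the object names from the Examples section (eval_db, eval_schema, evaluated_agent, run-1) and assumes the arguments are passed positionally as string literals in the order listed under Arguments.

```sql
-- Sketch only: summarizes one evaluation run per metric.
-- Assumes positional string arguments in the documented order
-- (database, schema, agent_name, agent_type, run_name).
SELECT
    metric_name,
    COUNT(*)                 AS record_count,    -- rows returned for this metric
    AVG(eval_agg_score)      AS avg_score,       -- mean evaluation score
    AVG(duration_ms)         AS avg_duration_ms, -- mean agent response time
    SUM(total_input_tokens)  AS input_tokens,
    SUM(total_output_tokens) AS output_tokens
  FROM TABLE(SNOWFLAKE.LOCAL.GET_AI_EVALUATION_DATA(
         'eval_db', 'eval_schema', 'evaluated_agent', 'CORTEX AGENT', 'run-1'))
 GROUP BY metric_name
 ORDER BY avg_score;
```

Sorting by the average score puts the weakest metrics first, which is usually where inspection of individual records starts.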
Access control requirements
A role used to execute this operation must have the following privileges at a minimum:
| Privilege | Object | Notes |
|---|---|---|
| CORTEX_USER | Database role | |
| USAGE | Cortex Agent or External Agent | Required on the object identified by |
| MONITOR | Cortex Agent | Required on the Cortex Agent identified by |
Operating on an object in a schema requires at least one privilege on the parent database and at least one privilege on the parent schema.
For instructions on creating a custom role with a specified set of privileges, see Creating custom roles.
For general information about roles and privilege grants for performing SQL actions on securable objects, see Overview of Access Control.
When agent_type is EXTERNAL AGENT, only USAGE on that object is required to call this function. OWNERSHIP on the External Agent is required to modify or remove the object with ALTER EXTERNAL AGENT or DROP EXTERNAL AGENT.
For the full access control permissions required by Cortex Agent evaluations, see Cortex Agent evaluations – Access control requirements. For External Agent objects, see Observability data.
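The privileges listed in this section could be granted along the following lines. This is a hedged sketch: eval_role is a hypothetical custom role, the object names are taken from the Examples section, and the AGENT object-type keyword in the last two statements is an assumption that should be checked against the GRANT reference for your account.

```sql
-- eval_role is a hypothetical role used only for illustration.

-- CORTEX_USER database role.
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE eval_role;

-- At least one privilege on the parent database and schema.
GRANT USAGE ON DATABASE eval_db TO ROLE eval_role;
GRANT USAGE ON SCHEMA eval_db.eval_schema TO ROLE eval_role;

-- Assumed syntax: privileges on the agent itself.
GRANT USAGE ON AGENT eval_db.eval_schema.evaluated_agent TO ROLE eval_role;
GRANT MONITOR ON AGENT eval_db.eval_schema.evaluated_agent TO ROLE eval_role;
```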
Examples
The following example displays the full evaluation details for a run named run-1 of the agent evaluated_agent, which is stored in the schema eval_db.eval_schema:
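A minimal sketch of such a call, assuming the arguments are passed positionally as string literals in the order listed under Arguments and that evaluated_agent is a Cortex Agent:

```sql
-- Sketch only: argument order follows the Arguments section
-- (database, schema, agent_name, agent_type, run_name).
SELECT *
  FROM TABLE(SNOWFLAKE.LOCAL.GET_AI_EVALUATION_DATA(
         'eval_db',          -- database containing the agent
         'eval_schema',      -- schema containing the agent
         'evaluated_agent',  -- agent name
         'CORTEX AGENT',     -- agent type (case-insensitive)
         'run-1'             -- evaluation run name
       ));
```

For an External Agent object, the fourth argument would be 'EXTERNAL AGENT' instead.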