AI analytics #1339
Open
aadesh18 wants to merge 99 commits into dev from ai-analytics
Changes from 1 commit
Commits
66af47d Enhance documentation tools integration (mantrakp04)
30d53e8 Merge branch 'dev' into dario-likes-mcps (mantrakp04)
5078747 Enhance error handling and API response for documentation tools (mantrakp04)
aaf49db Merge branch 'dev' into dario-likes-mcps (mantrakp04)
844e916 Refactor askStackAuth key to ask_stack_auth in API documentation (mantrakp04)
274c742 fix: register private submodule gitlink in the index (mantrakp04)
c7a3cca Merge branch 'dev' into dario-likes-mcps (mantrakp04)
ef2289f Merge branch 'dev' into dario-likes-mcps (mantrakp04)
d8065c4 Update environment configurations and remove internal secret validati… (mantrakp04)
3b27eee Merge branch 'dev' into dario-likes-mcps (mantrakp04)
b82efa4 Merge branch 'dev' into dario-likes-mcps (mantrakp04)
158498b Merge branch 'dev' into dario-likes-mcps (mantrakp04)
b22d4b0 Merge branch 'dev' into dario-likes-mcps (mantrakp04)
95ca0a2 initial commit (aadesh18)
fbab066 Merge remote-tracking branch 'origin/dario-likes-mcps' into llm-mcp-flow (aadesh18)
73152a1 pnpm lock (aadesh18)
e16040c changed port (aadesh18)
a07dbab spacetime db ci change (aadesh18)
ef77edc ci fix (aadesh18)
84dffa2 security fix (aadesh18)
a0486e9 security fixes (aadesh18)
8c596ec Merge branch 'dev' into dario-likes-mcps (mantrakp04)
ef6963d Merge branch 'dev' into dario-likes-mcps (N2D4)
1c69185 Merge branch 'dario-likes-mcps' into llm-mcp-flow (aadesh18)
f794bd6 Merge remote-tracking branch 'origin/dev' into llm-mcp-flow (aadesh18)
59a060a merge error (aadesh18)
0485c73 pr comment changes (aadesh18)
97ee052 Merge branch 'dev' into llm-mcp-flow (aadesh18)
411f775 bug fix (aadesh18)
c514efd Merge branch 'llm-mcp-flow' of https://github.com/stack-auth/stack-au… (aadesh18)
516c424 Merge branch 'dev' into llm-mcp-flow (aadesh18)
b0e3341 pr comments (aadesh18)
a630be1 Merge branch 'llm-mcp-flow' of https://github.com/stack-auth/stack-au… (aadesh18)
8c7bc54 tests failing (aadesh18)
7a54be9 comment changes (aadesh18)
bd3925d Merge branch 'dev' into llm-mcp-flow (aadesh18)
ca461d4 tests fix (aadesh18)
224468c Merge branch 'llm-mcp-flow' of https://github.com/stack-auth/stack-au… (aadesh18)
042e616 tests fix (aadesh18)
149d6d7 fixed the order (aadesh18)
574cc4a Merge branch 'dev' into llm-mcp-flow (aadesh18)
3293845 Merge branch 'dev' into llm-mcp-flow (aadesh18)
d8e99d6 Merge branch 'dev' into llm-mcp-flow (aadesh18)
fa4c814 Merge branch 'dev' into llm-mcp-flow (aadesh18)
a4c3306 pr changes (aadesh18)
35739af Merge branch 'dev' into llm-mcp-flow (aadesh18)
15e5879 Merge remote-tracking branch 'origin/dev' into llm-mcp-flow (aadesh18)
140ee7e Merge branch 'dev' into llm-mcp-flow (aadesh18)
afd84bc minor fix (aadesh18)
b0a329f initial commit (aadesh18)
c819537 proxy logging implemented (aadesh18)
7a2332f Merge remote-tracking branch 'origin/dev' into ai-analytics (aadesh18)
83a37d1 pr message fixes (aadesh18)
4fb5154 internal tool security update (aadesh18)
30e3e5c Merge branch 'dev' into ai-analytics (aadesh18)
edd33b1 added e2e tests (aadesh18)
a43eb11 bot comment (aadesh18)
1ccef9c Update seed function to preserve existing user metadata when updating… (aadesh18)
4965534 refactor: replace callReducer with callReducerStrict for improved err… (aadesh18)
9ba7b5e clean up (aadesh18)
ddde9c6 feat: implement timeout for SpacetimeDB HTTP calls to prevent hanging… (aadesh18)
c329a46 fix: improve error handling for missing SpacetimeDB service token in … (aadesh18)
26ce83f fix: encode URI components in fetch requests to prevent errors with s… (aadesh18)
f9386a8 bot fixes (aadesh18)
dc5ab66 fix: add log token retrieval in getServiceToken function (aadesh18)
2532632 fix: enhance error handling in isSpacetimedbReachable and update priv… (aadesh18)
53a9f2c fix: update footer separator in ConversationReplay component for impr… (aadesh18)
0eff6b2 bug fix (aadesh18)
170b4fe fix: refactor MCP review authorization and improve logging mechanisms (aadesh18)
a0bab5d tests clean up (aadesh18)
3654af5 Merge remote-tracking branch 'origin/dev' into ai-analytics (aadesh18)
dbc7988 bug fix (aadesh18)
d8b7499 Custom Dashboard Improvements (#1359) (aadesh18)
331d208 Update backend environment variables and refactor AI query route imports (aadesh18)
49d2c04 Enhance image attachment validation in AI query route (aadesh18)
60c538b Add context to system prompt in AI query route (aadesh18)
0aea8ef edited comment (aadesh18)
39facf4 added comment (aadesh18)
45ff5f2 aman comment changes (aadesh18)
89d43b1 Merge remote-tracking branch 'origin/dev' into ai-analytics (aadesh18)
971bad9 merge changes (aadesh18)
16542f1 removed mcpCorrelationId (aadesh18)
10e6cfc Enhance qaId validation in MCP review routes to ensure it is a non-ne… (aadesh18)
5552bd7 Refactor reviewer handling in MCP review route to improve readability… (aadesh18)
2e78347 Implement update_qa_entry_with_publish reducer to streamline QA entry… (aadesh18)
0cedc49 Refactor AI query logging to handle serialization errors gracefully a… (aadesh18)
a087f6b Refactor AI query logging to encapsulate error handling within async … (aadesh18)
7e850db Update tool call instructions in AI prompts to specify usage of patch… (aadesh18)
7527bcf Enhance occurrenceIndex validation in applyDashboardPatches to ensure… (aadesh18)
cfcedeb Refactor applyDashboardPatches to use 'draft' instead of 'running' fo… (aadesh18)
9500e1e sizing fix (aadesh18)
9599254 Merge branch 'dev' into ai-analytics (aadesh18)
c664a1d Merge remote-tracking branch 'origin/dev' into ai-analytics (aadesh18)
1389d37 lint fix (aadesh18)
185d245 update reviewer authentication and API calls (aadesh18)
29a6a88 Merge remote-tracking branch 'origin/dev' into ai-analytics (aadesh18)
d63bb9c test fix (aadesh18)
6b8838e Merge branch 'dev' into ai-analytics (aadesh18)
26f0ff6 bot comments (aadesh18)
initial commit
commit 95ca0a29618677633a24c83724e52b4eacdee8b6
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,51 @@ | ||
| import { getEnvVariable } from "@stackframe/stack-shared/dist/utils/env"; | ||
| import { captureError } from "@stackframe/stack-shared/dist/utils/errors"; | ||
| import { DbConnection } from "./spacetimedb-bindings"; | ||
| import type { LogMcpCallParams } from "./spacetimedb-bindings/types/reducers"; | ||
| | ||
| export type McpLogEntry = Omit<LogMcpCallParams, "token">; | ||
| | ||
| let connectionPromise: Promise<DbConnection> | null = null; | ||
| | ||
| export async function getConnection(): Promise<DbConnection | null> { | ||
| const uri = getEnvVariable("STACK_SPACETIMEDB_URI", ""); | ||
| if (!uri) { | ||
| return null; | ||
| } | ||
| | ||
| if (!connectionPromise) { | ||
| connectionPromise = new Promise<DbConnection>((resolve, reject) => { | ||
| DbConnection.builder() | ||
| .withUri(uri) | ||
| .withDatabaseName(getEnvVariable("STACK_SPACETIMEDB_DB_NAME")) | ||
| .onConnect((connInstance) => { | ||
| connInstance.subscriptionBuilder() | ||
| .onApplied(() => { | ||
| resolve(connInstance); | ||
| }) | ||
| .subscribe("SELECT * FROM mcp_call_log"); | ||
| }) | ||
| .onConnectError((_: unknown, err: Error) => { | ||
| captureError("mcp-logger", err); | ||
| connectionPromise = null; | ||
| reject(err); | ||
| }) | ||
| .build(); | ||
| }); | ||
| } | ||
| | ||
| return await connectionPromise; | ||
| } | ||
| | ||
| export async function logMcpCall(entry: McpLogEntry): Promise<void> { | ||
| const conn = await getConnection(); | ||
| if (!conn) { | ||
| return; | ||
| } | ||
| | ||
| const token = getEnvVariable("STACK_MCP_LOG_TOKEN"); | ||
| await conn.reducers.logMcpCall({ | ||
| token, | ||
| ...entry, | ||
| }); | ||
| } |
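
Review note: `getConnection` above keeps a single connection promise at module scope and resets it in `onConnectError`, so concurrent callers share one attempt and a later call can retry after a failure. A minimal standalone sketch of that pattern, with no SpacetimeDB dependency, might look like the following. `memoizeAsync` and the stand-in factory are hypothetical names used for illustration, not part of this PR.

```typescript
// Hypothetical helper (not in the PR): share one in-flight promise between
// callers and forget it on rejection so the next call starts a fresh attempt.
function memoizeAsync<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (cached === null) {
      const attempt: Promise<T> = factory().catch((err: unknown) => {
        // Clear the cache only if it still points at this failed attempt.
        if (cached === attempt) {
          cached = null;
        }
        throw err;
      });
      cached = attempt;
    }
    return cached;
  };
}

// Usage with a stand-in factory; a real one would wrap DbConnection.builder().
let connectAttempts = 0;
const getFakeConnection = memoizeAsync(async () => {
  connectAttempts += 1;
  return { id: connectAttempts };
});
```

Two back-to-back calls to `getFakeConnection()` return the same promise object and run the factory once. Whether a rejected attempt should also be retried with backoff is a separate design choice that this sketch does not cover.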
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,172 @@ | ||
| import { createMCPClient } from "@ai-sdk/mcp"; | ||
| import { getEnvVariable } from "@stackframe/stack-shared/dist/utils/env"; | ||
| import { captureError } from "@stackframe/stack-shared/dist/utils/errors"; | ||
| import { generateText, stepCountIs } from "ai"; | ||
| import { getConnection } from "./mcp-logger"; | ||
| import { createOpenRouterProvider } from "./models"; | ||
| import { getVerifiedQaContext } from "./verified-qa"; | ||
| | ||
| const QA_SYSTEM_PROMPT = `You are a QA reviewer for Stack Auth's AI documentation assistant. | ||
| You will receive a question, the agent's stated reason for asking, and the AI's response. | ||
| | ||
| Your tasks: | ||
| 1. RELEVANCE: Does the response actually answer the question? Does the stated reason align with what was asked? | ||
| 2. CORRECTNESS: Verify factual claims about Stack Auth. Use human-verified Q&A (appended below, if any) as the highest-priority source of truth — these are always correct. Then use the available tools to look up additional information from the Stack Auth codebase. If the AI response contradicts a human-verified answer, flag it as incorrect. | ||
| | ||
| The repo name for all tool calls is "stack-auth/stack-auth". Only use the repository documentation tools (read_wiki_structure, read_wiki_contents, ask_question) — do not create sessions or modify any other resources. | ||
| | ||
| You MUST respond with ONLY valid JSON matching this exact schema (no markdown, no explanation outside the JSON): | ||
| { | ||
| "needsHumanReview": boolean, | ||
| "answerCorrect": boolean, | ||
| "answerRelevant": boolean, | ||
| "flags": [{"type": string, "severity": "low" | "medium" | "high" | "critical", "explanation": string}], | ||
| "improvementSuggestions": string, | ||
| "overallScore": number | ||
| } | ||
| | ||
| Flag types: "factual_error", "incomplete_answer", "off_topic", "hallucination", "outdated_info", "missing_context", "misleading", "reason_mismatch" | ||
| | ||
| Scoring: | ||
| - 90-100: Excellent — factually correct, fully addresses the question | ||
| - 70-89: Good — minor issues or missing details | ||
| - 50-69: Acceptable — notable issues but core answer is present | ||
| - 30-49: Poor — significant problems | ||
| - 0-29: Unacceptable — fundamentally wrong or irrelevant | ||
| | ||
| Set needsHumanReview=true if: score < 50, any critical flag, or you are uncertain about correctness.`; | ||
| | ||
| const REVIEW_MODEL_ID = "anthropic/claude-haiku-4.5"; | ||
| | ||
| export async function reviewMcpCall(entry: { | ||
| correlationId: string; | ||
| question: string; | ||
| reason: string; | ||
| response: string; | ||
| }): Promise<void> { | ||
| const apiKey = getEnvVariable("STACK_OPENROUTER_API_KEY", ""); | ||
| if (!apiKey || apiKey === "FORWARD_TO_PRODUCTION") { | ||
| return; | ||
| } | ||
| | ||
| let devinClient: Awaited<ReturnType<typeof createMCPClient>> | null = null; | ||
| | ||
| const failureUpdate = (err: unknown) => ({ | ||
| qaNeedsHumanReview: true, | ||
| qaAnswerCorrect: false, | ||
| qaAnswerRelevant: false, | ||
| qaFlagsJson: "[]", | ||
| qaImprovementSuggestions: "", | ||
| qaOverallScore: 0, | ||
| qaConversationJson: undefined, | ||
| qaErrorMessage: String(err), | ||
| }); | ||
| | ||
| let update: { | ||
| qaNeedsHumanReview: boolean, | ||
| qaAnswerCorrect: boolean, | ||
| qaAnswerRelevant: boolean, | ||
| qaFlagsJson: string, | ||
| qaImprovementSuggestions: string, | ||
| qaOverallScore: number, | ||
| qaConversationJson: string | undefined, | ||
| qaErrorMessage: string | undefined, | ||
| }; | ||
| | ||
| try { | ||
| // Wait for the log row to be written first | ||
| await new Promise(r => setTimeout(r, 3000)); | ||
| | ||
| devinClient = await createMCPClient({ | ||
| transport: { | ||
| type: "http", | ||
| url: "https://mcp.deepwiki.com/mcp", | ||
| }, | ||
| }); | ||
| | ||
| const devinTools = await devinClient.tools(); | ||
| const openrouter = createOpenRouterProvider(); | ||
| const model = openrouter(REVIEW_MODEL_ID); | ||
| | ||
| const userMessage = [ | ||
| "## Question", | ||
| entry.question, | ||
| "", | ||
| "## Agent's Reason for Asking", | ||
| entry.reason, | ||
| "", | ||
| "## AI Response", | ||
| entry.response, | ||
| ].join("\n"); | ||
| | ||
| const verifiedQa = await getVerifiedQaContext(); | ||
| | ||
| const result = await generateText({ | ||
| model, | ||
| system: QA_SYSTEM_PROMPT + verifiedQa, | ||
| tools: devinTools as Parameters<typeof generateText>[0]["tools"], | ||
| stopWhen: stepCountIs(10), | ||
| messages: [{ role: "user", content: userMessage }], | ||
| }); | ||
| | ||
| const conversation = result.steps.map((step, i) => { | ||
| const toolCalls = step.toolCalls.map(tc => ({ toolName: tc.toolName, args: tc.input })); | ||
| const toolResults = step.toolResults.map(tr => ({ | ||
| toolName: tr.toolName, | ||
| toolCallId: tr.toolCallId, | ||
| result: tr.output, | ||
| })); | ||
| return { | ||
| step: i + 1, | ||
| text: step.text || undefined, | ||
| toolCalls: toolCalls.length > 0 ? toolCalls : undefined, | ||
| toolResults: toolResults.length > 0 ? toolResults : undefined, | ||
| }; | ||
| }); | ||
| | ||
| const jsonMatch = result.text.match(/\{[\s\S]*\}/); | ||
| if (!jsonMatch) { | ||
| throw new Error("No JSON found in QA review response"); | ||
| } | ||
| const parsed = JSON.parse(jsonMatch[0]) as { | ||
| needsHumanReview: boolean, | ||
| answerCorrect: boolean, | ||
| answerRelevant: boolean, | ||
| flags: Array<{ type: string, severity: string, explanation: string }>, | ||
| improvementSuggestions: string, | ||
| overallScore: number, | ||
| }; | ||
| | ||
| update = { | ||
| qaNeedsHumanReview: parsed.needsHumanReview, | ||
| qaAnswerCorrect: parsed.answerCorrect, | ||
| qaAnswerRelevant: parsed.answerRelevant, | ||
| qaFlagsJson: JSON.stringify(parsed.flags), | ||
| qaImprovementSuggestions: parsed.improvementSuggestions, | ||
| qaOverallScore: parsed.overallScore, | ||
| qaConversationJson: JSON.stringify(conversation), | ||
| qaErrorMessage: undefined, | ||
| }; | ||
| } catch (err) { | ||
| captureError("qa-reviewer", err instanceof Error ? err : new Error(String(err))); | ||
| update = failureUpdate(err); | ||
| } | ||
| | ||
| if (devinClient) { | ||
| await devinClient.close().catch((err: unknown) => { | ||
| captureError("qa-reviewer", err instanceof Error ? err : new Error(String(err))); | ||
| }); | ||
| } | ||
| | ||
| const conn = await getConnection(); | ||
| if (!conn) return; | ||
| const token = getEnvVariable("STACK_MCP_LOG_TOKEN"); | ||
| await conn.reducers.updateMcpQaReview({ | ||
| token, | ||
| correlationId: entry.correlationId, | ||
| qaReviewModelId: REVIEW_MODEL_ID, | ||
| ...update, | ||
| }).catch((err: unknown) => { | ||
| captureError("qa-reviewer", err instanceof Error ? err : new Error(String(err))); | ||
| }); | ||
| } |
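
Review note: the reviewer above extracts the first `{...}` span from the model output and casts the `JSON.parse` result to the expected shape, so a reply with missing or mistyped fields is not detected at that point. One possible hardening, sketched here under assumptions, is to validate the fields at runtime and re-apply the prompt's escalation rule (score below 50 or any critical flag). `parseQaReview` and its helpers are hypothetical names, not part of this PR.

```typescript
type Severity = "low" | "medium" | "high" | "critical";

interface QaFlag {
  type: string;
  severity: Severity;
  explanation: string;
}

interface QaReview {
  needsHumanReview: boolean;
  answerCorrect: boolean;
  answerRelevant: boolean;
  flags: QaFlag[];
  improvementSuggestions: string;
  overallScore: number;
}

const SEVERITIES: readonly string[] = ["low", "medium", "high", "critical"];

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isQaFlag(value: unknown): value is QaFlag {
  if (!isRecord(value)) {
    return false;
  }
  const { type: flagType, severity, explanation } = value;
  return typeof flagType === "string"
    && typeof severity === "string"
    && SEVERITIES.indexOf(severity) !== -1
    && typeof explanation === "string";
}

// Hypothetical replacement for the unchecked cast (not part of the PR).
function parseQaReview(text: string): QaReview {
  // Same greedy "first { to last }" extraction the diff uses.
  const match = text.match(/\{[\s\S]*\}/);
  if (!match) {
    throw new Error("No JSON found in QA review response");
  }
  const raw: unknown = JSON.parse(match[0]);
  if (!isRecord(raw)) {
    throw new Error("QA review is not a JSON object");
  }
  const {
    needsHumanReview,
    answerCorrect,
    answerRelevant,
    improvementSuggestions,
    overallScore,
    flags: rawFlags,
  } = raw;
  if (
    typeof needsHumanReview !== "boolean"
    || typeof answerCorrect !== "boolean"
    || typeof answerRelevant !== "boolean"
    || typeof improvementSuggestions !== "string"
    || typeof overallScore !== "number"
    || !(overallScore >= 0 && overallScore <= 100) // also rejects NaN
  ) {
    throw new Error("QA review has missing or mistyped fields");
  }
  if (!Array.isArray(rawFlags)) {
    throw new Error("QA review flags must be an array");
  }
  const flags: QaFlag[] = rawFlags.filter(isQaFlag);
  if (flags.length !== rawFlags.length) {
    throw new Error("QA review has malformed flags");
  }
  // Assumption: enforce the prompt's escalation rule even if the model forgot it.
  const mustEscalate = overallScore < 50
    || flags.some((flag) => flag.severity === "critical");
  return {
    needsHumanReview: needsHumanReview || mustEscalate,
    answerCorrect,
    answerRelevant,
    flags,
    improvementSuggestions,
    overallScore,
  };
}
```

In the PR's flow a thrown error would land in the existing `catch` branch and produce the `failureUpdate` row, so a stricter parser would need no new error path.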
18 changes: 18 additions & 0 deletions
apps/backend/src/lib/ai/spacetimedb-bindings/add_manual_qa_reducer.ts
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| // THIS FILE IS AUTOMATICALLY GENERATED BY SPACETIMEDB. EDITS TO THIS FILE | ||
| // WILL NOT BE SAVED. MODIFY TABLES IN YOUR MODULE SOURCE CODE INSTEAD. | ||
| | ||
| /* eslint-disable */ | ||
| /* tslint:disable */ | ||
| import { | ||
| TypeBuilder as __TypeBuilder, | ||
| t as __t, | ||
| type AlgebraicTypeType as __AlgebraicTypeType, | ||
| type Infer as __Infer, | ||
| } from "spacetimedb"; | ||
| | ||
| export default { | ||
| question: __t.string(), | ||
| answer: __t.string(), | ||
| publish: __t.bool(), | ||
| reviewedBy: __t.string(), | ||
| }; |
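
Review note: the generated schema above declares four reducer arguments (`question`, `answer`, `publish`, `reviewedBy`). As a rough illustration of the payload a caller would assemble, here is a plain-TypeScript mirror of those fields with a small builder. The interface, the `buildAddManualQaArgs` helper, the trimming, and the draft-by-default behaviour are all assumptions made for this sketch and do not appear in the PR.

```typescript
// Plain-TypeScript mirror of the four generated fields. The generated file
// itself uses SpacetimeDB's `t` builders rather than an interface.
interface AddManualQaArgs {
  question: string;
  answer: string;
  publish: boolean;
  reviewedBy: string;
}

// Hypothetical helper (not in the PR): normalizes input before a reducer call.
function buildAddManualQaArgs(input: {
  question: string;
  answer: string;
  reviewedBy: string;
  publish?: boolean;
}): AddManualQaArgs {
  const question = input.question.trim();
  const answer = input.answer.trim();
  if (question === "" || answer === "") {
    throw new Error("question and answer must be non-empty");
  }
  return {
    question,
    answer,
    // Assumption: entries stay unpublished unless the caller opts in.
    publish: input.publish === true,
    reviewedBy: input.reviewedBy,
  };
}
```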