This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
GoSQLX is a production-ready, race-free, high-performance SQL parsing SDK for Go that provides lexing, parsing, and AST generation with zero-copy optimizations. The library is designed for enterprise use with comprehensive object pooling for memory efficiency.
Requirements: Go 1.26+ (upgraded from 1.23 to fix stdlib vulnerabilities; mark3labs/mcp-go requires 1.23)
Production Status: ✅ Validated for production deployment (v1.6.0+, current: v1.14.0)
- Thread-safe with zero race conditions (20,000+ concurrent operations tested)
- 1.38M+ ops/sec sustained, 1.5M peak with memory-efficient object pooling
- ~80-85% SQL-99 compliance (window functions, CTEs, set operations, MERGE, etc.)
- Multi-dialect support: PostgreSQL, MySQL, MariaDB, SQL Server, Oracle, SQLite, Snowflake, ClickHouse (8 dialects)
- Tokenizer (`pkg/sql/tokenizer/`): Zero-copy SQL lexer with full UTF-8 support
- Parser (`pkg/sql/parser/`): Recursive descent parser with one-token lookahead
- AST (`pkg/sql/ast/`): Abstract Syntax Tree nodes with visitor pattern support
- Keywords (`pkg/sql/keywords/`): Multi-dialect SQL keyword definitions
- Models (`pkg/models/`): Core data structures (tokens, spans, locations)
- Errors (`pkg/errors/`): Structured error handling with position tracking
- Metrics (`pkg/metrics/`): Production performance monitoring
- Security (`pkg/sql/security/`): SQL injection detection with severity classification
- Linter (`pkg/linter/`): SQL linting engine with 30 built-in rules (L001-L030)
- LSP (`pkg/lsp/`): Language Server Protocol for IDE integration
- GoSQLX (`pkg/gosqlx/`): High-level simple API (recommended for most users)
- Compatibility (`pkg/compatibility/`): API stability testing
Raw SQL bytes → tokenizer.Tokenize() → []models.TokenWithSpan
→ parser.ParseFromModelTokens() → *ast.AST
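The data flow above can be pictured with a toy stand-in. Everything in this sketch (`tokenWithSpan`, `span`, `tokenize`) is invented for illustration; the library's real types are `models.TokenWithSpan` and the real entry points are `tokenizer.Tokenize()` and `parser.ParseFromModelTokens()`.

```go
package main

import "fmt"

// Toy illustration of the pipeline: raw bytes -> tokens carrying byte spans.
// These names are simplified stand-ins, not the library's actual API.
type span struct{ Start, End int }

type tokenWithSpan struct {
	Text string
	Span span
}

// tokenize splits on single spaces while recording byte offsets, mimicking
// how a zero-copy lexer attaches source positions to each token.
func tokenize(sql []byte) []tokenWithSpan {
	var toks []tokenWithSpan
	i := 0
	for i < len(sql) {
		if sql[i] == ' ' {
			i++
			continue
		}
		j := i
		for j < len(sql) && sql[j] != ' ' {
			j++
		}
		toks = append(toks, tokenWithSpan{Text: string(sql[i:j]), Span: span{i, j}})
		i = j
	}
	return toks
}

func main() {
	for _, t := range tokenize([]byte("SELECT id FROM users")) {
		fmt.Printf("%-6s [%d:%d]\n", t.Text, t.Span.Start, t.Span.End)
	}
}
```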
The codebase uses extensive sync.Pool for all major data structures:
- `ast.NewAST()` / `ast.ReleaseAST()` - AST containers
- `tokenizer.GetTokenizer()` / `tokenizer.PutTokenizer()` - Tokenizer instances
- Individual pools for SELECT, INSERT, UPDATE, DELETE statements
- Expression pools for identifiers, binary expressions, literals
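The get/defer-put discipline these pools rely on can be sketched with a self-contained `sync.Pool` example. `bufPool` and `render` are invented for this illustration; the library's real pairs are `ast.NewAST()`/`ast.ReleaseAST()` and `tokenizer.GetTokenizer()`/`tokenizer.PutTokenizer()`.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// Illustrative pool of reusable buffers; not part of the GoSQLX API.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func render(sql string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // pooled objects may hold stale state from a prior use
	defer bufPool.Put(buf) // MANDATORY: return to the pool on every exit path
	buf.WriteString("-- normalized: ")
	buf.WriteString(sql)
	return buf.String()
}

func main() {
	fmt.Println(render("SELECT * FROM users"))
}
```

The `defer` immediately after `Get` is the key habit: it guarantees the object is returned even on early returns or panics, which is why the low-level API examples below mark the matching `Put`/`Release` calls as mandatory.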
Clean hierarchy with minimal coupling (verified against production imports):
# Core parsing chain
models → (no deps)
errors → models
metrics → (no deps)
keywords → (no deps)
token → (no deps)
tokenizer → models, errors, metrics, keywords
ast → models, metrics
parser → models, errors, keywords, token, tokenizer, ast
# Higher-level / product packages
formatter → models, sql/ast, sql/parser, sql/tokenizer
transform → formatter, sql/ast, sql/keywords, sql/parser, sql/tokenizer
fingerprint→ formatter, sql/ast, sql/parser, sql/tokenizer
security → sql/ast (scanner; tests also pull parser, tokenizer)
linter → sql/parser, sql/tokenizer
# rule sub-packages additionally import: linter, models, sql/ast
lsp → errors, models, gosqlx, sql/keywords, sql/parser, sql/tokenizer
cbinding → gosqlx, sql/ast (requires CGO; excluded from task test:race)
# High-level wrapper
gosqlx → all of the above (top-level convenience API)
Notes:
- `pkg/cbinding` requires `CGO_ENABLED=1`. The Taskfile splits this out: `task test:race` runs everything except cbinding, and `task test:cbinding` runs cbinding with CGO on. CI workflows must follow the same split or cbinding is silently skipped.
- `keywords` has no intra-module deps; it's a pure keyword table.
- `ast` depends on `models` (spans, locations) and `metrics` (pool instrumentation), NOT on `token` in production code.
This project uses Task as the task runner:
go install github.com/go-task/task/v3/cmd/task@latest
# Or: brew install go-task (macOS)

task # Show all available tasks
task build # Build all packages
task build:cli # Build CLI binary
task install # Install CLI globally
task test # Run all tests
task test:race # Run tests with race detection (CRITICAL)
task test:pkg PKG=./pkg/sql/parser # Test specific package
task bench # Run benchmarks with memory tracking
task coverage # Generate coverage report
task quality # Run fmt, vet, lint
task check # Full suite: format, vet, lint, test:race
task ci # Full CI pipeline

go test -v -run TestSpecificName ./pkg/sql/parser/
go test -v -run "TestParser_Window.*" ./pkg/sql/parser/
go test -v -run "TestParser_TupleIn/Basic" ./pkg/sql/parser/ # Run specific subtest

./gosqlx validate "SELECT * FROM users"
./gosqlx format -i query.sql
./gosqlx analyze "SELECT COUNT(*) FROM orders GROUP BY status"
./gosqlx parse -f json query.sql
./gosqlx lsp # Start LSP server
./gosqlx lint query.sql # Run linter

Always use defer with pool return functions:
// High-level API (recommended for most use cases)
ast, err := gosqlx.Parse("SELECT * FROM users")
// No cleanup needed - handled automatically
// Low-level API (for fine-grained control)
tkz := tokenizer.GetTokenizer()
defer tokenizer.PutTokenizer(tkz) // MANDATORY
astObj := ast.NewAST()
defer ast.ReleaseAST(astObj) // MANDATORY

- Recursive descent with one-token lookahead
- Main file: `pkg/sql/parser/parser.go`
- Window functions: `parseFunctionCall()`, `parseWindowSpec()`, `parseWindowFrame()`
- CTEs: WITH clause with RECURSIVE support
- Set operations: UNION/EXCEPT/INTERSECT with left-associative parsing
- JOINs: All types with proper left-associative tree logic
- Always check errors from tokenizer and parser
- Errors include position information (`models.Location`)
- Error codes: E1001-E3004 for tokenizer, parser, and semantic errors
- Use `pkg/errors/` for structured error creation
Always use the two-value form for type assertions to avoid panics:
stmt, ok := tree.Statements[0].(*ast.SelectStatement)
if !ok {
t.Fatalf("expected SelectStatement, got %T", tree.Statements[0])
}

task test:race                   # Primary method
go test -race -timeout 60s ./... # Direct command

- `pkg/models/`: 100% - All core data structures
- `pkg/sql/ast/`: 73.4% - AST nodes
- `pkg/sql/tokenizer/`: 76.1% - Zero-copy operations
- `pkg/sql/parser/`: 76.1% - All SQL features
- `pkg/errors/`: 95.6% - Error handling
task bench # All benchmarks
go test -bench=BenchmarkName -benchmem ./pkg/sql/parser/ # Specific benchmark
go test -bench=. -benchmem -cpuprofile=cpu.prof ./pkg/... # With profiling

- Baselines defined in `performance_baselines.json` at project root
- CI environment variability may require baseline adjustments (tolerance %)
- Run locally: `go test -run TestPerformanceRegression ./pkg/sql/parser/`
- Automatically skipped under the race detector (which adds 3-5x overhead)
- Update tokens in `pkg/models/token.go` (if needed)
- Add keywords to `pkg/sql/keywords/` (if needed)
- Extend AST nodes in `pkg/sql/ast/`
- Add parsing logic in `pkg/sql/parser/parser.go`
- Write comprehensive tests
- Run: `task test:race && task bench`
- Update CHANGELOG.md
go test -v -run TestTokenizer_YourTest ./pkg/sql/tokenizer/
go test -v -run TestParser_YourTest ./pkg/sql/parser/

Use the visitor pattern in `pkg/sql/ast/visitor.go` to traverse and inspect the AST.
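The traversal idea can be shown with a self-contained toy tree. Every name in this sketch (`node`, `identifier`, `binaryExpr`, `walk`) is invented for illustration; consult `pkg/sql/ast/visitor.go` for the library's actual visitor interface.

```go
package main

import "fmt"

// Toy AST: a node exposes its children, and walk performs a pre-order visit.
type node interface{ children() []node }

type identifier struct{ Name string }

type binaryExpr struct {
	Op          string
	Left, Right node
}

func (identifier) children() []node   { return nil }
func (b binaryExpr) children() []node { return []node{b.Left, b.Right} }

// walk visits n first, then recurses into its children (pre-order traversal).
func walk(n node, visit func(node)) {
	visit(n)
	for _, c := range n.children() {
		walk(c, visit)
	}
}

func main() {
	// Roughly the expression `price > cost` from a WHERE clause.
	expr := binaryExpr{Op: ">", Left: identifier{"price"}, Right: identifier{"cost"}}
	var names []string
	walk(expr, func(n node) {
		if id, ok := n.(identifier); ok {
			names = append(names, id.Name)
		}
	})
	fmt.Println(names)
}
```

Passing a closure to the walk keeps inspection logic (here, collecting identifier names) separate from the traversal itself, which is the point of the visitor pattern.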
CRITICAL: Main branch is protected. Never create tags in feature branches.
# 1. Develop in feature branch
git checkout -b feature/branch-name
# ... make changes, update CHANGELOG.md as [Unreleased] ...
git push origin feature/branch-name
# 2. Create PR and get it merged
# 3. After merge, create docs PR for release finalization
git checkout main && git pull
git checkout -b docs/vX.Y.Z-release
# Update CHANGELOG.md with version and date
git push origin docs/vX.Y.Z-release
# 4. After docs PR merged, create tag
git checkout main && git pull
git tag vX.Y.Z -a -m "vX.Y.Z: Release notes"
git push origin vX.Y.Z
# 5. Create GitHub release
gh release create vX.Y.Z --title "vX.Y.Z: Title" --notes "..."

The repository has pre-commit hooks that run:
- `go fmt` - Code formatting
- `go vet` - Static analysis
- `go test -short` - Short test suite
Install with: task hooks:install
- `docs/GETTING_STARTED.md` - Quick start guide
- `docs/USAGE_GUIDE.md` - Comprehensive usage patterns
- `docs/LSP_GUIDE.md` - LSP server and IDE integration
- `docs/LINTING_RULES.md` - All 30 linting rules reference
- `docs/SQL_COMPATIBILITY.md` - SQL dialect compatibility matrix
- `docs/ARCHITECTURE.md` - Detailed system design
- https://gosqlx.dev - Official website with interactive playground
- https://gosqlx.dev/playground/ - WASM-powered SQL playground