Semantic movie search powered by Retrieval-Augmented Generation.
Live demo: sah.codes/film
A personal playground for experimenting with RAG strategies and LLM-assisted development workflows. The code is intentionally abstract in places so that components are easy to swap.
Evolution:
- Vector DB: Milvus (after evaluating ChromaDB, Qdrant)
- LLM: Gemini (Claude was slow to respond)
- Metadata: SQLite → PostgreSQL
- Data source: MovieLens → TMDB/IMDB
- Ratings: OMDB API → TMDB API / scraping
- Frontend: Tried Astro/Vue.js, settled on htmx — better for LLM-driven development
The RAG config optimization approach may be worth extracting into a standalone library.
- Unified RAG pipeline: Single configurable strategy with tunable parameters
- Hybrid search: Dense embeddings + BM25 sparse vectors with RRF/weighted fusion
- HyDE: Hypothetical Document Embeddings for query expansion
- Cross-encoder reranking: Quality-based result refinement
- Parameter optimization: Genetic algorithm, simulated annealing, and Optuna-based tuning against ground truth
- Swappable vector backends: Milvus (default), ChromaDB, Qdrant
- Metadata enrichment: On-demand data from TMDB/IMDB
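The two fusion methods named in the hybrid search bullet can be sketched in a few lines. This is a minimal illustration, not the project's code and not how Milvus implements its rankers internally: the function names `rrf_fuse` and `weighted_fuse`, the sample movie ids, and the assumption that scores are already normalized to [0, 1] are all made up for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the order is stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def weighted_fuse(
    dense: dict[str, float],
    sparse: dict[str, float],
    dense_weight: float = 0.8,
    sparse_weight: float = 0.2,
) -> list[tuple[str, float]]:
    """Weighted sum of scores. ASSUMPTION: inputs are normalized to [0, 1]."""
    doc_ids = set(dense) | set(sparse)
    fused = {
        d: dense_weight * dense.get(d, 0.0) + sparse_weight * sparse.get(d, 0.0)
        for d in doc_ids
    }
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a dense (embedding) and a sparse (BM25) search.
dense_ranking = ["looper", "primer", "arrival"]
sparse_ranking = ["primer", "tenet", "looper"]
rrf_result = rrf_fuse([dense_ranking, sparse_ranking])

weighted_result = weighted_fuse(
    {"looper": 0.9, "primer": 0.8, "arrival": 0.4},
    {"primer": 1.0, "tenet": 0.7, "looper": 0.2},
)
```

RRF only looks at ranks, so it needs no score normalization. The weighted variant uses the raw scores and therefore depends on both retrievers producing comparable ranges.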
The pipeline is controlled by RAGConfig with these tunable parameters:
| Parameter | Default | Description |
|---|---|---|
| `rrf_k` | 60 | Reciprocal Rank Fusion constant |
| `use_cross_encoder` | true | Enable reranking |
| `hyde_enabled` | true | Enable HyDE query expansion |
| `milvus_ranker_type` | weighted | Fusion method: rrf or weighted |
| `milvus_dense_weight` | 0.8 | Dense vector weight |
| `milvus_sparse_weight` | 0.2 | BM25 sparse weight |
| `rating_weight` | 0.1 | Quality boost from ratings |
| `popularity_weight` | 0.1 | Quality boost from popularity |
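As a rough picture of how these parameters might fit together, here is a stand-in dataclass using the field names and defaults from the table. The class name `RAGConfigSketch`, the validation rules, and in particular the additive `boosted_score` formula are assumptions made for illustration. The real definition lives in `rag_config.py` and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RAGConfigSketch:
    """Illustrative stand-in for RAGConfig; defaults copied from the table."""

    rrf_k: int = 60
    use_cross_encoder: bool = True
    hyde_enabled: bool = True
    milvus_ranker_type: str = "weighted"
    milvus_dense_weight: float = 0.8
    milvus_sparse_weight: float = 0.2
    rating_weight: float = 0.1
    popularity_weight: float = 0.1

    def __post_init__(self) -> None:
        # ASSUMPTION: these checks are illustrative, not taken from the project.
        if self.milvus_ranker_type not in ("rrf", "weighted"):
            raise ValueError(f"unknown ranker type: {self.milvus_ranker_type!r}")
        if self.rrf_k <= 0:
            raise ValueError("rrf_k must be positive")


def boosted_score(
    relevance: float, rating: float, popularity: float, config: RAGConfigSketch
) -> float:
    """ASSUMPTION: an additive boost with rating and popularity scaled to [0, 1]."""
    return (
        relevance
        + config.rating_weight * rating
        + config.popularity_weight * popularity
    )


default_config = RAGConfigSketch()
rrf_config = RAGConfigSketch(milvus_ranker_type="rrf", rrf_k=30)
example_score = boosted_score(0.7, rating=0.9, popularity=0.5, config=default_config)
```

Keeping the config frozen makes each candidate produced by an optimizer an independent value, which is convenient when many configurations are compared against the same ground truth.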
- Python 3.14+ (the code uses uuid.uuid7, which was added to the standard library in 3.14)
- API keys for Claude or Gemini (required for HyDE) and TMDB (optional)
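The Python version requirement comes down to a single standard-library function. A small guard like the following makes the failure mode explicit on older interpreters; the helper names `uuid7_available` and `new_record_id` are hypothetical and not part of this project.

```python
import sys
import uuid


def uuid7_available() -> bool:
    """True when the running interpreter's uuid module provides uuid7()."""
    return hasattr(uuid, "uuid7")


def new_record_id() -> str:
    """Return a time-ordered UUIDv7 string, or fail with a clear message."""
    if not uuid7_available():
        version = ".".join(map(str, sys.version_info[:3]))
        raise RuntimeError(f"uuid.uuid7 needs Python 3.14+, found {version}")
    return str(uuid.uuid7())
```

UUIDv7 values start with a timestamp, so ids generated later sort after earlier ones, which tends to suit database primary keys.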
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
Required environment variables:
| Variable | Description |
|---|---|
| `ANTHROPIC_API_KEY` | Claude API key for HyDE |
| `ADMIN_TOKEN` | Bearer token for admin endpoints |
| `TMDB_API_KEY` | TMDB API key (optional, for enrichment) |
uvicorn src.app:app --reload
- UI: http://localhost:8000/
- API docs: http://localhost:8000/docs
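Once the server is running, the `/query` endpoint can also be called from Python. The sketch below builds the same JSON body as the curl example that follows, using only the standard library. The helper names `build_query_request` and `search` are made up for this example, and the response is returned as parsed JSON without assuming anything about its shape.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default local address used in this README


def build_query_request(
    query: str, top_k: int = 10, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build a POST request for /query with a JSON body."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, top_k: int = 10):
    """Send the request and return the decoded JSON response as-is."""
    request = build_query_request(query, top_k)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; search() does.
request = build_query_request("time travel movies")
```

Calling `search("time travel movies")` requires the uvicorn server to be up.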
curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-d '{"query": "time travel movies", "top_k": 10}'
Tune RAG parameters against your ground truth:
python scripts/run_optuna.py
Ground truth format: see ground_truth.example.json
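To give a feel for what tuning against ground truth involves, here is a toy simulated annealing loop over a single parameter, the dense weight. Everything in it is an assumption for illustration: the in-memory `GROUND_TRUTH` and `CANDIDATES` dictionaries are invented and do not follow the format in ground_truth.example.json, the sparse weight is taken as `1 - dense_weight`, and the objective is plain recall@k. The project's own optimizers are in the optimization package and are not reproduced here.

```python
import math
import random

# ASSUMPTION: toy ground truth, query -> set of relevant movie ids.
GROUND_TRUTH = {"time travel": {"primer", "looper"}, "heist": {"heat"}}

# ASSUMPTION: toy candidates, query -> {movie id: (dense score, sparse score)}.
CANDIDATES = {
    "time travel": {
        "primer": (0.6, 0.9), "looper": (0.8, 0.3),
        "tenet": (0.7, 0.2), "arrival": (0.75, 0.1),
    },
    "heist": {"heat": (0.5, 0.9), "ronin": (0.7, 0.2), "drive": (0.72, 0.1)},
}


def recall_at_k(dense_weight: float, k: int = 2) -> float:
    """Mean recall@k over all queries for a given dense weight."""
    total = 0.0
    for query, relevant in GROUND_TRUTH.items():
        scored = {
            doc: dense_weight * dense + (1.0 - dense_weight) * sparse
            for doc, (dense, sparse) in CANDIDATES[query].items()
        }
        top = sorted(scored, key=lambda doc: (-scored[doc], doc))[:k]
        total += len(relevant & set(top)) / len(relevant)
    return total / len(GROUND_TRUTH)


def anneal(steps: int = 200, start: float = 0.8, temperature: float = 1.0,
           cooling: float = 0.97, seed: int = 0) -> tuple[float, float]:
    """Simulated annealing over dense_weight in [0, 1]; returns (best, score)."""
    rng = random.Random(seed)
    current, current_score = start, recall_at_k(start)
    best, best_score = current, current_score
    for _ in range(steps):
        # Propose a nearby weight, clamped to the valid range.
        candidate = min(1.0, max(0.0, current + rng.gauss(0.0, 0.1)))
        score = recall_at_k(candidate)
        delta = score - current_score
        # Accept improvements always, regressions with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, score
        if current_score > best_score:
            best, best_score = current, current_score
        temperature = max(temperature * cooling, 1e-6)
    return best, best_score


best_weight, best_recall = anneal()
```

In this toy data the default weight of 0.8 misses one relevant movie, so the loop has room to improve. A real run would evaluate each candidate by executing the full pipeline, which is far more expensive per step.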
backend/src/lib/
├── rag/ # Strategy, HyDE, merging, reranking
├── vectordb/ # Protocol-based DB clients
├── metadata/ # PostgreSQL store + enrichment
├── llm/ # Claude, Gemini clients
├── optimization/ # Parameter optimizers
└── rag_config.py # Configuration dataclass
BSD 3-Clause License. See LICENSE.