This directory provides a PASA service layout that can be called directly by DeepReviewer:
- Two vLLM OpenAI-compatible inference services (crawler + selector)
- One Flask orchestrator (`pasa_server.py`)
- One unified start/stop script (`start_pasa_server.sh`)
DeepReviewer accesses PASA with:

```bash
PAPER_SEARCH_BASE_URL=http://127.0.0.1:8001
PAPER_SEARCH_ENDPOINT=/pasa/search
```
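As a sketch of how a client would combine these two variables to call the search endpoint (the `requests` usage and the `pasa_search` helper are illustrative assumptions, not DeepReviewer's actual client code):

```python
import os

def search_url(base_url: str, endpoint: str) -> str:
    # Join PAPER_SEARCH_BASE_URL and PAPER_SEARCH_ENDPOINT without a double slash.
    return base_url.rstrip("/") + endpoint

def pasa_search(query: str, post=None, **params):
    # `post` defaults to requests.post; injectable so the logic is testable offline.
    if post is None:
        import requests  # listed in the pip install step below
        post = requests.post
    url = search_url(
        os.environ.get("PAPER_SEARCH_BASE_URL", "http://127.0.0.1:8001"),
        os.environ.get("PAPER_SEARCH_ENDPOINT", "/pasa/search"),
    )
    return post(url, json={"query": query, **params}, timeout=600)
```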
- PASA official repository: https://github.com/bytedance/pasa
- Official README: https://github.com/bytedance/pasa/blob/main/README.md
- Crawler model: https://huggingface.co/bytedance-research/pasa-7b-crawler
- Selector model: https://huggingface.co/bytedance-research/pasa-7b-selector
- Dataset: https://huggingface.co/datasets/CarlanLark/pasa-dataset
- Serper API signup: https://serper.dev/
- Linux + NVIDIA GPU
- Python 3.10+ (3.11 recommended)
- Working CUDA environment (compatible with your PyTorch / vLLM versions)
- Network access to Hugging Face, arXiv, and Serper
Run this in your Python environment:
```bash
cd <repo_root>/pasa
pip install --upgrade pip
pip install \
  torch transformers \
  vllm "openai>=1.52,<1.76" \
  flask flask-cors \
  requests httpx arxiv \
  beautifulsoup4 lxml
```

Notes:
- `start_pasa_server.sh` checks `import vllm` before startup.
- `pasa/pasa/utils.py` loads the local paper DB at import time, so configure those file paths first.
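Because a bad path only surfaces when `pasa/pasa/utils.py` is imported, a small pre-flight check run before the startup script can fail fast. This helper is hypothetical (not part of the repo); the variable names match the `.pasa_env` example below:

```python
import os

# Env vars whose values must exist on disk before startup:
# model directories plus the local paper DB files.
REQUIRED_PATH_VARS = ["PASA_CRAWLER_PATH", "PASA_SELECTOR_PATH",
                      "PASA_PAPER_DB", "PASA_PAPER_ID"]

def preflight_problems(env):
    # Return human-readable problems; an empty list means the paths look OK.
    problems = []
    for var in REQUIRED_PATH_VARS:
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set")
        elif not os.path.exists(value):
            problems.append(f"{var} points to a missing path: {value}")
    return problems
```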
Example using huggingface-cli:

```bash
# crawler
huggingface-cli download bytedance-research/pasa-7b-crawler \
  --local-dir /data/models/pasa-7b-crawler

# selector
huggingface-cli download bytedance-research/pasa-7b-selector \
  --local-dir /data/models/pasa-7b-selector
```

Then point the paths in `.pasa_env` to your local model directories.
The current code reads:
- `PASA_PAPER_DB` (for example: `cs_paper_2nd.zip`)
- `PASA_PAPER_ID` (for example: `id2paper.json`)

Download them from the official dataset page, store them locally, then configure them in `.pasa_env`.
Recommended setup:
```bash
cd <repo_root>/pasa
cp .pasa_env.example .pasa_env.local
vim .pasa_env.local
```

Notes:
- `pasa_server.py` loads env files in this order: `$PASA_ENV_FILE` -> `.pasa_env.local` -> `.pasa_env`
- Keep machine-specific settings in `.pasa_env.local` and do not commit them.
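The load order described above can be sketched as a first-existing-file-wins lookup (this mirrors the description, not the actual `pasa_server.py` code):

```python
import os

def pick_env_file(env=None, exists=os.path.exists):
    # First existing candidate wins: $PASA_ENV_FILE, then .pasa_env.local,
    # then .pasa_env. `exists` is injectable for testing.
    env = os.environ if env is None else env
    for path in [env.get("PASA_ENV_FILE"), ".pasa_env.local", ".pasa_env"]:
        if path and exists(path):
            return path
    return None
```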
Key configuration example:

```bash
# GPU
PASA_GPU_ID=1

# Flask server
PASA_SERVER_HOST=0.0.0.0
PASA_SERVER_PORT=8001

# Model paths (must exist)
PASA_CRAWLER_PATH=/data/models/pasa-7b-crawler
PASA_SELECTOR_PATH=/data/models/pasa-7b-selector
PASA_PROMPTS_PATH=pasa/agent_prompt.json

# vLLM service endpoints
PASA_VLLM_HOST=127.0.0.1
PASA_VLLM_CRAWLER_PORT=8101
PASA_VLLM_SELECTOR_PORT=8102
PASA_VLLM_CRAWLER_URL=http://127.0.0.1:8101/v1
PASA_VLLM_SELECTOR_URL=http://127.0.0.1:8102/v1
PASA_VLLM_CRAWLER_MODEL_NAME=pasa-crawler
PASA_VLLM_SELECTOR_MODEL_NAME=pasa-selector

# Serper key (read from the environment; no longer hardcoded)
PASA_SERPER_API_KEY=your_serper_api_key
PASA_SERPER_SEARCH_URL=https://google.serper.dev/search

# Local paper DB (set real paths)
PASA_PAPER_DB=/data/pasa/cs_paper_2nd.zip
PASA_PAPER_ID=/data/pasa/id2paper.json
```

Start in the foreground:

```bash
cd <repo_root>/pasa
bash start_pasa_server.sh
```

Start in the background:

```bash
cd <repo_root>/pasa
bash start_pasa_server.sh --background
```

Stop:

```bash
cd <repo_root>/pasa
bash start_pasa_server.sh --stop
```

Restart:

```bash
cd <repo_root>/pasa
bash start_pasa_server.sh --restart
```

Check health:

```bash
curl http://127.0.0.1:8001/health
```

Expected fields:
- `"status": "healthy"`
- `"crawler_ready": true`
- `"selector_ready": true`
```bash
curl -X POST http://127.0.0.1:8001/pasa/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Papers about contrastive learning",
    "expand_layers": 1,
    "search_queries": 2,
    "search_papers": 5,
    "expand_papers": 5,
    "threads_num": 0
  }'
```

Run the decoupling test:

```bash
cd <repo_root>/pasa
python test_pasa_decoupling.py
```

Available endpoints:
- `GET /`
- `GET /health`
- `POST /pasa/search`
- `POST /pasa/search_async`
- `GET /pasa/jobs/<job_id>`
- `GET /pasa/jobs/<job_id>/result`
- `DELETE /pasa/jobs/<job_id>`
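Assuming the async endpoints follow the usual submit/poll pattern, a client sketch might look like the following. The endpoint paths come from the list above; the response field names (`job_id`, `status`) and the `"completed"` status value are assumptions about the API:

```python
import time

def search_async(base_url, query, post, get, attempts=60, delay=2.0):
    # Submit the query, poll the job status, then fetch the result.
    # `post(url, body)` and `get(url)` return decoded JSON; in real use inject
    # e.g. lambda u, b: requests.post(u, json=b).json()
    base = base_url.rstrip("/")
    job_id = post(f"{base}/pasa/search_async", {"query": query})["job_id"]
    for _ in range(attempts):
        if get(f"{base}/pasa/jobs/{job_id}").get("status") == "completed":
            return get(f"{base}/pasa/jobs/{job_id}/result")
        time.sleep(delay)
    raise TimeoutError(f"job {job_id} did not finish in time")
```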
Set the following in `<repo_root>/.env`:

```bash
PAPER_SEARCH_BASE_URL=http://127.0.0.1:8001
PAPER_SEARCH_ENDPOINT=/pasa/search
PAPER_SEARCH_API_KEY=
```

- `vllm` import fails
  - Ensure the Python environment used by the startup script has `vllm` installed.
- Model path does not exist
  - Check `PASA_CRAWLER_PATH` and `PASA_SELECTOR_PATH` in `.pasa_env`.
- `/health` is unhealthy
  - Ensure `PASA_VLLM_*_MODEL_NAME` matches the vLLM `--served-model-name`.
  - Check the logs: `/tmp/pasa_vllm_crawler.log`, `/tmp/pasa_vllm_selector.log`, `/tmp/pasa_server.log`.
- `/pasa/search` errors or returns empty results
  - Ensure `PASA_SERPER_API_KEY` is configured correctly.
  - Ensure your network/proxy can reach `google.serper.dev` and arXiv.
- Importing `pasa/pasa/utils.py` fails on startup
  - Usually caused by invalid `PASA_PAPER_DB` or `PASA_PAPER_ID` paths. Fix the paths and restart.
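For a quick look at all three service logs during troubleshooting, a hypothetical tail helper (the log paths come from this section; the helper itself is not part of the repo):

```python
import os

# Log files written by the start script, as named in the troubleshooting notes.
LOGS = ["/tmp/pasa_vllm_crawler.log",
        "/tmp/pasa_vllm_selector.log",
        "/tmp/pasa_server.log"]

def tail(path, n=20):
    # Return the last n lines of a log file, or a note if it is missing.
    if not os.path.exists(path):
        return [f"(missing: {path})"]
    with open(path, errors="replace") as fh:
        return fh.readlines()[-n:]

def tail_all(n=20):
    # Map each known log path to its last n lines.
    return {path: tail(path, n) for path in LOGS}
```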