A working demonstration of real-time bidirectional streaming with Google's Agent Development Kit (ADK). This FastAPI application showcases WebSocket-based communication with Gemini models, supporting multimodal requests (text, audio, and image/video input) and flexible responses (text or audio output).
This demo implements the complete ADK bidirectional streaming lifecycle:
- Application Initialization: Creates `Agent`, `SessionService`, and `Runner` at startup
- Session Initialization: Establishes `Session`, `RunConfig`, and `LiveRequestQueue` per connection
- Bidirectional Streaming: Concurrent upstream (client → queue) and downstream (events → client) tasks
- Graceful Termination: Proper cleanup of `LiveRequestQueue` and WebSocket connections
- WebSocket Communication: Real-time bidirectional streaming via `/ws/{user_id}/{session_id}`
- Multimodal Requests: Text, audio, and image/video input with automatic audio transcription
- Flexible Responses: Text or audio output, automatically determined based on model architecture
- Session Resumption: Reconnection support configured via `RunConfig`
- Concurrent Tasks: Separate upstream/downstream async tasks for optimal performance
- Interactive UI: Web interface with event console for monitoring Live API events
- Google Search Integration: Agent equipped with the `google_search` tool
The application follows ADK's recommended concurrent task pattern:
```
┌─────────────┐         ┌──────────────────┐         ┌─────────────┐
│             │         │                  │         │             │
│  WebSocket  │────────▶│ LiveRequestQueue │────────▶│  Live API   │
│   Client    │         │                  │         │   Session   │
│             │◀────────│    run_live()    │◀────────│             │
└─────────────┘         └──────────────────┘         └─────────────┘
 Upstream Task                 Queue                  Downstream Task
```
- Upstream Task: Receives WebSocket messages and forwards them to `LiveRequestQueue`
- Downstream Task: Processes `run_live()` events and sends them to the WebSocket client
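The concurrent task pattern above can be sketched with plain asyncio, independent of ADK. In this sketch the list of strings and the `sent` list are stand-ins for the real WebSocket, and `asyncio.Queue` stands in for `LiveRequestQueue`:

```python
import asyncio

async def upstream(socket_messages, queue: asyncio.Queue):
    """Forward client messages into the queue (client -> queue)."""
    for msg in socket_messages:      # stand-in for websocket.receive()
        await queue.put(msg)
        await asyncio.sleep(0)       # yield control to the downstream task
    await queue.put(None)            # sentinel: client disconnected

async def downstream(queue: asyncio.Queue, sent: list):
    """Drain events and send them to the client (events -> client)."""
    while True:
        item = await queue.get()
        if item is None:
            break
        sent.append(f"event:{item}")  # stand-in for websocket.send_text()

async def main():
    queue: asyncio.Queue = asyncio.Queue()
    sent: list = []
    # Run both directions concurrently, as the demo does with async tasks.
    await asyncio.gather(
        upstream(["hello", "world"], queue),
        downstream(queue, sent),
    )
    return sent

print(asyncio.run(main()))  # ['event:hello', 'event:world']
```

The key property this illustrates: neither direction blocks the other, because each runs as its own task and they communicate only through the queue.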
- Python 3.10 or higher
- uv (recommended) or pip
- Google API key (for Gemini Live API) or Google Cloud project (for Vertex AI Live API)
Installing uv (if not already installed):
```bash
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

Use the Agent Starter Pack to create a production-ready version of this agent with additional deployment options. The easiest way is with uvx (no install needed):

```bash
uvx agent-starter-pack create my-bidi-demo -a adk@bidi-demo
```

Alternative: Using pip and a virtual environment
```bash
# Create and activate a virtual environment
python -m venv .venv && source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install the starter pack and create your project
pip install --upgrade agent-starter-pack
agent-starter-pack create my-bidi-demo -a adk@bidi-demo
```

The starter pack will prompt you to select deployment options and provides additional production-ready features, including automated CI/CD deployment scripts.
```bash
cd src/bidi-demo
```

Using uv (recommended):

```bash
uv sync
```

This automatically creates a virtual environment, installs all dependencies, and generates a lock file for reproducible builds.
Using pip (alternative):

```bash
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .
```

Create or edit `app/.env` with your credentials:
```bash
# Choose your Live API platform
GOOGLE_GENAI_USE_VERTEXAI=FALSE

# For Gemini Live API (when GOOGLE_GENAI_USE_VERTEXAI=FALSE)
GOOGLE_API_KEY=your_api_key_here

# For Vertex AI Live API (when GOOGLE_GENAI_USE_VERTEXAI=TRUE)
# GOOGLE_CLOUD_PROJECT=your_project_id
# GOOGLE_CLOUD_LOCATION=us-central1

# Model selection (optional, defaults to native audio model)
# See "Supported Models" section below for available model names
DEMO_AGENT_MODEL=gemini-2.5-flash-native-audio-preview-12-2025
```

Gemini Live API:
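A sketch of how settings like these are typically read at application startup. The variable names match the `.env` file above; the `load_live_api_config` helper and the returned dict shape are illustrative, not the demo's actual code:

```python
import os

def load_live_api_config() -> dict:
    """Read Live API settings from the environment, with the demo's defaults."""
    use_vertex = os.getenv("GOOGLE_GENAI_USE_VERTEXAI", "FALSE").upper() == "TRUE"
    config = {
        "use_vertexai": use_vertex,
        # Falls back to the native audio model, as the demo does.
        "model": os.getenv("DEMO_AGENT_MODEL",
                           "gemini-2.5-flash-native-audio-preview-12-2025"),
    }
    if use_vertex:
        # Vertex AI Live API: project + location instead of an API key.
        config["project"] = os.getenv("GOOGLE_CLOUD_PROJECT")
        config["location"] = os.getenv("GOOGLE_CLOUD_LOCATION", "us-central1")
    else:
        # Gemini Live API: a single API key.
        config["api_key"] = os.getenv("GOOGLE_API_KEY")
    return config
```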
- Visit Google AI Studio
- Create an API key
- Set `GOOGLE_API_KEY` in `.env`
Vertex AI Live API:
- Enable Vertex AI API in Google Cloud Console
- Set up authentication using `gcloud auth application-default login`
- Set `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` in `.env`
- Set `GOOGLE_GENAI_USE_VERTEXAI=TRUE`
Set the SSL certificate file path for secure connections:
```bash
# If using uv
export SSL_CERT_FILE=$(uv run python -m certifi)

# If using pip with activated venv
export SSL_CERT_FILE=$(python -m certifi)
```

From the src/bidi-demo directory, first change to the app subdirectory:
```bash
cd app
```

Note: You must run from inside the `app` directory so Python can find the `google_search_agent` module. Running from the parent directory will fail with `ModuleNotFoundError: No module named 'google_search_agent'`.
Using uv (recommended):

```bash
uv run --project .. uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

Using pip (with activated venv):

```bash
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

The `--reload` flag enables auto-restart on code changes during development.
To run in background with log output:
```bash
# Using uv (from app directory)
uv run --project .. uvicorn main:app --host 0.0.0.0 --port 8000 > server.log 2>&1 &

# Using pip (from app directory)
uvicorn main:app --host 0.0.0.0 --port 8000 > server.log 2>&1 &
```

To check the server log:

```bash
tail -f server.log  # Follow log in real-time
```

To stop the background server:

```bash
kill $(lsof -ti:8000)
```

Open your browser and navigate to:
http://localhost:8000
- Type your message in the input field
- Click "Send" or press Enter
- Watch the event console for Live API events
- Receive streamed responses in real-time
- Click "Start Audio" to begin voice interaction
- Speak into your microphone
- Receive audio responses with real-time transcription
- Click "Stop Audio" to end the audio session
`ws://localhost:8000/ws/{user_id}/{session_id}`
Path Parameters:
- `user_id`: Unique identifier for the user
- `session_id`: Unique identifier for the session
Response Modality:
- Automatically determined based on model architecture
- Native audio models use AUDIO response modality
- Half-cascade models use TEXT response modality
Client → Server (Text):
```json
{
  "type": "text",
  "text": "Your message here"
}
```

Client → Server (Image):
```json
{
  "type": "image",
  "data": "base64_encoded_image_data",
  "mimeType": "image/jpeg"
}
```

Client → Server (Audio):
- Send raw binary frames (PCM audio, 16kHz, 16-bit)
Server → Client:
- JSON-encoded ADK `Event` objects
- See the ADK Events Documentation for event schemas
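The client-side payloads above are plain JSON plus raw PCM bytes. Here is a hedged sketch of building them; the field names come from the formats above, while the helper names and the sine-wave audio frame are purely illustrative:

```python
import base64
import json
import math
import struct

def text_message(text: str) -> str:
    """JSON text message in the shape the server expects."""
    return json.dumps({"type": "text", "text": text})

def image_message(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """JSON image message: binary data is base64-encoded."""
    return json.dumps({
        "type": "image",
        "data": base64.b64encode(image_bytes).decode("ascii"),
        "mimeType": mime_type,
    })

def pcm_frame(freq_hz: float = 440.0, ms: int = 20, rate: int = 16000) -> bytes:
    """Raw 16-bit little-endian PCM at 16 kHz, sent as a binary WS frame."""
    n = rate * ms // 1000  # samples in this frame
    samples = (int(32767 * math.sin(2 * math.pi * freq_hz * i / rate))
               for i in range(n))
    return struct.pack(f"<{n}h", *samples)

# A 20 ms frame at 16 kHz is 320 samples * 2 bytes = 640 bytes.
print(len(pcm_frame()))  # 640
```

Text and image messages go out as WebSocket text frames; audio goes out as binary frames with no JSON wrapper.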
```
bidi-demo/
├── app/
│   ├── google_search_agent/          # Agent definition module
│   │   ├── __init__.py               # Package exports
│   │   └── agent.py                  # Agent configuration
│   ├── main.py                       # FastAPI application and WebSocket endpoint
│   ├── .env                          # Environment configuration (not in git)
│   └── static/                       # Frontend files
│       ├── index.html                # Main UI
│       ├── css/
│       │   └── style.css             # Styling
│       └── js/
│           ├── app.js                # Main application logic
│           ├── audio-player.js       # Audio playback
│           ├── audio-recorder.js     # Audio recording
│           ├── pcm-player-processor.js   # Audio processing
│           └── pcm-recorder-processor.js # Audio processing
├── tests/                            # E2E tests and test logs
├── pyproject.toml                    # Python project configuration
└── README.md                         # This file
```
The agent is defined in a separate module following ADK best practices:
```python
agent = Agent(
    name="google_search_agent",
    model=os.getenv("DEMO_AGENT_MODEL", "gemini-2.5-flash-native-audio-preview-12-2025"),
    tools=[google_search],
    instruction="You are a helpful assistant that can search the web.",
)
```

The FastAPI application imports the agent and creates the shared services at startup:

```python
from google_search_agent.agent import agent

app = FastAPI()
session_service = InMemorySessionService()
runner = Runner(app_name="bidi-demo", agent=agent, session_service=session_service)
```

The WebSocket endpoint implements the complete bidirectional streaming pattern:
- Accept Connection: Establish the WebSocket connection
- Configure Session: Create a `RunConfig` with automatic modality detection
- Initialize Queue: Create a `LiveRequestQueue` for message passing
- Start Concurrent Tasks: Launch the upstream and downstream tasks
- Handle Cleanup: Close the queue in a `finally` block
Upstream Task (app/main.py:125-172):
- Receives WebSocket messages (text, image, or audio binary)
- Converts them to ADK format (`Content` or `Blob`)
- Sends them to `LiveRequestQueue` via `send_content()` or `send_realtime()`

Downstream Task (app/main.py:174-187):
- Calls `runner.run_live()` with the queue and config
- Receives the `Event` stream from the Live API
- Serializes events to JSON and sends them to the WebSocket client
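The upstream conversion step can be illustrated with simplified stand-ins for ADK's types. `FakeQueue` records calls in place of the real `LiveRequestQueue`, and plain dicts stand in for `Content` and `Blob`; the routing logic mirrors the message formats described earlier:

```python
import base64
import json
from dataclasses import dataclass, field

@dataclass
class FakeQueue:
    """Records calls in place of ADK's LiveRequestQueue."""
    calls: list = field(default_factory=list)

    def send_content(self, payload):   # turn-based input (text)
        self.calls.append(("send_content", payload))

    def send_realtime(self, payload):  # realtime input (audio/image bytes)
        self.calls.append(("send_realtime", payload))

def forward_client_message(message, queue: FakeQueue):
    """Route one WebSocket message the way the upstream task does."""
    if isinstance(message, bytes):
        # Binary frame: raw PCM audio, forwarded as realtime data.
        queue.send_realtime({"mime_type": "audio/pcm", "data": message})
        return
    payload = json.loads(message)
    if payload["type"] == "text":
        queue.send_content({"role": "user", "text": payload["text"]})
    elif payload["type"] == "image":
        queue.send_realtime({
            "mime_type": payload["mimeType"],
            "data": base64.b64decode(payload["data"]),
        })

q = FakeQueue()
forward_client_message(json.dumps({"type": "text", "text": "hi"}), q)
forward_client_message(b"\x00\x01", q)
print([name for name, _ in q.calls])  # ['send_content', 'send_realtime']
```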
The demo supports any Gemini model compatible with Live API:
Native Audio Models (recommended for voice):
- `gemini-2.5-flash-native-audio-preview-12-2025` (Gemini Live API)
- `gemini-live-2.5-flash-native-audio` (Vertex AI)
Set the model via `DEMO_AGENT_MODEL` in `.env` or modify `app/google_search_agent/agent.py`.
For the latest model availability and features:
- Gemini Live API: Check the official Gemini API models documentation
- Vertex AI Live API: Check the official Vertex AI models documentation
The demo automatically configures bidirectional streaming based on model architecture (app/main.py:76-104):
For Native Audio Models (containing "native-audio" in model name):
```python
run_config = RunConfig(
    streaming_mode=StreamingMode.BIDI,
    response_modalities=["AUDIO"],
    input_audio_transcription=types.AudioTranscriptionConfig(),
    output_audio_transcription=types.AudioTranscriptionConfig(),
    session_resumption=types.SessionResumptionConfig(),
)
```

For Half-Cascade Models (other models):
```python
run_config = RunConfig(
    streaming_mode=StreamingMode.BIDI,
    response_modalities=["TEXT"],
    input_audio_transcription=None,
    output_audio_transcription=None,
    session_resumption=types.SessionResumptionConfig(),
)
```

Modality detection is automatic, based on the model name: native audio models use the AUDIO response modality with transcription enabled, while half-cascade models use the TEXT response modality for better performance.
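The detection itself reduces to a substring check on the model name. A minimal sketch of that check (returning just the modality string rather than a full `RunConfig`; the second model name is an illustrative half-cascade example, not taken from this demo):

```python
def response_modality(model_name: str) -> str:
    """AUDIO for native audio models, TEXT for half-cascade models."""
    return "AUDIO" if "native-audio" in model_name else "TEXT"

print(response_modality("gemini-2.5-flash-native-audio-preview-12-2025"))  # AUDIO
print(response_modality("gemini-live-2.5-flash"))  # TEXT
```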
Problem: WebSocket fails to connect
Solutions:
- Verify API credentials in `app/.env`
- Check the console for error messages
- Ensure uvicorn is running on correct port
Problem: Audio input/output not functioning
Solutions:
- Grant microphone permissions in browser
- Verify browser supports Web Audio API
- Check that audio model is configured (native audio model required)
- Review browser console for errors
Problem: "Model not found" or quota errors
Solutions:
- Verify model name matches your platform (Gemini vs Vertex AI)
- Check API quota limits in console
- Ensure billing is enabled (for Vertex AI)
This project uses black, isort, and flake8 for code formatting and linting. Configuration is inherited from the repository root.
Using uv:
```bash
uv run black .
uv run isort .
uv run flake8 .
```

Using pip (with activated venv):

```bash
black .
isort .
flake8 .
```

To check formatting without making changes:

```bash
# Using uv
uv run black --check .
uv run isort --check .

# Using pip
black --check .
isort --check .
```

- ADK Documentation: https://google.github.io/adk-docs/
- Gemini Live API: https://ai.google.dev/gemini-api/docs/live
- Vertex AI Live API: https://cloud.google.com/vertex-ai/generative-ai/docs/live-api
- ADK GitHub Repository: https://github.com/google/adk-python
Apache 2.0 - See repository LICENSE file for details.
