This repository provides a full-stack, reusable template for building real-time, multimodal conversational AI agents. It leverages the Agent Development Kit (ADK) on a FastAPI backend, paired with a Next.js (React) frontend.
This template handles all the complex WebSocket and media streaming "plumbing," allowing you to focus on what matters: building your agent's logic and UI.
| Feature | Description |
|---|---|
| Interaction Type | Conversational |
| Complexity | Medium |
| Agent Type | Single Agent |
| Components | Live API (Native Audio) |
| Vertical | Reusable across industry verticals |
- Real-Time, Bidirectional Audio: Streams user microphone input to the agent and streams the agent's synthesized voice back to the client.
- Live Video Feed: Supports streaming from a user's camera or screen share for the agent to analyze.
- Live Transcriptions: Displays a real-time transcript of both the user's speech (STT) and the agent's speech (TTS).
- Configurable Persona: The agent's identity and instructions are easily configured.
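On the live-transcription feature: streaming STT/TTS usually emits *partial* transcript events that are later superseded by *final* ones. As a rough illustration (the event fields here are invented, not this template's actual schema), a client can keep one mutable entry per speaker turn and overwrite it until the final event lands:

```python
def apply_transcript_event(transcript: list[dict], event: dict) -> None:
    """Update the displayed transcript in place.

    `event` is a hypothetical dict: {"speaker": "user"|"agent", "text": str, "final": bool}.
    Partial events overwrite the last non-final entry for the same speaker;
    a final event freezes it so the next event starts a new line.
    """
    last = transcript[-1] if transcript else None
    if last and last["speaker"] == event["speaker"] and not last["final"]:
        last["text"] = event["text"]
        last["final"] = event["final"]
    else:
        transcript.append(dict(event))


transcript: list[dict] = []
apply_transcript_event(transcript, {"speaker": "user", "text": "What is", "final": False})
apply_transcript_event(transcript, {"speaker": "user", "text": "What is 2+2?", "final": True})
apply_transcript_event(transcript, {"speaker": "agent", "text": "It's 4.", "final": True})
```

The result is two transcript lines — the user's partial text was replaced in place rather than rendered twice.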
This template is the foundation for any application that requires an AI to see, hear, and talk in real-time.
- Real-Time Tutors: An agent that can watch you solve a math problem (via screen share) and talk you through it. This is the default behavior of this example agent.
- Live Customer Support: An agent that can visually guide a user through a website or product setup.
- Accessibility Tools: A "be my eyes" agent that can describe a user's surroundings or the content of their screen.
- Interactive Assistant: An agent that pairs with you, watches you work, and provides real-time feedback or assistance.
- Python 3.11+
- Node.js v22+
- uv for dependency management and packaging. See the official uv website for installation:

  ```bash
  curl -LsSf https://astral.sh/uv/install.sh | sh
  ```
Use the Agent Starter Pack to scaffold a production-ready project and choose your deployment target (Vertex AI Agent Engine or Cloud Run), with CI/CD and other production features. The easiest way is with uv (one command, no venv or pip install needed):
```bash
uvx agent-starter-pack create my-realtime-agent -a adk@realtime-conversational-agent
```

If you don't have uv yet:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
The starter pack will prompt you to select deployment options and set up your Google Cloud project.
Alternative: Using pip and a virtual environment
```bash
# Create and activate a virtual environment
python -m venv .venv && source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install the starter pack and create your project
pip install --upgrade agent-starter-pack
agent-starter-pack create my-realtime-agent -a adk@realtime-conversational-agent
```

Alternative: Local development (run from this sample repo)
Follow these instructions to get the client and server running on your local machine.
- Node.js: v22 or later
- Python: 3.11 or later
- uv (Recommended): Python package manager. Install with `pip install uv`.
Platform setup

Choose a platform: either Google AI Studio or Google Cloud Vertex AI.

- Option 1: Google AI Studio (Default)
  - Get a `GOOGLE_API_KEY` from Google AI Studio. This is the simplest way to get started.
- Option 2: Google Cloud / Vertex AI
  - Create a Google Cloud Project and enable the Vertex AI API.
  - Install the Google Cloud CLI.
  - Authenticate your local environment by running:

    ```bash
    gcloud auth login
    ```
The server handles the AI agent logic and WebSocket connections.
1. Navigate to the server directory:

   ```bash
   cd server
   ```

2. Create and activate a virtual environment:

   ```bash
   uv venv
   source .venv/bin/activate
   ```

3. Install Python dependencies:

   ```bash
   uv pip install .
   ```

4. Set the SSL certificate file:

   ```bash
   export SSL_CERT_FILE=$(python -m certifi)
   ```

5. Create your environment file:

   Create a new file named `.env` in the `server/` directory. Use one of the two templates below based on your authentication method.

   `server/.env` (Option 1: Google AI Studio)

   ```bash
   # --- AI Studio ---
   GOOGLE_GENAI_USE_VERTEXAI=FALSE
   # Get this from Google AI Studio
   GOOGLE_API_KEY="PASTE_YOUR_ACTUAL_API_KEY_HERE"
   # Configuration for the agent's voice (Example: 'Puck' for gemini-live-2.5-flash)
   AGENT_VOICE="Puck"
   AGENT_LANGUAGE="en-US"
   ```

   `server/.env` (Option 2: Google Cloud / Vertex AI)

   ```bash
   # --- Vertex AI ---
   GOOGLE_GENAI_USE_VERTEXAI=TRUE
   # Your Google Cloud project ID
   GOOGLE_CLOUD_PROJECT="PASTE_YOUR_ACTUAL_PROJECT_ID"
   # Your Vertex AI location (e.g., us-central1)
   GOOGLE_CLOUD_LOCATION="us-central1"
   # Configuration for the agent's voice (Example: 'Puck' for gemini-live-2.5-flash)
   AGENT_VOICE="Puck"
   AGENT_LANGUAGE="en-US"
   ```

6. Run the server:

   ```bash
   uvicorn main:app --reload
   ```

   The server will be running at http://127.0.0.1:8000.
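The `GOOGLE_GENAI_USE_VERTEXAI` flag above is what switches the server between the two authentication modes at startup. A minimal sketch of that selection logic (illustrative only — the actual server may rely on python-dotenv or ADK's own configuration handling):

```python
def resolve_backend(env: dict[str, str]) -> dict:
    """Pick AI Studio vs Vertex AI based on GOOGLE_GENAI_USE_VERTEXAI.

    Illustrative sketch; the real server's config loading may differ.
    """
    use_vertex = env.get("GOOGLE_GENAI_USE_VERTEXAI", "FALSE").upper() == "TRUE"
    if use_vertex:
        return {
            "backend": "vertexai",
            "project": env["GOOGLE_CLOUD_PROJECT"],
            "location": env.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
            "voice": env.get("AGENT_VOICE", "Puck"),
        }
    return {
        "backend": "ai-studio",
        "api_key": env["GOOGLE_API_KEY"],
        "voice": env.get("AGENT_VOICE", "Puck"),
    }


# Vertex AI mode: only the project ID is required; location falls back to a default
config = resolve_backend({"GOOGLE_GENAI_USE_VERTEXAI": "TRUE",
                          "GOOGLE_CLOUD_PROJECT": "my-project"})
```

Note that either an API key or a project ID is required depending on the mode — if both are missing, startup should fail fast rather than at the first request.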
The client is the Next.js application that the user interacts with.
1. Open a new terminal and navigate to the client directory:

   ```bash
   cd client
   ```

2. Install Node.js dependencies:

   ```bash
   npm install
   ```

3. Run the client:

   ```bash
   npm run dev
   ```

   The client will be running at http://localhost:3000.

4. Open the app:

   Open http://localhost:3000 in your browser. You can now click the microphone icon to start a session!
This repository is designed for easy reuse. You don't need to change any Python code to completely change your agent's persona.
Simply edit the AGENT_INSTRUCTION in your server/example_agent/prompts.py file.
To turn your "Math Tutor" agent into a "Generic Assistant", stop your server, replace the AGENT_INSTRUCTION in server/example_agent/prompts.py with the following, and restart the server.

```python
AGENT_INSTRUCTION = "You are a helpful and friendly AI assistant. Keep your responses concise."
```
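If you want to vary only part of the persona (its role and tone) while keeping shared ground rules, one common pattern is to compose AGENT_INSTRUCTION from pieces. The following is a hypothetical restructuring of prompts.py, not code from this repository:

```python
# Hypothetical structure for server/example_agent/prompts.py (illustration only)
BASE_RULES = (
    "You can see the user's screen or camera when they share it. "
    "Keep your responses concise and conversational."
)

PERSONAS = {
    "math_tutor": "You are a patient math tutor. Watch the user's work and guide them step by step.",
    "generic": "You are a helpful and friendly AI assistant.",
}


def build_instruction(persona: str) -> str:
    """Compose the final system instruction from a persona plus shared rules."""
    return f"{PERSONAS[persona]} {BASE_RULES}"


AGENT_INSTRUCTION = build_instruction("generic")
```

This keeps the rules that every persona should obey in one place, so switching personas can't accidentally drop them.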
This agent sample is provided for illustrative purposes only and is not intended for production use. It serves as a basic example of an agent and a foundational starting point for individuals or teams to develop their own agents.
This sample has not been rigorously tested, may contain bugs or limitations, and does not include features or optimizations typically required for a production environment (e.g., robust error handling, security measures, scalability, performance considerations, comprehensive logging, or advanced configuration options).
Users are solely responsible for any further development, testing, security hardening, and deployment of agents based on this sample. We recommend thorough review, testing, and the implementation of appropriate safeguards before using any derived agent in a live or critical system.
