Multi-Agent Realtime Voice Interaction Example

This example shows how to use AgentScope's ChatRoom class to build a multi-agent real-time voice interaction system in which two AI agents converse autonomously, without user input.

Features

🗣️ Real-time Voice Interaction: Two agents communicate by voice in real time
🤖 Autonomous Conversation: Agents converse with each other without user intervention
⚙️ Customizable Configuration: Configure agent names and instructions through the web interface
🎨 Modern UI: Clean, shadcn-inspired interface for easy interaction
📊 Live Transcript: See the conversation transcripts in real time

Architecture

The example uses:

Backend: FastAPI server with WebSocket support
Frontend: HTML5 with Web Audio API for audio playback
AgentScope Components:
- ChatRoom: Manages multiple RealtimeAgent instances
- RealtimeAgent: Handles real-time voice interaction with AI models
- DashScopeRealtimeModel: DashScope's Qwen3-Omni realtime model

Prerequisites

Python Dependencies:

pip install "agentscope[dashscope]"
pip install fastapi uvicorn

DashScope API Key:
- Set your DashScope API key as an environment variable:
```
export DASHSCOPE_API_KEY="your-api-key-here"
```
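
A missing key tends to surface only once the realtime connection is attempted, so it can help to check the variable before launching the server. The following is a minimal sketch; `require_api_key` is a hypothetical helper introduced here for illustration and is not part of AgentScope or `run_server.py`.

```python
import os


def require_api_key(env_var: str = "DASHSCOPE_API_KEY") -> str:
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of AgentScope.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before starting the server."
        )
    return key
```

Calling `require_api_key()` at the top of a startup script fails fast with a readable message instead of a connection error later on.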

Usage

Start the Server:
```
python run_server.py
```
Open the Web Interface:
- Navigate to http://localhost:8000 in your web browser
Configure Agents:
- Set names and instructions for both Agent 1 and Agent 2
- Example configurations:
  - Agent 1 (Alice): "You are Alice, a cheerful and optimistic person who loves to share stories and ask questions. Keep your responses brief and conversational."
  - Agent 2 (Bob): "You are Bob, a thoughtful and analytical person who enjoys deep conversations. Keep your responses brief and conversational."
Start the Conversation:
- Click the "▶️ Start Conversation" button
- The agents will begin conversing autonomously
- You'll see transcripts and system messages in the message panel
- Audio is streamed to the browser and played back in real time
Stop the Conversation:
- Click the "⏹️ Stop Conversation" button when you want to end the session

How It Works

Backend Flow

WebSocket Connection: Client connects via WebSocket to /ws/{user_id}/{session_id}
Session Creation:
- Client sends client_session_create event with agent configurations
- Server creates two RealtimeAgent instances with specified names and instructions
- Server creates a ChatRoom with both agents
- Server starts the chat room and returns session_created event
Message Broadcasting:
- ChatRoom automatically broadcasts messages between agents
- All events (audio, transcripts, etc.) are forwarded to the frontend
Session End: Client sends client_session_end event to stop the conversation
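
The client side of the flow above can be sketched as follows. Only the route shape and the event names `client_session_create` and `client_session_end` come from this README; the `type` and `agents` keys, the per-agent fields, and the helper names are assumptions made for illustration, so check `run_server.py` for the schema the server actually expects.

```python
import json
from urllib.parse import quote


def ws_url(host: str, user_id: str, session_id: str) -> str:
    """Build the WebSocket URL for the /ws/{user_id}/{session_id} route."""
    user = quote(user_id, safe="")
    session = quote(session_id, safe="")
    return f"ws://{host}/ws/{user}/{session}"


def session_create_event(agents: list[dict]) -> str:
    """Serialize a client_session_create event.

    The "type" and "agents" keys are assumed, not taken from the server code.
    """
    return json.dumps({"type": "client_session_create", "agents": agents})


def session_end_event() -> str:
    """Serialize a client_session_end event (payload shape assumed)."""
    return json.dumps({"type": "client_session_end"})
```

For example, `ws_url("localhost:8000", "u1", "s1")` gives the address a client would open before sending the create event with the two agent configurations.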

Frontend Flow

WebSocket Setup: Establishes connection and waits for server events
Session Management: Sends configuration and manages conversation state
Audio Playback:
- Receives base64-encoded PCM16 audio chunks
- Decodes and queues audio data
- Uses Web Audio API ScriptProcessorNode for streaming playback at 24kHz
Transcript Display: Shows real-time transcripts from both agents
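
The decode step in the audio path can be illustrated in Python (the page itself does this in JavaScript). This is a sketch under stated assumptions: samples are taken to be little-endian and mono, which this README does not specify, and the function names are hypothetical.

```python
import base64
import struct

SAMPLE_RATE_HZ = 24_000  # playback rate stated in this README


def decode_pcm16_chunk(b64_chunk: str) -> list[float]:
    """Decode a base64 PCM16 chunk into floats in [-1.0, 1.0).

    Assumes little-endian, mono samples (the source only says "PCM16").
    """
    raw = base64.b64decode(b64_chunk)
    raw = raw[: len(raw) - (len(raw) % 2)]  # drop a trailing odd byte, if any
    count = len(raw) // 2
    samples = struct.unpack(f"<{count}h", raw)
    return [s / 32768.0 for s in samples]


def chunk_duration_seconds(samples: list[float]) -> float:
    """Playback time of a decoded chunk at the fixed sample rate."""
    return len(samples) / SAMPLE_RATE_HZ
```

Dividing by 32768 maps the signed 16-bit range onto the floating-point range that audio APIs typically expect, and the duration helper shows how much playback time each queued chunk contributes.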

Key Components

ChatRoom

The ChatRoom class manages multiple RealtimeAgent instances:

Establishes connections for all agents
Broadcasts messages between agents automatically
Forwards events to the frontend
Handles lifecycle management (start/stop)
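
To make the broadcast and forward behaviour concrete, here is a toy in-memory model. It is an illustrative stand-in, not AgentScope's ChatRoom: the class name, the per-agent inboxes, and the choice to skip the sender when delivering are all assumptions.

```python
from typing import Callable


class ToyChatRoom:
    """Illustrative stand-in for the broadcast pattern; not AgentScope's ChatRoom."""

    def __init__(self, agent_names: list[str], to_frontend: Callable[[dict], None]):
        # One inbox per agent; the real class talks to live model sessions instead.
        self.inboxes: dict[str, list[dict]] = {name: [] for name in agent_names}
        self.to_frontend = to_frontend

    def publish(self, sender: str, event: dict) -> None:
        """Deliver an event to every agent except its sender, then forward it."""
        tagged = {**event, "sender": sender}
        for name, inbox in self.inboxes.items():
            if name != sender:
                inbox.append(tagged)
        self.to_frontend(tagged)
```

With two agents, an event published by Alice lands in Bob's inbox and is also passed to the frontend callback, which mirrors the two responsibilities listed above.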

RealtimeAgent

Each RealtimeAgent:

Connects to the DashScope realtime API
Processes audio input from other agents
Generates voice responses
Emits events for transcripts, audio, and status updates

Customization

Changing the Model

To use a different model, modify the DashScopeRealtimeModel configuration in run_server.py:

model=DashScopeRealtimeModel(
    model_name="your-model-name",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
)
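
One way to switch models without editing code each time is to read the name from the environment. A minimal sketch, assuming a hypothetical `REALTIME_MODEL_NAME` variable and helper name that are not part of the example:

```python
import os

# Model name that appears elsewhere in this README; used as the fallback.
DEFAULT_MODEL_NAME = "qwen3-omni-flash-realtime"


def realtime_model_kwargs() -> dict:
    """Collect keyword arguments for DashScopeRealtimeModel.

    REALTIME_MODEL_NAME is a hypothetical variable introduced for this sketch.
    """
    return {
        "model_name": os.getenv("REALTIME_MODEL_NAME", DEFAULT_MODEL_NAME),
        "api_key": os.getenv("DASHSCOPE_API_KEY"),
    }
```

In `run_server.py` the result could then be unpacked as `DashScopeRealtimeModel(**realtime_model_kwargs())`.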

Adding More Agents

To add more agents, modify the agent creation section in run_server.py:

agent3 = RealtimeAgent(
    name=agent3_name,
    sys_prompt=agent3_instructions,
    model=DashScopeRealtimeModel(
        model_name="qwen3-omni-flash-realtime",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
    ),
)

chat_room = ChatRoom(agents=[agent1, agent2, agent3])

And update the frontend to include configuration fields for the additional agents.
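
Beyond three agents, separate variables such as `agent3` become repetitive, and one option is to accept a list of configurations and validate it before building anything. The sketch below assumes each configuration is a dict with `name` and `instructions` keys and that names should be unique so transcripts stay attributable; both are assumptions, and the helper is hypothetical.

```python
def validate_agent_configs(configs: list[dict]) -> list[dict]:
    """Check that each config has a unique name and non-empty instructions.

    The "name" and "instructions" keys are assumed for illustration.
    """
    seen: set[str] = set()
    cleaned = []
    for cfg in configs:
        name = str(cfg.get("name", "")).strip()
        instructions = str(cfg.get("instructions", "")).strip()
        if not name or not instructions:
            raise ValueError("each agent needs a name and instructions")
        if name in seen:
            raise ValueError(f"duplicate agent name: {name}")
        seen.add(name)
        cleaned.append({"name": name, "instructions": instructions})
    return cleaned
```

Each validated entry can then be passed to `RealtimeAgent(name=..., sys_prompt=..., model=...)` in a loop, and the resulting list handed to `ChatRoom(agents=...)` as in the snippet above.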
