WebSockets Overview
BlackBox provides a WebSocket-based API for real-time bidirectional communication with AI agents. This protocol enables instant messaging, voice calls via WebRTC, tool execution, and event streaming, which makes it well suited to building custom integrations, dashboards, or embedded experiences that require low-latency, persistent connections.
What are WebSockets?
WebSockets provide a persistent, full-duplex communication channel between your application and BlackBox agents. Unlike traditional HTTP request/response exchanges, which require polling to pick up new data, WebSockets maintain an open connection that allows both sides to send messages at any time.
Core Characteristics:
- Persistent Connection: Single connection maintained throughout the conversation
- Bidirectional: Both client and server can send messages independently
- Low Latency: Real-time message delivery without HTTP overhead
- Event-Driven: Messages arrive as events occur, not on a polling schedule
- Protocol Support: Handles text chat, voice calls (WebRTC), tool calls, and events
How WebSockets Work
The WebSocket connection lifecycle varies by call type. Below are the flows for each type:
1. Text Conversation (Chat)
For text-based conversations without voice.
2. WebRTC Voice Calls (webCall)
For browser-based voice calls using WebRTC.
3. Phone Calls (onPhone)
For traditional phone calls via SIP/PBX.
Common Flow Steps:
- Connection: Establish a WebSocket connection to the BlackBox server
- Initialization: Send an initialization message with the call type (chat, webCall, or onPhone) and configuration
- Session Ready: Receive a connection event indicating the agent is ready
- Communication: Exchange messages bidirectionally (text, voice, events)
- Termination: Send terminate message or handle connection close
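The common flow steps can be modeled on the client as a small state machine. The sketch below is illustrative only: the state names, event names, and the `nextState` helper are hypothetical and not part of the BlackBox API. It only mirrors the connection, initialization, session ready, and termination steps listed above.

```typescript
// Hypothetical client-side model of the connection lifecycle.
// None of these names come from the BlackBox API.
type SessionState = "connecting" | "initializing" | "ready" | "terminated";

type LifecycleEvent =
  | "socketOpened"   // WebSocket connection established
  | "sessionReady"   // connection event received: agent is ready
  | "terminateSent"  // client sent a terminate message
  | "socketClosed";  // connection closed by either side

function nextState(state: SessionState, event: LifecycleEvent): SessionState {
  // Termination or a closed socket ends the session from any state.
  if (event === "terminateSent" || event === "socketClosed") {
    return "terminated";
  }
  // After the socket opens, the client sends its initialization message
  // and waits for the session to become ready.
  if (state === "connecting" && event === "socketOpened") {
    return "initializing";
  }
  if (state === "initializing" && event === "sessionReady") {
    return "ready";
  }
  // Events that do not apply to the current state are ignored.
  return state;
}
```

Keeping the transitions in one pure function makes it easy to guard sends, for example by only forwarding chat messages while the state is `ready`.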
Why Use WebSockets?
Real-Time Communication
Instant Messaging
- Messages delivered immediately without polling delays
- Typing indicators and real-time status updates
- Streamed responses, delivered as the agent generates them
- Low latency for conversational experiences
Voice Calls
- WebRTC integration for browser-based voice calls
- No phone numbers required for web-based calls
- Real-time audio streaming with minimal delay
- High-quality audio codecs (Opus, G.711)
Efficient Resource Usage
Single Connection
- One WebSocket connection handles all communication
- No need for multiple HTTP requests per message
- Reduced server load and bandwidth usage
- Lower latency compared to REST API polling
Event-Driven Updates
- Receive events as they occur (tool calls, errors, status changes)
- No need to poll for updates
- Efficient for monitoring and debugging
- Real-time conversation state synchronization
Rich Integration Capabilities
Tool Execution
- Agents can call tools/functions via WebSocket
- Receive tool results in real-time
- Handle tool execution errors immediately
- Support for async tool operations
Conversation Data
- Full conversation history streamed
- Transcript access during and after calls
- Conversation result payloads
- Event history for debugging
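The tool execution flow described in this section could be handled on the client roughly as follows. This is a minimal sketch, not the documented payload format: only the message type names `websocketToolRequest` and `websocketToolResponse` appear on this page, while the field names (`requestId`, `toolName`, `args`, `result`, `error`) and the `handleToolRequest` helper are assumptions made for illustration. Check the Message Reference for the actual shape.

```typescript
// Assumed payload shapes; only the `type` values come from the documentation.
interface ToolRequest {
  type: "websocketToolRequest";
  requestId: string;             // hypothetical correlation id
  toolName: string;              // hypothetical field
  args: Record<string, unknown>; // hypothetical field
}

interface ToolResponse {
  type: "websocketToolResponse";
  requestId: string;
  result?: unknown;
  error?: string;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Runs the requested tool and always produces a response, so the agent
// hears about failures immediately instead of waiting on a timeout.
async function handleToolRequest(
  request: ToolRequest,
  tools: Map<string, ToolHandler>,
): Promise<ToolResponse> {
  const base = {
    type: "websocketToolResponse" as const,
    requestId: request.requestId,
  };
  const handler = tools.get(request.toolName);
  if (handler === undefined) {
    return { ...base, error: `Unknown tool: ${request.toolName}` };
  }
  try {
    return { ...base, result: await handler(request.args) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ...base, error: message };
  }
}
```

The returned object would then be serialized with `JSON.stringify` and sent over the open WebSocket.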
WebSocket vs REST API
Understanding when to use WebSockets versus the REST API:
Use WebSockets For
Real-Time Communication
- Chat conversations
- Voice calls
- Live event streaming
- Tool execution during conversations
- Status updates and notifications
- Multi-turn conversations
- Long-running interactions
- Stateful communication
- Session management
- Interactive experiences
- Voice calls
- Real-time dashboards
- Live monitoring
Connection Endpoints
BlackBox provides different WebSocket endpoints for different use cases:
Production Endpoint
Web Call Endpoint (for production web integrations):
- Production web widgets
- Customer-facing applications
- Public integrations
- Token-based authentication
Development Endpoint
Dev Endpoint (for testing and development):
- Testing agent configurations
- Development and debugging
- Dashboard testing
- Internal tools
Connection Types
BlackBox supports three call types via WebSocket:
Chat (chat)
Text-based conversations without voice.
Characteristics:
- Text messages only
- No WebRTC required
- Works over HTTP/HTTPS
- Lower bandwidth usage
- Faster message delivery
Use Cases:
- Customer support chat
- FAQ bots
- Lead qualification
- Information retrieval
Web Call (webCall)
Voice calls using WebRTC in the browser.
Characteristics:
- Real-time voice communication
- WebRTC for audio streaming
- Requires HTTPS
- Microphone access needed
- Higher bandwidth usage
Use Cases:
- Voice customer support
- Interactive demos
- Voice-first experiences
- Sales conversations
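For web calls, the Message Protocol section lists an `sdpInvite` message (the server's WebRTC SDP offer) and an `sdpAnswer` message (the client's reply). A hedged sketch of that exchange follows. `PeerLike` is a hypothetical minimal interface modeled on the three standard `RTCPeerConnection` methods it names, so the example can run without a browser, and the `sdp` field on both messages is an assumption rather than the documented payload.

```typescript
// Minimal shape modeled on RTCPeerConnection; hypothetical, for illustration.
interface SessionDescription {
  type: string; // "offer" or "answer" in this sketch
  sdp?: string;
}

interface PeerLike {
  setRemoteDescription(desc: SessionDescription): Promise<void>;
  createAnswer(): Promise<SessionDescription>;
  setLocalDescription(desc: SessionDescription): Promise<void>;
}

// Assumed message shapes; only the `type` values come from the documentation.
interface SdpInvite { type: "sdpInvite"; sdp: string }
interface SdpAnswer { type: "sdpAnswer"; sdp: string }

// Standard WebRTC answer sequence: apply the remote offer, create an
// answer, store it locally, then hand it back for sending to the server.
async function answerInvite(peer: PeerLike, invite: SdpInvite): Promise<SdpAnswer> {
  await peer.setRemoteDescription({ type: "offer", sdp: invite.sdp });
  const answer = await peer.createAnswer();
  await peer.setLocalDescription(answer);
  return { type: "sdpAnswer", sdp: answer.sdp ?? "" };
}
```

In a browser, the microphone track would normally be added to the peer connection before calling a helper like this, so that the generated answer includes the audio stream.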
Phone Call (onPhone)
Traditional phone calls via SIP.
Characteristics:
- Phone number required
- SIP protocol
- Works with landlines/mobile
- No browser required
- External telephony provider
Use Cases:
- Outbound calling campaigns
- Inbound call centers
- Phone-based support
- Integration with existing phone systems
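The characteristics of the three call types can be turned into a simple pre-flight check before a session is initialized. The sketch below is one way to encode them. The `ClientEnvironment` shape and the `missingPrerequisites` helper are hypothetical names introduced here; only the call type identifiers and the requirements themselves (HTTPS and microphone access for `webCall`, a phone number for `onPhone`) come from the lists above.

```typescript
type CallType = "chat" | "webCall" | "onPhone";

// Hypothetical description of what the client has available.
interface ClientEnvironment {
  isSecureContext: boolean;     // page is served over HTTPS
  hasMicrophoneAccess: boolean; // user has granted microphone permission
  phoneNumber?: string;         // number to call, used by onPhone only
}

// Returns the requirements that are not met for the chosen call type.
function missingPrerequisites(callType: CallType, env: ClientEnvironment): string[] {
  const missing: string[] = [];
  switch (callType) {
    case "chat":
      // Text only: no WebRTC, microphone, or phone number needed.
      break;
    case "webCall":
      if (!env.isSecureContext) missing.push("HTTPS");
      if (!env.hasMicrophoneAccess) missing.push("microphone access");
      break;
    case "onPhone":
      if (!env.phoneNumber) missing.push("phone number");
      break;
  }
  return missing;
}
```

An empty result means the client can go ahead and send its initialization message for that call type.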
Message Protocol
All WebSocket messages follow a consistent JSON structure:
Message Format
Message Types
Client → Server Messages:
- initialize - Start a new conversation
- incomingChatMessage - Send a text message
- sdpAnswer - Respond to WebRTC SDP offer
- websocketToolResponse - Return tool execution result
- terminate - End the conversation
Server → Client Messages:
- event - System events (connection, opened, closed, etc.)
- text - Text messages from agent or user
- sdpInvite - WebRTC SDP offer for voice calls
- toolCall - Agent requesting tool execution
- toolCallResult - Result of tool execution
- websocketToolRequest - Tool call request via WebSocket
- conversationResult - Final conversation summary
- error - Error messages
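A client typically validates each incoming frame before dispatching on its type. The sketch below shows one way to do that for the eight server-sent message types listed above. It assumes that every message is a JSON object carrying its type name in a string field called `type`. That field name, like the `parseServerMessage` helper itself, is an assumption, since the exact envelope is defined in the Message Format reference rather than here.

```typescript
// Server-sent message type names, taken from the list above.
const SERVER_MESSAGE_TYPES = [
  "event",
  "text",
  "sdpInvite",
  "toolCall",
  "toolCallResult",
  "websocketToolRequest",
  "conversationResult",
  "error",
] as const;

type ServerMessageType = (typeof SERVER_MESSAGE_TYPES)[number];

interface ServerMessage {
  type: ServerMessageType;  // assumed discriminator field
  [field: string]: unknown; // remaining fields depend on the message type
}

// Returns the parsed message, or null for invalid JSON or an unknown type.
function parseServerMessage(raw: string): ServerMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const type = (data as { type?: unknown }).type;
  if (typeof type !== "string") return null;
  if ((SERVER_MESSAGE_TYPES as readonly string[]).indexOf(type) === -1) return null;
  return data as ServerMessage;
}
```

A `switch` on `message.type` after this check can then route text to the UI, SDP offers to the WebRTC layer, and tool requests to local handlers.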
Quick Start
WebSocket Implementation
Connect directly to the WebSocket endpoint for real-time communication with agents, from either the browser or Node.js.
Get your token: Generate a web integration token from your agent’s Web Integrations settings in the BlackBox dashboard.
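As a rough illustration of the chat flow, the sketch below wires an already-created socket to the `initialize`, `incomingChatMessage`, and `text` message types. Several things here are assumptions: `SocketLike` is a hypothetical minimal interface modeled on the standard WebSocket API so the example runs without a network, the payload fields (`callType`, `text`) are guesses at the message shape, and the endpoint URL and the way the token is supplied are left out because this section does not show them.

```typescript
// Hypothetical minimal socket shape, modeled on the standard WebSocket API.
interface SocketLike {
  send(data: string): void;
  addEventListener(
    type: "open" | "message",
    listener: (event: { data?: unknown }) => void,
  ): void;
}

// Parses JSON without throwing; returns null for malformed input.
function safeParse(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return null;
  }
}

// Sets up a text chat and returns a function for sending user messages.
function startChat(
  socket: SocketLike,
  onAgentText: (text: string) => void,
): (text: string) => void {
  // Once connected, announce the call type (assumed payload fields).
  socket.addEventListener("open", () => {
    socket.send(JSON.stringify({ type: "initialize", callType: "chat" }));
  });

  // Forward incoming `text` messages to the caller and ignore the rest.
  socket.addEventListener("message", (event) => {
    if (typeof event.data !== "string") return;
    const message = safeParse(event.data) as { type?: unknown; text?: unknown } | null;
    if (message && message.type === "text" && typeof message.text === "string") {
      onAgentText(message.text);
    }
  });

  return (text) => {
    socket.send(JSON.stringify({ type: "incomingChatMessage", text }));
  };
}
```

In a real integration, the socket would come from `new WebSocket(url)` with the endpoint and token from your dashboard, and sends should wait until the session is reported ready.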
Next Steps
Now that you understand WebSockets basics:
- Chat Implementation - Build text chat with agents
- Voice Call Implementation - Add WebRTC voice calls
- Message Reference - Complete message type documentation
- Tool Execution - Handle agent tool calls
- Error Handling - Robust error handling patterns
- Best Practices - Production-ready patterns