Voice Call Implementation

This guide covers implementing voice calls with BlackBox agents using WebRTC, with signaling carried over WebSockets. You’ll learn how to handle SDP offers/answers, manage audio streams, and build a complete voice calling experience.

Prerequisites

  • HTTPS Required: WebRTC requires secure connections (HTTPS/WSS)
  • Microphone Access: Browser permissions for microphone access
  • Web Integration Token: Token with AllowWebCall feature enabled
  • WebRTC Support: Modern browser with WebRTC support

Note: Voice calls via WebRTC only work over HTTPS. HTTP connections can only use text chat. For local development, use localhost (browsers treat it as a secure context) or set up HTTPS.
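
You can check these prerequisites at runtime before offering the call button. A minimal sketch using standard browser APIs:
function getVoiceCallBlocker(): string | null {
  if (!window.isSecureContext) {
    return 'Voice calls require HTTPS (localhost is treated as secure).';
  }
  if (!navigator.mediaDevices?.getUserMedia) {
    return 'This browser cannot capture microphone audio.';
  }
  if (typeof RTCPeerConnection === 'undefined') {
    return 'This browser does not support WebRTC.';
  }
  return null; // all prerequisites met
}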

Overview

Voice calls use WebRTC (Web Real-Time Communication) for peer-to-peer audio streaming:
  1. WebSocket Connection: Established for signaling and control
  2. SDP Exchange: Session Description Protocol for negotiating audio codecs
  3. ICE Candidates: Interactive Connectivity Establishment for NAT traversal
  4. Audio Streaming: Real-time bidirectional audio via WebRTC
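
The WebSocket carries JSON signaling messages. Based on the examples in this guide, the key shapes look roughly like this (a sketch, not the full protocol):
// Client -> server: start a call
interface InitializeMessage {
  type: 'initialize';
  timestamp: string;              // ISO 8601
  request: {
    callType: 'webCall';
    additionalData: Record<string, unknown>;
  };
}

// Server -> client: SDP offer to answer
interface SdpInviteMessage {
  type: 'sdpInvite';
  data: { invite: string };       // raw SDP offer
}

// Client -> server: SDP answer
interface SdpAnswerMessage {
  type: 'sdpAnswer';
  timestamp: string;
  data: { sdpAnswer: string };
}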

Step 1: Setup WebRTC Connection

Connect via WebSocket and handle SDP exchange manually:
let peerConnection: RTCPeerConnection | null = null;
let localStream: MediaStream | null = null;
let ws: WebSocket | null = null;

// Connect to WebSocket
ws = new WebSocket(
  'wss://blackbox.dasha.ai/api/v1/ws/webCall?token=YOUR_WEB_INTEGRATION_TOKEN'
);

ws.onopen = () => {
  // Send initialization message
  ws!.send(JSON.stringify({
    type: 'initialize',
    timestamp: new Date().toISOString(),
    request: {
      callType: 'webCall',
      additionalData: {}
    }
  }));
};

ws.onmessage = async (event) => {
  const message = JSON.parse(event.data);
  
  if (message.type === 'sdpInvite') {
    // Handle WebRTC SDP offer
    const sdpAnswer = await handleSdpOffer(message.data.invite);
    
    // Send SDP answer back
    ws!.send(JSON.stringify({
      type: 'sdpAnswer',
      timestamp: new Date().toISOString(),
      data: sdpAnswer
    }));
  } else if (message.type === 'event' && message.name === 'connection') {
    console.log('Connection established');
  } else if (message.type === 'error') {
    console.error('Error:', message);
  }
};

async function handleSdpOffer(sdpInvite: string): Promise<{ sdpAnswer: string }> {
  try {
    // Create RTCPeerConnection
    peerConnection = new RTCPeerConnection({
      iceServers: [
        { urls: 'stun:stun.l.google.com:19302' }
      ]
    });

    // Get user's microphone
    localStream = await navigator.mediaDevices.getUserMedia({
      audio: {
        echoCancellation: true,
        noiseSuppression: true,
        autoGainControl: true
      }
    });

    // Add audio tracks to peer connection
    localStream.getTracks().forEach(track => {
      peerConnection!.addTrack(track, localStream!);
    });

    // Handle the remote audio stream. Register this before calling
    // setRemoteDescription so the track event is not missed.
    peerConnection.ontrack = (event) => {
      const remoteStream = event.streams[0];
      playRemoteAudio(remoteStream);
    };

    // Handle connection state changes
    peerConnection.onconnectionstatechange = () => {
      console.log('WebRTC state:', peerConnection?.connectionState);
      if (peerConnection?.connectionState === 'failed') {
        // Handle connection failure (placeholder for your own cleanup/UI logic)
        handleConnectionFailure();
      }
    };

    // Set remote description (offer from server)
    await peerConnection.setRemoteDescription({
      type: 'offer',
      sdp: sdpInvite
    });

    // Create answer
    const answer = await peerConnection.createAnswer();
    await peerConnection.setLocalDescription(answer);

    // Wait for ICE gathering to complete. There is no trickle-ICE channel
    // here, so all candidates must be embedded in the answer SDP.
    await new Promise<void>((resolve) => {
      if (peerConnection!.iceGatheringState === 'complete') {
        resolve();
      } else {
        peerConnection!.onicegatheringstatechange = () => {
          if (peerConnection!.iceGatheringState === 'complete') {
            resolve();
          }
        };
      }
      // Fall back after 2 seconds so stalled gathering doesn't block the call
      setTimeout(resolve, 2000);
    });

    // Get SDP answer (includes the gathered ICE candidates)
    const sdpAnswer = peerConnection.localDescription?.sdp || '';

    return { sdpAnswer };
  } catch (error) {
    console.error('SDP handling error:', error);
    throw error;
  }
}

function playRemoteAudio(stream: MediaStream) {
  const audio = new Audio();
  audio.srcObject = stream;
  audio.play().catch(err => {
    console.error('Error playing audio:', err);
  });
}
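
To end a call started this way, tell the server the call is over before closing the socket, and release the microphone and peer connection. A minimal teardown sketch (the terminate message mirrors the one used in the full component below):
function hangUp() {
  // Release the microphone
  localStream?.getTracks().forEach(track => track.stop());
  localStream = null;

  // Close the peer connection
  peerConnection?.close();
  peerConnection = null;

  // Notify the server, then close the socket
  if (ws && ws.readyState === WebSocket.OPEN) {
    ws.send(JSON.stringify({
      type: 'terminate',
      timestamp: new Date().toISOString()
    }));
    ws.close();
  }
  ws = null;
}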

Step 2: Complete Voice Call Example

Here’s a complete React component for voice calls:
import React, { useState, useEffect, useRef } from 'react';

export function VoiceCallWidget() {
  const [isConnected, setIsConnected] = useState(false);
  const [isCalling, setIsCalling] = useState(false);
  const [connectionStatus, setConnectionStatus] = useState<string>('closed');
  const wsRef = useRef<WebSocket | null>(null);
  const peerConnectionRef = useRef<RTCPeerConnection | null>(null);
  const localStreamRef = useRef<MediaStream | null>(null);
  const remoteAudioRef = useRef<HTMLAudioElement | null>(null);

  useEffect(() => {
    // Create audio element for remote audio
    remoteAudioRef.current = new Audio();
    
    return () => {
      // Cleanup
      stopCall();
    };
  }, []);

  const startCall = async () => {
    try {
      setIsCalling(true);
      
      // Request microphone access
      const stream = await navigator.mediaDevices.getUserMedia({
        audio: {
          echoCancellation: true,
          noiseSuppression: true,
          autoGainControl: true,
          sampleRate: 16000
        }
      });
      
      localStreamRef.current = stream;

      // Create WebSocket connection
      const ws = new WebSocket(
        'wss://blackbox.dasha.ai/api/v1/ws/webCall?token=YOUR_WEB_INTEGRATION_TOKEN'
      );
      wsRef.current = ws;

      ws.onopen = () => {
        setConnectionStatus('connecting');
        // Send initialization message
        ws.send(JSON.stringify({
          type: 'initialize',
          timestamp: new Date().toISOString(),
          request: {
            callType: 'webCall',
            additionalData: {}
          }
        }));
      };

      ws.onmessage = async (event) => {
        const message = JSON.parse(event.data);
        
        if (message.type === 'sdpInvite') {
          // Handle SDP offer
          const sdpAnswer = await handleSdpOffer(message.data.invite);
          
          // Send SDP answer
          ws.send(JSON.stringify({
            type: 'sdpAnswer',
            timestamp: new Date().toISOString(),
            data: sdpAnswer
          }));
        } else if (message.type === 'event' && message.name === 'connection') {
          setConnectionStatus('open');
          setIsConnected(true);
        } else if (message.type === 'text') {
          // Handle text messages during call (transcript)
          console.log('Agent said:', message.content.text);
        } else if (message.type === 'error') {
          console.error('Call error:', message);
          stopCall();
        }
      };

      ws.onerror = () => {
        setConnectionStatus('error');
        stopCall();
      };

      ws.onclose = () => {
        setConnectionStatus('closed');
        setIsConnected(false);
      };
    } catch (error) {
      console.error('Failed to start call:', error);
      setIsCalling(false);
      alert('Failed to start call. Please check microphone permissions.');
    }
  };

  const handleSdpOffer = async (sdpInvite: string): Promise<{ sdpAnswer: string }> => {
    try {
      // Create RTCPeerConnection
      const pc = new RTCPeerConnection({
        iceServers: [
          { urls: 'stun:stun.l.google.com:19302' },
          { urls: 'stun:stun1.l.google.com:19302' }
        ]
      });

      peerConnectionRef.current = pc;

      // Add local audio tracks
      if (localStreamRef.current) {
        localStreamRef.current.getTracks().forEach(track => {
          pc.addTrack(track, localStreamRef.current!);
        });
      }

      // Handle remote audio stream
      pc.ontrack = (event) => {
        const remoteStream = event.streams[0];
        if (remoteAudioRef.current) {
          remoteAudioRef.current.srcObject = remoteStream;
          remoteAudioRef.current.play().catch(err => {
            console.error('Error playing remote audio:', err);
          });
        }
      };

      // Log ICE candidates for debugging. There is no trickle ICE in this
      // flow: candidates are gathered into the local SDP before the answer
      // is sent, so nothing needs to be forwarded here.
      pc.onicecandidate = (event) => {
        if (event.candidate) {
          console.log('ICE candidate:', event.candidate);
        }
      };

      // Handle connection state
      pc.onconnectionstatechange = () => {
        const state = pc.connectionState;
        console.log('WebRTC connection state:', state);
        
        if (state === 'failed' || state === 'disconnected') {
          console.warn('WebRTC connection lost');
        } else if (state === 'connected') {
          console.log('WebRTC connected - audio streaming');
        }
      };

      // Set remote description
      await pc.setRemoteDescription({
        type: 'offer',
        sdp: sdpInvite
      });

      // Create answer (offerToReceive* options apply only to createOffer,
      // so no options are needed here)
      const answer = await pc.createAnswer();

      await pc.setLocalDescription(answer);

      // Wait for ICE gathering
      await new Promise<void>((resolve) => {
        if (pc.iceGatheringState === 'complete') {
          resolve();
        } else {
          const checkState = () => {
            if (pc.iceGatheringState === 'complete') {
              pc.removeEventListener('icegatheringstatechange', checkState);
              resolve();
            }
          };
          pc.addEventListener('icegatheringstatechange', checkState);
          // Timeout after 2 seconds
          setTimeout(() => {
            pc.removeEventListener('icegatheringstatechange', checkState);
            resolve();
          }, 2000);
        }
      });

      const sdpAnswer = pc.localDescription?.sdp || '';
      return { sdpAnswer };
    } catch (error) {
      console.error('SDP handling error:', error);
      throw error;
    }
  };

  const stopCall = () => {
    // Stop local stream
    if (localStreamRef.current) {
      localStreamRef.current.getTracks().forEach(track => track.stop());
      localStreamRef.current = null;
    }

    // Close peer connection
    if (peerConnectionRef.current) {
      peerConnectionRef.current.close();
      peerConnectionRef.current = null;
    }

    // Stop remote audio
    if (remoteAudioRef.current) {
      remoteAudioRef.current.pause();
      remoteAudioRef.current.srcObject = null;
    }

    // Close WebSocket connection
    if (wsRef.current) {
      wsRef.current.send(JSON.stringify({
        type: 'terminate',
        timestamp: new Date().toISOString()
      }));
      wsRef.current.close();
      wsRef.current = null;
    }

    setIsCalling(false);
    setIsConnected(false);
    setConnectionStatus('closed');
  };

  return (
    <div className="voice-call-widget">
      <div className="call-status">
        Status: {connectionStatus}
        {isConnected && <span className="indicator">●</span>}
      </div>

      {!isCalling ? (
        <button onClick={startCall} disabled={!navigator.mediaDevices}>
          Start Voice Call
        </button>
      ) : (
        <div className="call-controls">
          <button onClick={stopCall} className="hang-up">
            Hang Up
          </button>
          <div className="call-info">
            {isConnected ? 'Call Connected' : 'Connecting...'}
          </div>
        </div>
      )}

      {isCalling && localStreamRef.current && (
        <div className="local-audio-indicator">
          🎤 Microphone Active
        </div>
      )}
    </div>
  );
}
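
To use the widget, mount it like any other React component (the file path is illustrative, and a root element with id "root" is assumed):
import { createRoot } from 'react-dom/client';
import { VoiceCallWidget } from './VoiceCallWidget';

createRoot(document.getElementById('root')!).render(<VoiceCallWidget />);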

Step 3: Handle Audio Quality

Optimize audio quality for better call experience:
// Request high-quality audio
const stream = await navigator.mediaDevices.getUserMedia({
  audio: {
    echoCancellation: true,      // Remove echo
    noiseSuppression: true,      // Reduce background noise
    autoGainControl: true,       // Normalize volume
    sampleRate: 48000,           // High sample rate
    channelCount: 1,             // Mono (sufficient for voice)
    latency: 0.01                // Low latency (not supported by all browsers)
  }
});

// Monitor audio levels
const audioContext = new AudioContext();
const analyser = audioContext.createAnalyser();
const microphone = audioContext.createMediaStreamSource(stream);
microphone.connect(analyser);

function checkAudioLevel() {
  const dataArray = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteFrequencyData(dataArray);
  const average = dataArray.reduce((a, b) => a + b) / dataArray.length;
  
  if (average < 10) {
    console.warn('Low audio input detected');
  }
  
  requestAnimationFrame(checkAudioLevel);
}
checkAudioLevel();
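
Note: browsers may create an AudioContext in the 'suspended' state under autoplay policies. A small guard before sampling levels (assuming the audioContext from the snippet above):
async function ensureAudioContextRunning(ctx: AudioContext) {
  // Autoplay policies can suspend a context created outside a user gesture;
  // resume() should run in response to one (e.g., the Start Call click)
  if (ctx.state === 'suspended') {
    await ctx.resume();
  }
}

// Call before starting the level monitor, e.g. from the Start Call handler:
// await ensureAudioContextRunning(audioContext);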

Step 4: Handle Connection States

Monitor WebRTC connection states (showStatus and handleCallFailure below are placeholders for your own UI and cleanup logic):
peerConnection.onconnectionstatechange = () => {
  const state = peerConnection.connectionState;
  
  switch (state) {
    case 'new':
      console.log('WebRTC: New connection');
      break;
    case 'connecting':
      console.log('WebRTC: Connecting...');
      showStatus('Connecting audio...');
      break;
    case 'connected':
      console.log('WebRTC: Connected');
      showStatus('Call connected');
      break;
    case 'disconnected':
      console.log('WebRTC: Disconnected');
      showStatus('Audio disconnected');
      // Attempt reconnection
      break;
    case 'failed':
      console.error('WebRTC: Connection failed');
      showStatus('Call failed');
      handleCallFailure();
      break;
    case 'closed':
      console.log('WebRTC: Closed');
      break;
  }
};
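
Connection state alone doesn't confirm that audio is actually flowing. For deeper diagnostics you can poll getStats() for the inbound audio report (a sketch; exact stat fields vary slightly across browsers):
async function logInboundAudioStats(pc: RTCPeerConnection) {
  const stats = await pc.getStats();
  stats.forEach((report) => {
    if (report.type === 'inbound-rtp' && report.kind === 'audio') {
      // Packet loss and jitter are the usual suspects for choppy or delayed audio
      console.log('Inbound audio:', {
        packetsReceived: report.packetsReceived,
        packetsLost: report.packetsLost,
        jitter: report.jitter
      });
    }
  });
}

// Example: sample every 5 seconds while the call is active
const statsTimer = setInterval(() => logInboundAudioStats(peerConnection!), 5000);
// Remember to clearInterval(statsTimer) when the call ends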

Step 5: Error Handling

Handle common WebRTC errors:
// Microphone permission denied
try {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
} catch (error: any) {
  if (error.name === 'NotAllowedError') {
    alert('Microphone permission denied. Please allow microphone access.');
  } else if (error.name === 'NotFoundError') {
    alert('No microphone found. Please connect a microphone.');
  } else {
    alert('Error accessing microphone: ' + error.message);
  }
}

// WebRTC connection failure
peerConnection.oniceconnectionstatechange = () => {
  if (peerConnection.iceConnectionState === 'failed') {
    console.error('ICE connection failed');
    // Attempt to restart ICE
    peerConnection.restartIce();
  }
};

// Handle SDP errors
try {
  await peerConnection.setRemoteDescription({ type: 'offer', sdp: sdpInvite });
} catch (error) {
  console.error('Failed to set remote description:', error);
  // Send error to server or retry
}
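
If you prefer a single entry point, the getUserMedia cases above can be folded into one helper (a sketch; the error names are standard DOMException names reported by getUserMedia):
async function requestMicrophone(): Promise<MediaStream> {
  try {
    return await navigator.mediaDevices.getUserMedia({ audio: true });
  } catch (error) {
    if (error instanceof DOMException) {
      switch (error.name) {
        case 'NotAllowedError':
          throw new Error('Microphone permission denied. Please allow microphone access.');
        case 'NotFoundError':
          throw new Error('No microphone found. Please connect a microphone.');
        case 'NotReadableError':
          throw new Error('Microphone is busy or unavailable (in use by another app).');
      }
    }
    throw error;
  }
}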

Step 6: Mute/Unmute

Implement mute functionality. Track the mute state with React state so the button label re-renders:
const [isMuted, setIsMuted] = useState(false);

const toggleMute = () => {
  if (localStreamRef.current) {
    const nextMuted = !isMuted;
    localStreamRef.current.getAudioTracks().forEach(track => {
      // enabled = false keeps the track and connection alive but sends
      // silence; track.stop() would permanently end the track instead
      track.enabled = !nextMuted;
    });
    setIsMuted(nextMuted);
  }
};

// Mute button
<button onClick={toggleMute}>
  {isMuted ? '🔇 Unmute' : '🎤 Mute'}
</button>

Troubleshooting

No Audio Output

Symptoms: Call connects but no sound from agent
Solutions:
  1. Check browser audio permissions
  2. Verify remoteAudio.play() is called
  3. Check system volume settings
  4. Ensure audio element is not muted
  5. Check browser console for audio errors

No Audio Input

Symptoms: Agent can’t hear you
Solutions:
  1. Check microphone permissions
  2. Verify microphone is not muted in system settings
  3. Check getUserMedia succeeded
  4. Verify audio tracks are added to peer connection
  5. Test microphone in browser settings

Connection Fails

Symptoms: WebRTC connection never establishes
Solutions:
  1. Check firewall allows WebRTC traffic
  2. Verify STUN server is accessible
  3. Check network supports WebRTC
  4. Try different STUN servers, or add a TURN relay (see the sketch below)
  5. Check browser WebRTC support
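
STUN only discovers your public address; on restrictive networks (e.g., behind a symmetric NAT) a TURN relay is usually required. A configuration sketch — the TURN URL and credentials are placeholders for your own relay:
const pc = new RTCPeerConnection({
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },
    { urls: 'stun:stun1.l.google.com:19302' },
    // Hypothetical TURN relay; replace with your own server and credentials
    {
      urls: 'turn:turn.example.com:3478',
      username: 'YOUR_TURN_USERNAME',
      credential: 'YOUR_TURN_CREDENTIAL'
    }
  ]
});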

High Latency

Symptoms: Noticeable delay in audio
Solutions:
  1. Use lower latency audio settings
  2. Check network connection quality
  3. Use closer STUN/TURN servers
  4. Optimize audio codec settings
  5. Check for network congestion

Next Steps