The Streaming Avatar feature allows you to create interactive, real-time avatar experiences in your application. This guide provides a comprehensive walkthrough of integrating streaming avatars using the Agora SDK, including:
Setting up real-time communication channels
Handling avatar interactions and responses
Managing audio streams
Implementing cleanup procedures
Optional LLM service integration
The integration uses Agora’s Real-Time Communication (RTC) SDK for reliable, low-latency streaming and our avatar service for generating responsive avatar behaviors.
import AgoraRTC, { IAgoraRTCClient } from "agora-rtc-sdk-ng";
Add the hidden API of Agora SDK
The Agora SDK's sendStreamMessage method is not exposed in its public type definitions, so it must be added manually. It also has limitations that need careful handling: based on the Agora documentation, the message size is limited to 1KB and the message frequency to 6KB per second. The sendStreamMessage method therefore needs to be added to the type definitions yourself.
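A minimal sketch of that type augmentation follows, with the signature inferred from how sendStreamMessage is used later in this guide (a Uint8Array payload plus a boolean flag); adjust it to match your SDK version:

import { IAgoraRTCClient } from "agora-rtc-sdk-ng";

// Assumption: this signature is inferred from usage in this guide,
// not taken from Agora's published type definitions.
export interface RTCClient extends IAgoraRTCClient {
  sendStreamMessage(msg: Uint8Array | string, flag: boolean): Promise<void>;
}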
Security Recommendation: We strongly recommend implementing session management through your backend server rather than directly in the browser. This approach:
Protects your AKool API token from exposure
Allows for proper request validation and rate limiting
Enables usage tracking and monitoring
Provides better control over session lifecycle
Prevents unauthorized access to the API
First, create a session to obtain Agora credentials. While both browser and backend implementations are possible, the backend approach is recommended for security:
// Recommended: Backend Implementation
async function createSessionFromBackend(): Promise<Session> {
  // Your backend endpoint that securely wraps the AKool API
  const response = await fetch('https://your-backend.com/api/avatar/create-session', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      avatarId: "dvp_Tristan_cloth2_1080P",
      duration: 600,
    })
  });

  if (!response.ok) {
    throw new Error('Failed to create session through backend');
  }

  return response.json();
}

// Not Recommended: Direct Browser Implementation
// Only use this for development/testing purposes
async function createSessionInBrowser(): Promise<Session> {
  const response = await fetch('https://openapi.akool.com/api/open/v4/liveAvatar/session/create', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_TOKEN', // Security risk: Token exposed in browser
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      avatar_id: "dvp_Tristan_cloth2_1080P",
      duration: 600,
    })
  });

  if (!response.ok) {
    throw new Error(`Failed to create session: ${response.status} ${response.statusText}`);
  }

  const res = await response.json();
  return res.data;
}
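For reference, here is a minimal sketch of the backend endpoint called above, assuming an Express server with the AKool token supplied via an AKOOL_API_TOKEN environment variable (the endpoint path and names are illustrative):

import express from "express";

const app = express();
app.use(express.json());

app.post("/api/avatar/create-session", async (req, res) => {
  try {
    const { avatarId, duration } = req.body;

    // The AKool token stays on the server and is never exposed to the browser
    const response = await fetch("https://openapi.akool.com/api/open/v4/liveAvatar/session/create", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${process.env.AKOOL_API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ avatar_id: avatarId, duration }),
    });

    if (!response.ok) {
      return res.status(response.status).json({ error: "Failed to create session" });
    }

    // Forward only the session payload the client needs
    const result = await response.json();
    res.json(result.data);
  } catch (error) {
    res.status(500).json({ error: "Internal server error" });
  }
});

app.listen(3000);

This is also the place to add request validation, rate limiting, and usage tracking before forwarding the request, as recommended above.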
In practice, each message is limited to 1KB and the overall send rate to 6KB per second, so longer messages must be split into chunks and sent with rate limiting between them.
export async function sendMessageToAvatar(client: RTCClient, messageId: string, content: string) {
  const MAX_ENCODED_SIZE = 950;
  const BYTES_PER_SECOND = 6000;

  // Improved message encoder with proper typing
  const encodeMessage = (text: string, idx: number, fin: boolean): Uint8Array => {
    const message: StreamMessage = {
      v: 2,
      type: 'chat',
      mid: messageId,
      idx,
      fin,
      pld: {
        text,
      },
    };
    return new TextEncoder().encode(JSON.stringify(message));
  };

  // Validate inputs
  if (!content) {
    throw new Error('Content cannot be empty');
  }

  // Calculate maximum content length
  const baseEncoded = encodeMessage('', 0, false);
  const maxQuestionLength = Math.floor((MAX_ENCODED_SIZE - baseEncoded.length) / 4);

  // Split message into chunks
  const chunks: string[] = [];
  let remainingMessage = content;
  let chunkIndex = 0;

  while (remainingMessage.length > 0) {
    let chunk = remainingMessage.slice(0, maxQuestionLength);
    let encoded = encodeMessage(chunk, chunkIndex, false);

    // Binary search for optimal chunk size if needed
    while (encoded.length > MAX_ENCODED_SIZE && chunk.length > 1) {
      chunk = chunk.slice(0, Math.ceil(chunk.length / 2));
      encoded = encodeMessage(chunk, chunkIndex, false);
    }

    if (encoded.length > MAX_ENCODED_SIZE) {
      throw new Error('Message encoding failed: content too large for chunking');
    }

    chunks.push(chunk);
    remainingMessage = remainingMessage.slice(chunk.length);
    chunkIndex++;
  }

  log(`Splitting message into ${chunks.length} chunks`);

  // Send chunks with rate limiting
  for (let i = 0; i < chunks.length; i++) {
    const isLastChunk = i === chunks.length - 1;
    const encodedChunk = encodeMessage(chunks[i], i, isLastChunk);
    const chunkSize = encodedChunk.length;
    const minimumTimeMs = Math.ceil((1000 * chunkSize) / BYTES_PER_SECOND);
    const startTime = Date.now();

    log(`Sending chunk ${i + 1}/${chunks.length}, size=${chunkSize} bytes`);

    try {
      await client.sendStreamMessage(encodedChunk, false);
    } catch (error: unknown) {
      throw new Error(`Failed to send chunk ${i + 1}: ${error instanceof Error ? error.message : 'Unknown error'}`);
    }

    if (!isLastChunk) {
      const elapsedMs = Date.now() - startTime;
      const remainingDelay = Math.max(0, minimumTimeMs - elapsedMs);
      if (remainingDelay > 0) {
        await new Promise((resolve) => setTimeout(resolve, remainingDelay));
      }
    }
  }
}
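For example, once the client has joined the channel, a message can be sent like this (the message ID format is only illustrative):

// Assumes `client` is an RTCClient that has already joined the Agora channel
const messageId = `msg-${Date.now()}`;
await sendMessageToAvatar(client, messageId, "Hello, can you introduce yourself?");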
8. Video Interaction With The Avatar (coming soon)
Video interaction is currently under development and will be available in a future release. The following implementation details are provided as a reference for upcoming features.
To enable video interaction with the avatar, you’ll need to publish your local video stream:
// Note: This is a preview of upcoming functionality
async function publishVideo(client: IAgoraRTCClient) {
  // Create a camera video track
  const videoTrack = await AgoraRTC.createCameraVideoTrack();

  try {
    // Publish the video track to the channel
    await client.publish(videoTrack);
    console.log("Video publishing successful");
    return videoTrack;
  } catch (error) {
    console.error("Error publishing video:", error);
    throw error;
  }
}

// Example usage with video controls (Preview of upcoming features)
async function setupVideoInteraction(client: IAgoraRTCClient) {
  let videoTrack;

  // Start video
  async function startVideo() {
    try {
      videoTrack = await publishVideo(client);
      // Play the local video in a specific HTML element
      videoTrack.play('local-video-container');
    } catch (error) {
      console.error("Failed to start video:", error);
    }
  }

  // Stop video
  async function stopVideo() {
    if (videoTrack) {
      // Stop and close the video track
      videoTrack.stop();
      videoTrack.close();
      await client.unpublish(videoTrack);
      videoTrack = null;
    }
  }

  // Enable/disable video
  function toggleVideo(enabled: boolean) {
    if (videoTrack) {
      videoTrack.setEnabled(enabled);
    }
  }

  // Switch camera (if multiple cameras are available)
  async function switchCamera(deviceId: string) {
    if (videoTrack) {
      await videoTrack.setDevice(deviceId);
    }
  }

  return {
    startVideo,
    stopVideo,
    toggleVideo,
    switchCamera
  };
}
The upcoming video features will include:
Two-way video communication
Camera switching capabilities
Video quality controls
Integration with existing audio features
Stay tuned for updates on when video interaction becomes available.
Finally, here is how the pieces fit together in a single initialization flow:

async function initializeStreamingAvatar() {
  let client;
  let session;
  try {
    // Create session and get credentials
    session = await createSession();
    const { credentials } = session;

    // Initialize Agora client
    client = await initializeAgoraClient(credentials);

    // Subscribe to the audio and video stream of the avatar
    await subscribeToAvatarStream(client);

    // Set up message handlers
    setupMessageHandlers(client);

    // Example usage
    await sendMessageToAvatar(client, `msg-${Date.now()}`, "Hello!");

    // Or use your own LLM service
    await sendMessageToAvatarWithLLM(client, "Hello!");

    // Example of voice interaction
    await interruptAvatar(client);

    // Example of Audio Interaction With The Avatar
    await setupAudioInteraction(client);

    // Example of changing avatar parameters
    await setAvatarParams(client, {
      lang: "en",
      vid: "new_voice_id"
    });

    return client;
  } catch (error) {
    console.error('Error initializing streaming avatar:', error);
    if (client && session) {
      await cleanup(client, session._id);
    }
    throw error;
  }
}