Overview
The Streaming Avatar feature allows you to create interactive, real-time avatar experiences in your application. This guide provides a comprehensive walkthrough of integrating streaming avatars using the Agora SDK, including:
- Setting up real-time communication channels
- Handling avatar interactions and responses
- Managing audio streams
- Implementing cleanup procedures
- Optional LLM service integration
The integration uses Agora’s Real-Time Communication (RTC) SDK for reliable, low-latency streaming and our avatar service for generating responsive avatar behaviors.
Prerequisites
- Install the Agora SDK in your project:
npm install agora-rtc-sdk-ng
# or
yarn add agora-rtc-sdk-ng
- Import the required dependencies:
import AgoraRTC, { IAgoraRTCClient, IAgoraRTCRemoteUser } from "agora-rtc-sdk-ng";
- Add the hidden sendStreamMessage API of the Agora SDK
The Agora SDK does not expose sendStreamMessage in its public type definitions, so it must be declared manually. The method also has size and rate limits (noted below) that need to be handled carefully.
Add the method to your type definitions:
interface RTCClient extends IAgoraRTCClient {
sendStreamMessage(msg: Uint8Array | string, flag: boolean): Promise<void>;
}
Important: The Agora SDK has the following limitations:
- Maximum message size: 1KB
- Maximum throughput: 6KB per second
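Because of these limits, it can help to check a payload's encoded size before sending it. A minimal sketch (the helper name and the 1KB constant are illustrative, not part of the SDK):

// Sketch: verify an outgoing payload stays within the per-message limit
const MAX_STREAM_MESSAGE_BYTES = 1024;

function fitsInStreamMessage(payload: unknown): boolean {
  const encoded = new TextEncoder().encode(JSON.stringify(payload));
  return encoded.length <= MAX_STREAM_MESSAGE_BYTES;
}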
Integration Flow
Key Implementation Steps
1. Create a Live Avatar Session
Security Recommendation: We strongly recommend implementing session management through your backend server rather than directly in the browser. This approach:
- Protects your AKool API token from exposure
- Allows for proper request validation and rate limiting
- Enables usage tracking and monitoring
- Provides better control over session lifecycle
- Prevents unauthorized access to the API
First, create a session to obtain Agora credentials. While both browser and backend implementations are possible, the backend approach is recommended for security:
// Recommended: Backend Implementation
async function createSessionFromBackend(): Promise<Session> {
// Your backend endpoint that securely wraps the AKool API
const response = await fetch('https://your-backend.com/api/avatar/create-session', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
avatarId: "dvp_Tristan_cloth2_1080P",
duration: 600,
})
});
if (!response.ok) {
throw new Error('Failed to create session through backend');
}
return response.json();
}
// Not Recommended: Direct Browser Implementation
// Only use this for development/testing purposes
async function createSessionInBrowser(): Promise<Session> {
const response = await fetch('https://openapi.akool.com/api/open/v4/liveAvatar/session/create', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_TOKEN', // Security risk: Token exposed in browser
'Content-Type': 'application/json'
},
body: JSON.stringify({
avatar_id: "dvp_Tristan_cloth2_1080P",
duration: 600,
})
});
if (!response.ok) {
throw new Error(`Failed to create session: ${response.status} ${response.statusText}`);
}
const res = await response.json();
return res.data;
}
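The Session type used above is not defined in this guide; a shape consistent with the fields the other examples read (credentials for joining the channel, _id for closing the session) might look like this. This is a sketch inferred from the code in this guide, not an official type definition:

interface SessionCredentials {
  agora_app_id: string;
  agora_channel: string;
  agora_token: string;
  agora_uid: number;
}

interface Session {
  _id: string;                      // used later when closing the session
  credentials: SessionCredentials;  // passed to initializeAgoraClient below
}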
2. Initialize Agora Client
Create and configure the Agora client:
async function initializeAgoraClient(credentials) {
const client = AgoraRTC.createClient({
mode: 'rtc',
codec: 'vp8'
});
try {
await client.join(
credentials.agora_app_id,
credentials.agora_channel,
credentials.agora_token,
credentials.agora_uid
);
return client;
} catch (error) {
console.error('Error joining channel:', error);
throw error;
}
}
3. Subscribe to the Audio and Video Streams
Subscribe to the avatar's audio and video streams:
async function subscribeToAvatarStream(client: IAgoraRTCClient) {
const onUserPublish = async (user: IAgoraRTCRemoteUser, mediaType: 'video' | 'audio') => {
const remoteTrack = await client.subscribe(user, mediaType);
remoteTrack.play();
};
const onUserUnpublish = async (user: IAgoraRTCRemoteUser, mediaType: 'video' | 'audio') => {
await client.unsubscribe(user, mediaType);
};
client.on('user-published', onUserPublish);
client.on('user-unpublished', onUserUnpublish);
}
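If you want the avatar's video rendered inside a specific element rather than an automatically created one, the remote video track can be played into a container. A small variation of the publish handler above (the 'avatar-video-container' element id is an assumption for illustration):

const onUserPublish = async (user: IAgoraRTCRemoteUser, mediaType: 'video' | 'audio') => {
  await client.subscribe(user, mediaType);
  if (mediaType === 'video') {
    // Render the avatar's video inside an existing <div id="avatar-video-container">
    user.videoTrack?.play('avatar-video-container');
  } else {
    user.audioTrack?.play();
  }
};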
4. Set Up Message Handling
Configure message listeners to handle avatar responses:
function setupMessageHandlers(client: IAgoraRTCClient) {
let answer = '';
client.on('stream-message', (uid, message) => {
try {
// Stream messages arrive as binary data, so decode before parsing
const text = typeof message === 'string' ? message : new TextDecoder().decode(message);
const parsedMessage = JSON.parse(text);
if (parsedMessage.type === 'chat') {
const payload = parsedMessage.pld;
if (payload.from === 'bot') {
if (!payload.fin) {
answer += payload.text;
} else {
console.log('Avatar response:', answer);
answer = '';
}
} else if (payload.from === 'user') {
console.log('User message:', payload.text);
}
} else if (parsedMessage.type === 'command') {
if (parsedMessage.pld.code !== 1000) {
console.error('Command failed:', parsedMessage.pld.msg);
}
}
} catch (error) {
console.error('Error parsing message:', error);
}
});
}
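The handler above assumes incoming messages of roughly the following shapes. This is a sketch inferred from the fields the handler reads, not a complete protocol definition:

interface ChatStreamMessage {
  v?: number;
  type: 'chat';
  mid: string;
  pld: {
    from: 'bot' | 'user';
    text: string;
    fin?: boolean; // true on the final chunk of a bot answer
  };
}

interface CommandStreamMessage {
  v?: number;
  type: 'command';
  mid: string;
  pld: {
    code: number; // 1000 indicates success
    msg?: string;
  };
}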
5. Send Messages to Avatar
Implement functions to interact with the avatar:
async function sendMessageToAvatar(client: RTCClient, question: string) {
const message = {
v: 2,
type: "chat",
mid: `msg-${Date.now()}`,
idx: 0,
fin: true,
pld: {
text: question,
}
};
try {
await client.sendStreamMessage(JSON.stringify(message), false);
} catch (error) {
console.error('Error sending message:', error);
throw error;
}
}
Because each message is limited to 1KB and throughput to 6KB per second, longer messages must be split into chunks and sent with rate limiting:
interface StreamMessage {
v: number;
type: string;
mid: string;
idx: number;
fin: boolean;
pld: {
text: string;
};
}
export async function sendMessageToAvatar(client: RTCClient, messageId: string, content: string) {
const MAX_ENCODED_SIZE = 950;
const BYTES_PER_SECOND = 6000;
// Message encoder with proper typing
const encodeMessage = (text: string, idx: number, fin: boolean): Uint8Array => {
const message: StreamMessage = {
v: 2,
type: 'chat',
mid: messageId,
idx,
fin,
pld: {
text,
},
};
return new TextEncoder().encode(JSON.stringify(message));
};
// Validate inputs
if (!content) {
throw new Error('Content cannot be empty');
}
// Calculate maximum content length
const baseEncoded = encodeMessage('', 0, false);
const maxQuestionLength = Math.floor((MAX_ENCODED_SIZE - baseEncoded.length) / 4);
// Split message into chunks
const chunks: string[] = [];
let remainingMessage = content;
let chunkIndex = 0;
while (remainingMessage.length > 0) {
let chunk = remainingMessage.slice(0, maxQuestionLength);
let encoded = encodeMessage(chunk, chunkIndex, false);
// Binary search for optimal chunk size if needed
while (encoded.length > MAX_ENCODED_SIZE && chunk.length > 1) {
chunk = chunk.slice(0, Math.ceil(chunk.length / 2));
encoded = encodeMessage(chunk, chunkIndex, false);
}
if (encoded.length > MAX_ENCODED_SIZE) {
throw new Error('Message encoding failed: content too large for chunking');
}
chunks.push(chunk);
remainingMessage = remainingMessage.slice(chunk.length);
chunkIndex++;
}
console.log(`Splitting message into ${chunks.length} chunks`);
// Send chunks with rate limiting
for (let i = 0; i < chunks.length; i++) {
const isLastChunk = i === chunks.length - 1;
const encodedChunk = encodeMessage(chunks[i], i, isLastChunk);
const chunkSize = encodedChunk.length;
const minimumTimeMs = Math.ceil((1000 * chunkSize) / BYTES_PER_SECOND);
const startTime = Date.now();
console.log(`Sending chunk ${i + 1}/${chunks.length}, size=${chunkSize} bytes`);
try {
await client.sendStreamMessage(encodedChunk, false);
} catch (error: unknown) {
throw new Error(`Failed to send chunk ${i + 1}: ${error instanceof Error ? error.message : 'Unknown error'}`);
}
if (!isLastChunk) {
const elapsedMs = Date.now() - startTime;
const remainingDelay = Math.max(0, minimumTimeMs - elapsedMs);
if (remainingDelay > 0) {
await new Promise((resolve) => setTimeout(resolve, remainingDelay));
}
}
}
}
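Usage is the same as before apart from the explicit message id (assuming client is the RTCClient created earlier; the id format and placeholder text are just examples):

const messageId = `msg-${Date.now()}`;
const longQuestion = 'A prompt long enough to exceed the 1KB per-message limit...';
// The function splits the content into chunks and rate-limits the sends
await sendMessageToAvatar(client, messageId, longQuestion);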
6. Control Avatar Parameters
Implement functions to control avatar settings:
async function setAvatarParams(client: RTCClient, params: {
vid?: string;
lang?: string;
mode?: number;
bgurl?: string;
}) {
const message = {
v: 2,
type: 'command',
mid: `msg-${Date.now()}`,
pld: {
cmd: 'set-params',
data: params
}
};
await client.sendStreamMessage(JSON.stringify(message), false);
}
async function interruptAvatar(client: RTCClient) {
const message = {
v: 2,
type: 'command',
mid: `msg-${Date.now()}`,
pld: {
cmd: 'interrupt'
}
};
await client.sendStreamMessage(JSON.stringify(message), false);
}
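The parameter names above are the ones used elsewhere in this guide; a typed sketch of the params object, inferred from those usages (check the API reference for the authoritative list and value meanings):

interface AvatarParams {
  vid?: string;   // voice id, e.g. "new_voice_id" later in this guide
  lang?: string;  // language code, e.g. "en"
  mode?: number;  // 1 = retelling mode, used for the LLM integration below
  bgurl?: string; // background URL (inferred from the name)
}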
7. Audio Interaction With The Avatar
To enable audio interaction with the avatar, you’ll need to publish your local audio stream:
async function publishAudio(client: IAgoraRTCClient) {
// Create a microphone audio track
const audioTrack = await AgoraRTC.createMicrophoneAudioTrack();
try {
// Publish the audio track to the channel
await client.publish(audioTrack);
console.log("Audio publishing successful");
return audioTrack;
} catch (error) {
console.error("Error publishing audio:", error);
throw error;
}
}
// Example usage with audio controls
async function setupAudioInteraction(client: IAgoraRTCClient) {
let audioTrack;
// Start audio
async function startAudio() {
try {
audioTrack = await publishAudio(client);
} catch (error) {
console.error("Failed to start audio:", error);
}
}
// Stop audio
async function stopAudio() {
if (audioTrack) {
// Stop and close the audio track
audioTrack.stop();
audioTrack.close();
await client.unpublish(audioTrack);
audioTrack = null;
}
}
// Mute/unmute audio
function toggleAudio(muted: boolean) {
if (audioTrack) {
if (muted) {
audioTrack.setEnabled(false);
} else {
audioTrack.setEnabled(true);
}
}
}
return {
startAudio,
stopAudio,
toggleAudio
};
}
Now you can integrate audio controls into your application:
async function initializeWithAudio() {
try {
// Initialize avatar
const client = await initializeStreamingAvatar();
// Setup audio controls
const audioControls = await setupAudioInteraction(client);
// Start audio when needed
await audioControls.startAudio();
// Example of muting/unmuting
audioControls.toggleAudio(true); // mute
audioControls.toggleAudio(false); // unmute
// Stop audio when done
await audioControls.stopAudio();
} catch (error) {
console.error("Error initializing with audio:", error);
}
}
For more details about Agora’s audio functionality, refer to the Agora Web SDK Documentation.
8. Video Interaction With The Avatar (coming soon)
Video interaction is currently under development and will be available in a future release. The following implementation details are provided as a reference for upcoming features.
To enable video interaction with the avatar, you’ll need to publish your local video stream:
// Note: This is a preview of upcoming functionality
async function publishVideo(client: IAgoraRTCClient) {
// Create a camera video track
const videoTrack = await AgoraRTC.createCameraVideoTrack();
try {
// Publish the video track to the channel
await client.publish(videoTrack);
console.log("Video publishing successful");
return videoTrack;
} catch (error) {
console.error("Error publishing video:", error);
throw error;
}
}
// Example usage with video controls (Preview of upcoming features)
async function setupVideoInteraction(client: IAgoraRTCClient) {
let videoTrack;
// Start video
async function startVideo() {
try {
videoTrack = await publishVideo(client);
// Play the local video in a specific HTML element
videoTrack.play('local-video-container');
} catch (error) {
console.error("Failed to start video:", error);
}
}
// Stop video
async function stopVideo() {
if (videoTrack) {
// Stop and close the video track
videoTrack.stop();
videoTrack.close();
await client.unpublish(videoTrack);
videoTrack = null;
}
}
// Enable/disable video
function toggleVideo(enabled: boolean) {
if (videoTrack) {
videoTrack.setEnabled(enabled);
}
}
// Switch camera (if multiple cameras are available)
async function switchCamera(deviceId: string) {
if (videoTrack) {
await videoTrack.setDevice(deviceId);
}
}
return {
startVideo,
stopVideo,
toggleVideo,
switchCamera
};
}
The upcoming video features will include:
- Two-way video communication
- Camera switching capabilities
- Video quality controls
- Integration with existing audio features
Stay tuned for updates on when video interaction becomes available.
9. Integrating your own LLM service (optional)
You can integrate your own LLM service to process messages before sending them to the avatar. Here’s how to do it:
// Define the LLM service response interface
interface LLMResponse {
answer: string;
}
// Set the avatar to retelling mode
await setAvatarParams(client, {
mode: 1,
});
// Create a wrapper for your LLM service
async function processWithLLM(question: string): Promise<LLMResponse> {
try {
const response = await fetch('YOUR_LLM_SERVICE_ENDPOINT', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
question,
})
});
if (!response.ok) {
throw new Error('LLM service request failed');
}
return await response.json();
} catch (error) {
console.error('Error processing with LLM:', error);
throw error;
}
}
async function sendMessageToAvatarWithLLM(
client: RTCClient,
question: string
) {
try {
// Process the question with your LLM service
const llmResponse = await processWithLLM(question);
// Prepare the message with LLM response
const message = {
v: 2,
type: "chat",
mid: `msg-${Date.now()}`,
idx: 0,
fin: true,
pld: {
text: llmResponse.answer // Use the LLM-processed response
}
};
// Send the processed message to the avatar
await client.sendStreamMessage(JSON.stringify(message), false);
} catch (error) {
console.error('Error in LLM-enhanced message sending:', error);
throw error;
}
}
Remember to:
- Implement proper rate limiting for your LLM service
- Handle token limits appropriately
- Implement retry logic for failed LLM requests (see the sketch after this list)
- Consider implementing streaming responses if your LLM service supports it
- Cache common responses when appropriate
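For example, retry logic wrapped around processWithLLM might look like this (a sketch; the attempt count and backoff values are illustrative):

async function processWithLLMRetry(question: string, maxAttempts = 3): Promise<LLMResponse> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await processWithLLM(question);
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Simple exponential backoff between attempts
        await new Promise((resolve) => setTimeout(resolve, 500 * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastError instanceof Error ? lastError : new Error('LLM request failed after retries');
}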
10. Cleanup
As with session creation, cleanup can be performed either directly in the browser or through your backend:
// Browser Implementation
async function cleanupInBrowser(client: IAgoraRTCClient, sessionId: string) {
await fetch('https://openapi.akool.com/api/open/v4/liveAvatar/session/close', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_TOKEN', // Security risk: Token exposed in browser
'Content-Type': 'application/json'
},
body: JSON.stringify({
id: sessionId
})
});
await performClientCleanup(client);
}
// Backend Implementation
async function cleanupFromBackend(client: IAgoraRTCClient, sessionId: string) {
await fetch('https://your-backend.com/api/avatar/close-session', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
sessionId
})
});
await performClientCleanup(client);
}
// Shared cleanup logic
async function performClientCleanup(client: IAgoraRTCClient) {
// Remove event listeners
client.removeAllListeners('user-published');
client.removeAllListeners('user-unpublished');
client.removeAllListeners('stream-message');
// Stop audio/video and unpublish if they're still running
// (audioControls / videoControls are the objects returned by
// setupAudioInteraction / setupVideoInteraction, stored where this cleanup can reach them)
if (audioControls) {
await audioControls.stopAudio();
}
if (videoControls) {
await videoControls.stopVideo();
}
// Leave the Agora channel
await client.leave();
}
When implementing through your backend (a minimal sketch follows this list), make sure to:
- Securely store your AKool API token
- Implement proper authentication and rate limiting
- Handle errors appropriately
- Consider implementing session management and monitoring
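For reference, a minimal backend wrapper covering the two endpoints used in this guide might look like this. It is a sketch assuming Node 18+ (for the global fetch) and Express; the route paths mirror the URLs in the examples above, and AKOOL_API_TOKEN is an assumed environment variable:

import express from 'express';

const app = express();
app.use(express.json());

const AKOOL_BASE = 'https://openapi.akool.com/api/open/v4/liveAvatar';
const authHeaders = {
  'Authorization': `Bearer ${process.env.AKOOL_API_TOKEN}`, // token stays on the server
  'Content-Type': 'application/json',
};

// Wraps session creation so the browser never sees the AKool token
app.post('/api/avatar/create-session', async (req, res) => {
  try {
    const response = await fetch(`${AKOOL_BASE}/session/create`, {
      method: 'POST',
      headers: authHeaders,
      body: JSON.stringify({ avatar_id: req.body.avatarId, duration: req.body.duration }),
    });
    if (!response.ok) {
      res.status(502).json({ error: 'Failed to create session' });
      return;
    }
    const result = await response.json();
    res.json(result.data); // the browser receives the Session object
  } catch (error) {
    res.status(500).json({ error: 'Internal error' });
  }
});

// Wraps session closing
app.post('/api/avatar/close-session', async (req, res) => {
  try {
    const response = await fetch(`${AKOOL_BASE}/session/close`, {
      method: 'POST',
      headers: authHeaders,
      body: JSON.stringify({ id: req.body.sessionId }),
    });
    res.status(response.ok ? 200 : 502).json({ ok: response.ok });
  } catch (error) {
    res.status(500).json({ error: 'Internal error' });
  }
});

app.listen(3000);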
11. Putting It All Together
Here’s how to use all the components together:
async function initializeStreamingAvatar() {
let client;
let session;
try {
// Create session and get credentials (see step 1; the backend wrapper is recommended)
session = await createSessionFromBackend();
const { credentials } = session;
// Initialize Agora client
client = await initializeAgoraClient(credentials);
// Subscribe to the audio and video stream of the avatar
await subscribeToAvatarStream(client);
// Set up message handlers
setupMessageHandlers(client);
// Example usage
await sendMessageToAvatar(client, "Hello!");
// Or use your own LLM service
await sendMessageToAvatarWithLLM(client, "Hello!");
// Example of voice interaction
await interruptAvatar(client);
// Example of Audio Interaction With The Avatar
await setupAudioInteraction(client);
// Example of changing avatar parameters
await setAvatarParams(client, {
lang: "en",
vid: "new_voice_id"
});
return client;
} catch (error) {
console.error('Error initializing streaming avatar:', error);
if (client && session) {
await cleanupFromBackend(client, session._id);
}
throw error;
}
}
Additional Resources