Overview
The Streaming Avatar feature allows you to create interactive, real-time avatar experiences in your application. This guide provides a comprehensive walkthrough of integrating streaming avatars using the Agora SDK, including:
- Setting up real-time communication channels
- Handling avatar interactions and responses
- Managing audio streams
- Implementing cleanup procedures
- Optional LLM service integration
The integration uses Agora’s Real-Time Communication (RTC) SDK for reliable, low-latency streaming and our avatar service for generating responsive avatar behaviors.
Prerequisites
- Install the Agora SDK in your project:
npm install agora-rtc-sdk-ng
# or
yarn add agora-rtc-sdk-ng
- Import the required dependencies:
import AgoraRTC, { IAgoraRTCClient, IAgoraRTCRemoteUser } from "agora-rtc-sdk-ng";
- Add the hidden sendStreamMessage API of the Agora SDK
The Agora SDK does not expose sendStreamMessage in its public type definitions, so it must be declared manually. The method also has size and rate limits (noted below) that need to be handled carefully.
Add the method to your type definitions:
interface RTCClient extends IAgoraRTCClient {
sendStreamMessage(msg: Uint8Array | string, flag: boolean): Promise<void>;
}
Important: The Agora SDK has the following limitations:
- Maximum message size: 1KB
- Maximum throughput: 6KB per second
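Because of these limits, it can help to check a payload's encoded size before sending it. A minimal sketch (the helper name and the 1KB constant are illustrative, not part of the SDK):

// Sketch: verify an outgoing payload stays within the per-message limit
const MAX_STREAM_MESSAGE_BYTES = 1024;

function fitsInStreamMessage(payload: unknown): boolean {
  const encoded = new TextEncoder().encode(JSON.stringify(payload));
  return encoded.length <= MAX_STREAM_MESSAGE_BYTES;
}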
Integration Flow
Key Implementation Steps
1. Create a Live Avatar Session
Security Recommendation: We strongly recommend implementing session management through your backend server rather than directly in the browser. This approach:
- Protects your AKool API token from exposure
- Allows for proper request validation and rate limiting
- Enables usage tracking and monitoring
- Provides better control over session lifecycle
- Prevents unauthorized access to the API
First, create a session to obtain Agora credentials. While both browser and backend implementations are possible, the backend approach is recommended for security:
// Recommended: Backend Implementation
async function createSessionFromBackend(): Promise<Session> {
// Your backend endpoint that securely wraps the AKool API
const response = await fetch('https://your-backend.com/api/avatar/create-session', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
avatarId: "dvp_Tristan_cloth2_1080P",
duration: 600,
})
});
if (!response.ok) {
throw new Error('Failed to create session through backend');
}
return response.json();
}
// Not Recommended: Direct Browser Implementation
// Only use this for development/testing purposes
async function createSessionInBrowser(): Promise<Session> {
const response = await fetch('https://openapi.akool.com/api/open/v4/liveAvatar/session/create', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_TOKEN', // Security risk: Token exposed in browser
'Content-Type': 'application/json'
},
body: JSON.stringify({
avatar_id: "dvp_Tristan_cloth2_1080P",
duration: 600,
})
});
if (!response.ok) {
throw new Error(`Failed to create session: ${response.status} ${response.statusText}`);
}
const res = await response.json();
return res.data;
}
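The Session type used above is not defined in this guide; a shape consistent with the fields the other examples read (credentials for joining the channel, _id for closing the session) might look like this. This is a sketch inferred from the code in this guide, not an official type definition:

interface SessionCredentials {
  agora_app_id: string;
  agora_channel: string;
  agora_token: string;
  agora_uid: number;
}

interface Session {
  _id: string;                      // used later when closing the session
  credentials: SessionCredentials;  // passed to initializeAgoraClient below
}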
2. Initialize Agora Client
Create and configure the Agora client:
async function initializeAgoraClient(credentials) {
const client = AgoraRTC.createClient({
mode: 'rtc',
codec: 'vp8'
});
try {
await client.join(
credentials.agora_app_id,
credentials.agora_channel,
credentials.agora_token,
credentials.agora_uid
);
return client;
} catch (error) {
console.error('Error joining channel:', error);
throw error;
}
}
3. Subscribe to the Audio and Video Streams
Subscribe to the avatar's audio and video streams:
async function subscribeToAvatarStream(client: IAgoraRTCClient) {
const onUserPublish = async (user: IAgoraRTCRemoteUser, mediaType: 'video' | 'audio') => {
const remoteTrack = await client.subscribe(user, mediaType);
remoteTrack.play();
};
const onUserUnpublish = async (user: IAgoraRTCRemoteUser, mediaType: 'video' | 'audio') => {
await client.unsubscribe(user, mediaType);
};
client.on('user-published', onUserPublish);
client.on('user-unpublished', onUserUnpublish);
}
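If you want the avatar's video rendered inside a specific element rather than an automatically created one, the remote video track can be played into a container. A small variation of the publish handler above (the 'avatar-video-container' element id is an assumption for illustration):

const onUserPublish = async (user: IAgoraRTCRemoteUser, mediaType: 'video' | 'audio') => {
  await client.subscribe(user, mediaType);
  if (mediaType === 'video') {
    // Render the avatar's video inside an existing <div id="avatar-video-container">
    user.videoTrack?.play('avatar-video-container');
  } else {
    user.audioTrack?.play();
  }
};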
4. Set Up Message Handling
Configure message listeners to handle avatar responses:
function setupMessageHandlers(client: IAgoraRTCClient) {
let answer = '';
client.on('stream-message', (uid, message) => {
try {
// Stream messages arrive as binary data, so decode before parsing
const text = typeof message === 'string' ? message : new TextDecoder().decode(message);
const parsedMessage = JSON.parse(text);
if (parsedMessage.type === 'chat') {
const payload = parsedMessage.pld;
if (payload.from === 'bot') {
if (!payload.fin) {
answer += payload.text;
} else {
console.log('Avatar response:', answer);
answer = '';
}
} else if (payload.from === 'user') {
console.log('User message:', payload.text);
}
} else if (parsedMessage.type === 'command') {
if (parsedMessage.pld.code !== 1000) {
console.error('Command failed:', parsedMessage.pld.msg);
}
}
} catch (error) {
console.error('Error parsing message:', error);
}
});
}
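The handler above assumes incoming messages of roughly the following shapes. This is a sketch inferred from the fields the handler reads, not a complete protocol definition:

interface ChatStreamMessage {
  v?: number;
  type: 'chat';
  mid: string;
  pld: {
    from: 'bot' | 'user';
    text: string;
    fin?: boolean; // true on the final chunk of a bot answer
  };
}

interface CommandStreamMessage {
  v?: number;
  type: 'command';
  mid: string;
  pld: {
    code: number; // 1000 indicates success
    msg?: string;
  };
}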
5. Send Messages to Avatar
Implement functions to interact with the avatar:
async function sendMessageToAvatar(client: RTCClient, question: string) {
const message = {
v: 2,
type: "chat",
mid: `msg-${Date.now()}`,
idx: 0,
fin: true,
pld: {
text: question,
}
};
try {
await client.sendStreamMessage(JSON.stringify(message), false);
} catch (error) {
console.error('Error sending message:', error);
throw error;
}
}
Because each message is limited to 1KB and throughput to 6KB per second, longer messages must be split into chunks and sent with rate limiting:
interface StreamMessage {
v: number;
type: string;
mid: string;
idx: number;
fin: boolean;
pld: {
text: string;
};
}
export async function sendMessageToAvatar(client: RTCClient, messageId: string, content: string) {
const MAX_ENCODED_SIZE = 950;
const BYTES_PER_SECOND = 6000;
// Message encoder with proper typing
const encodeMessage = (text: string, idx: number, fin: boolean): Uint8Array => {
const message: StreamMessage = {
v: 2,
type: 'chat',
mid: messageId,
idx,
fin,
pld: {
text,
},
};
return new TextEncoder().encode(JSON.stringify(message));
};
// Validate inputs
if (!content) {
throw new Error('Content cannot be empty');
}
// Calculate maximum content length
const baseEncoded = encodeMessage('', 0, false);
const maxQuestionLength = Math.floor((MAX_ENCODED_SIZE - baseEncoded.length) / 4);
// Split message into chunks
const chunks: string[] = [];
let remainingMessage = content;
let chunkIndex = 0;
while (remainingMessage.length > 0) {
let chunk = remainingMessage.slice(0, maxQuestionLength);
let encoded = encodeMessage(chunk, chunkIndex, false);
// Binary search for optimal chunk size if needed
while (encoded.length > MAX_ENCODED_SIZE && chunk.length > 1) {
chunk = chunk.slice(0, Math.ceil(chunk.length / 2));
encoded = encodeMessage(chunk, chunkIndex, false);
}
if (encoded.length > MAX_ENCODED_SIZE) {
throw new Error('Message encoding failed: content too large for chunking');
}
chunks.push(chunk);
remainingMessage = remainingMessage.slice(chunk.length);
chunkIndex++;
}
console.log(`Splitting message into ${chunks.length} chunks`);
// Send chunks with rate limiting
for (let i = 0; i < chunks.length; i++) {
const isLastChunk = i === chunks.length - 1;
const encodedChunk = encodeMessage(chunks[i], i, isLastChunk);
const chunkSize = encodedChunk.length;
const minimumTimeMs = Math.ceil((1000 * chunkSize) / BYTES_PER_SECOND);
const startTime = Date.now();
console.log(`Sending chunk ${i + 1}/${chunks.length}, size=${chunkSize} bytes`);
try {
await client.sendStreamMessage(encodedChunk, false);
} catch (error: unknown) {
throw new Error(`Failed to send chunk ${i + 1}: ${error instanceof Error ? error.message : 'Unknown error'}`);
}
if (!isLastChunk) {
const elapsedMs = Date.now() - startTime;
const remainingDelay = Math.max(0, minimumTimeMs - elapsedMs);
if (remainingDelay > 0) {
await new Promise((resolve) => setTimeout(resolve, remainingDelay));
}
}
}
}
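Usage is the same as before apart from the explicit message id (assuming client is the RTCClient created earlier; the id format and placeholder text are just examples):

const messageId = `msg-${Date.now()}`;
const longQuestion = 'A prompt long enough to exceed the 1KB per-message limit...';
// The function splits the content into chunks and rate-limits the sends
await sendMessageToAvatar(client, messageId, longQuestion);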
6. Control Avatar Parameters
Implement functions to control avatar settings:
async function setAvatarParams(client: RTCClient, params: {
vid?: string;
lang?: string;
mode?: number;
bgurl?: string;
}) {
const message = {
v: 2,
type: 'command',
mid: `msg-${Date.now()}`,
pld: {
cmd: 'set-params',
data: params
}
};
await client.sendStreamMessage(JSON.stringify(message), false);
}
async function interruptAvatar(client: RTCClient) {
const message = {
v: 2,
type: 'command',
mid: `msg-${Date.now()}`,
pld: {
cmd: 'interrupt'
}
};
await client.sendStreamMessage(JSON.stringify(message), false);
}
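The parameter names above are the ones used elsewhere in this guide; a typed sketch of the params object, inferred from those usages (check the API reference for the authoritative list and value meanings):

interface AvatarParams {
  vid?: string;   // voice id, e.g. "new_voice_id" later in this guide
  lang?: string;  // language code, e.g. "en"
  mode?: number;  // 1 = retelling mode, used for the LLM integration below
  bgurl?: string; // background URL (inferred from the name)
}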
7. Audio Interaction With The Avatar
To enable audio interaction with the avatar, you’ll need to publish your local audio stream:
async function publishAudio(client: IAgoraRTCClient) {
// Create a microphone audio track
const audioTrack = await AgoraRTC.createMicrophoneAudioTrack();
try {
// Publish the audio track to the channel
await client.publish(audioTrack);
console.log("Audio publishing successful");
return audioTrack;
} catch (error) {
console.error("Error publishing audio:", error);
throw error;
}
}
// Example usage with audio controls
async function setupAudioInteraction(client: IAgoraRTCClient) {
let audioTrack;
// Start audio
async function startAudio() {
try {
audioTrack = await publishAudio(client);
} catch (error) {
console.error("Failed to start audio:", error);
}
}
// Stop audio
async function stopAudio() {
if (audioTrack) {
// Stop and close the audio track
audioTrack.stop();
audioTrack.close();
await client.unpublish(audioTrack);
audioTrack = null;
}
}
// Mute/unmute audio
function toggleAudio(muted: boolean) {
if (audioTrack) {
if (muted) {
audioTrack.setEnabled(false);
} else {
audioTrack.setEnabled(true);
}
}
}
return {
startAudio,
stopAudio,
toggleAudio
};
}
Now you can integrate audio controls into your application:
async function initializeWithAudio() {
try {
// Initialize avatar
const client = await initializeStreamingAvatar();
// Setup audio controls
const audioControls = await setupAudioInteraction(client);
// Start audio when needed
await audioControls.startAudio();
// Example of muting/unmuting
audioControls.toggleAudio(true); // mute
audioControls.toggleAudio(false); // unmute
// Stop audio when done
await audioControls.stopAudio();
} catch (error) {
console.error("Error initializing with audio:", error);
}
}
For more details about Agora’s audio functionality, refer to the Agora Web SDK Documentation.
8. Video Interaction With The Avatar (coming soon)
Video interaction is currently under development and will be available in a future release. The following implementation details are provided as a reference for upcoming features.
To enable video interaction with the avatar, you’ll need to publish your local video stream:
// Note: This is a preview of upcoming functionality
async function publishVideo(client: IAgoraRTCClient) {
// Create a camera video track
const videoTrack = await AgoraRTC.createCameraVideoTrack();
try {
// Publish the video track to the channel
await client.publish(videoTrack);
console.log("Video publishing successful");
return videoTrack;
} catch (error) {
console.error("Error publishing video:", error);
throw error;
}
}
// Example usage with video controls (Preview of upcoming features)
async function setupVideoInteraction(client: IAgoraRTCClient) {
let videoTrack;
// Start video
async function startVideo() {
try {
videoTrack = await publishVideo(client);
// Play the local video in a specific HTML element
videoTrack.play('local-video-container');
} catch (error) {
console.error("Failed to start video:", error);
}
}
// Stop video
async function stopVideo() {
if (videoTrack) {
// Stop and close the video track
videoTrack.stop();
videoTrack.close();
await client.unpublish(videoTrack);
videoTrack = null;
}
}
// Enable/disable video
function toggleVideo(enabled: boolean) {
if (videoTrack) {
videoTrack.setEnabled(enabled);
}
}
// Switch camera (if multiple cameras are available)
async function switchCamera(deviceId: string) {
if (videoTrack) {
await videoTrack.setDevice(deviceId);
}
}
return {
startVideo,
stopVideo,
toggleVideo,
switchCamera
};
}
The upcoming video features will include:
- Two-way video communication
- Camera switching capabilities
- Video quality controls
- Integration with existing audio features
Stay tuned for updates on when video interaction becomes available.
9. Integrating your own LLM service (optional)
You can integrate your own LLM service to process messages before sending them to the avatar. Here’s how to do it:
// Define the LLM service response interface
interface LLMResponse {
answer: string;
}
// Set the avatar to retelling mode
await setAvatarParams(client, {
mode: 1,
});
// Create a wrapper for your LLM service
async function processWithLLM(question: string): Promise<LLMResponse> {
try {
const response = await fetch('YOUR_LLM_SERVICE_ENDPOINT', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
question,
})
});
if (!response.ok) {
throw new Error('LLM service request failed');
}
return await response.json();
} catch (error) {
console.error('Error processing with LLM:', error);
throw error;
}
}
async function sendMessageToAvatarWithLLM(
client: RTCClient,
question: string
) {
try {
// Process the question with your LLM service
const llmResponse = await processWithLLM(question);
// Prepare the message with LLM response
const message = {
v: 2,
type: "chat",
mid: `msg-${Date.now()}`,
idx: 0,
fin: true,
pld: {
text: llmResponse.answer // Use the LLM-processed response
}
};
// Send the processed message to the avatar
await client.sendStreamMessage(JSON.stringify(message), false);
} catch (error) {
console.error('Error in LLM-enhanced message sending:', error);
throw error;
}
}
Remember to:
- Implement proper rate limiting for your LLM service
- Handle token limits appropriately
- Implement retry logic for failed LLM requests (see the sketch after this list)
- Consider implementing streaming responses if your LLM service supports it
- Cache common responses when appropriate
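For example, retry logic wrapped around processWithLLM might look like this (a sketch; the attempt count and backoff values are illustrative):

async function processWithLLMRetry(question: string, maxAttempts = 3): Promise<LLMResponse> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await processWithLLM(question);
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Simple exponential backoff between attempts
        await new Promise((resolve) => setTimeout(resolve, 500 * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastError instanceof Error ? lastError : new Error('LLM request failed after retries');
}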
10. Cleanup
As with session creation, cleanup can be performed either directly in the browser or through your backend:
// Browser Implementation
async function cleanupInBrowser(client: IAgoraRTCClient, sessionId: string) {
await fetch('https://openapi.akool.com/api/open/v4/liveAvatar/session/close', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_TOKEN', // Security risk: Token exposed in browser
'Content-Type': 'application/json'
},
body: JSON.stringify({
id: sessionId
})
});
await performClientCleanup(client);
}
// Backend Implementation
async function cleanupFromBackend(client: IAgoraRTCClient, sessionId: string) {
await fetch('https://your-backend.com/api/avatar/close-session', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
sessionId
})
});
await performClientCleanup(client);
}
// Shared cleanup logic
async function performClientCleanup(client: IAgoraRTCClient) {
// Remove event listeners
client.removeAllListeners('user-published');
client.removeAllListeners('user-unpublished');
client.removeAllListeners('stream-message');
// Stop audio/video and unpublish if they're still running
// (audioControls / videoControls are the objects returned by
// setupAudioInteraction / setupVideoInteraction, stored where this cleanup can reach them)
if (audioControls) {
await audioControls.stopAudio();
}
if (videoControls) {
await videoControls.stopVideo();
}
// Leave the Agora channel
await client.leave();
}
When implementing through your backend (a minimal sketch follows this list), make sure to:
- Securely store your AKool API token
- Implement proper authentication and rate limiting
- Handle errors appropriately
- Consider implementing session management and monitoring
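For reference, a minimal backend wrapper covering the two endpoints used in this guide might look like this. It is a sketch assuming Node 18+ (for the global fetch) and Express; the route paths mirror the URLs in the examples above, and AKOOL_API_TOKEN is an assumed environment variable:

import express from 'express';

const app = express();
app.use(express.json());

const AKOOL_BASE = 'https://openapi.akool.com/api/open/v4/liveAvatar';
const authHeaders = {
  'Authorization': `Bearer ${process.env.AKOOL_API_TOKEN}`, // token stays on the server
  'Content-Type': 'application/json',
};

// Wraps session creation so the browser never sees the AKool token
app.post('/api/avatar/create-session', async (req, res) => {
  try {
    const response = await fetch(`${AKOOL_BASE}/session/create`, {
      method: 'POST',
      headers: authHeaders,
      body: JSON.stringify({ avatar_id: req.body.avatarId, duration: req.body.duration }),
    });
    if (!response.ok) {
      res.status(502).json({ error: 'Failed to create session' });
      return;
    }
    const result = await response.json();
    res.json(result.data); // the browser receives the Session object
  } catch (error) {
    res.status(500).json({ error: 'Internal error' });
  }
});

// Wraps session closing
app.post('/api/avatar/close-session', async (req, res) => {
  try {
    const response = await fetch(`${AKOOL_BASE}/session/close`, {
      method: 'POST',
      headers: authHeaders,
      body: JSON.stringify({ id: req.body.sessionId }),
    });
    res.status(response.ok ? 200 : 502).json({ ok: response.ok });
  } catch (error) {
    res.status(500).json({ error: 'Internal error' });
  }
});

app.listen(3000);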
11. Putting It All Together
Here’s how to use all the components together:
async function initializeStreamingAvatar() {
let client;
let session;
try {
// Create session and get credentials (see step 1; the backend wrapper is recommended)
session = await createSessionFromBackend();
const { credentials } = session;
// Initialize Agora client
client = await initializeAgoraClient(credentials);
// Subscribe to the audio and video stream of the avatar
await subscribeToAvatarStream(client);
// Set up message handlers
setupMessageHandlers(client);
// Example usage
await sendMessageToAvatar(client, "Hello!");
// Or use your own LLM service
await sendMessageToAvatarWithLLM(client, "Hello!");
// Example of voice interaction
await interruptAvatar(client);
// Example of Audio Interaction With The Avatar
await setupAudioInteraction(client);
// Example of changing avatar parameters
await setAvatarParams(client, {
lang: "en",
vid: "new_voice_id"
});
return client;
} catch (error) {
console.error('Error initializing streaming avatar:', error);
if (client && session) {
await cleanupFromBackend(client, session._id);
}
throw error;
}
}
Additional Resources