You can copy both the avatar_id and the voice_id directly from the web interface. You can also create and manage your streaming avatars on the web platform: https://akool.com/apps/upload/avatar?from=%2Fapps%2Fstreaming-avatar%2Fedit
The resources (image, video, voice) generated by our API are valid for 7 days.
Please save the relevant resources as soon as possible to prevent expiration.
To experience our live avatar streaming feature in action, explore our demo built on the Agora streaming service: AKool Streaming Avatar React Demo.
Knowledge Base Integration: You can enhance your streaming avatar with contextual AI responses by integrating a Knowledge Base. When creating a session, provide a knowledge_id parameter to enable the AI to use documents and URLs from your knowledge base for more accurate and relevant responses.
API Endpoints
Avatar Management
- Upload Streaming Avatar - Create a new streaming avatar from a video URL
- Get Avatar List - Retrieve a list of all streaming avatars
- Get Avatar Detail - Get detailed information about a specific avatar
Session Management
- Create Session - Create a new streaming avatar session
- Get Session Detail - Retrieve detailed information about a specific session
- Close Session - Close an active streaming avatar session
- Get Session List - Retrieve a list of all streaming avatar sessions
Live Avatar Stream Message
| Parameter | Type | Required | Value | Description | 
|---|---|---|---|---|
| v | Number | Yes | 2 | Version of the message | 
| type | String | Yes | chat | Message type for chat interactions | 
| mid | String | Yes | | Unique message identifier for conversation tracking |
| idx | Number | Yes | | Sequential index of the message part, starting from 0 |
| fin | Boolean | Yes | | Indicates if this is the final part of the message |
| pld | Object | Yes | | Container for message payload |
| pld.text | String | Yes | | Text content to send to avatar (e.g. “Hello”) |
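The chat message fields above can be sketched as a small builder. This is a minimal illustration, not the official client: the helper name `build_chat_messages`, the 200-character chunk size, and the UUID-based `mid` are assumptions, as is the idea that all parts of one message share a single `mid`. The table only requires `mid` to be unique, `idx` to start at 0, and `fin` to mark the final part. Sending the resulting JSON strings is left to your streaming SDK.

```python
import json
import uuid


def build_chat_messages(text: str, max_chars: int = 200) -> list[str]:
    """Split `text` into JSON-encoded chat message parts (hypothetical helper)."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    # Assumption: every part of one message carries the same `mid`.
    mid = f"msg-{uuid.uuid4().hex}"
    # An empty text still produces a single, final part.
    chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)] or [""]
    messages = []
    for idx, chunk in enumerate(chunks):
        messages.append(json.dumps({
            "v": 2,
            "type": "chat",
            "mid": mid,
            "idx": idx,                     # sequential index, starting from 0
            "fin": idx == len(chunks) - 1,  # True only on the last part
            "pld": {"text": chunk},
        }))
    return messages
```

Calling `build_chat_messages("Hello")` yields one message with `idx` 0 and `fin` set to true; longer texts yield several parts that a receiver can order by `idx`.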

Command message parameters
| Parameter | Type | Required | Value | Description |
|---|---|---|---|---|
| v | Number | Yes | 2 | Protocol version number |
| type | String | Yes | command | Specifies this is a system command message |
| mid | String | Yes | | Unique ID to track and correlate command messages |
| pld | Object | Yes | | Contains the command details and parameters |
| pld.cmd | String | Yes | | Command action to execute. Valid values: “set-params” (update avatar settings), “interrupt” (stop current avatar response) |
| pld.data | Object | No | | Parameters for the command (required for “set-params”) |
| pld.data.vid | String | No | | Deprecated. Use pld.data.vparams.vid instead. Voice ID to change avatar’s voice. Only used with “set-params”. Get valid IDs from Voice List API |
| pld.data.vurl | String | No | | Deprecated. Use pld.data.vparams.vurl instead. Custom voice model URL. Only used with “set-params”. Get valid URLs from Voice List API |
| pld.data.lang | String | No | | Language code for avatar responses (e.g. “en”, “es”). Only used with “set-params”. Get valid codes from Language List API |
| pld.data.mode | Number | No | | Avatar interaction style. Only used with “set-params”. 1 = Retelling (avatar repeats content), 2 = Dialogue (avatar engages in conversation) |
| pld.data.bgurl | String | No | | URL of background image/video for avatar scene. Only used with “set-params” |
| pld.data.vparams | Object | No | | Voice parameters to use for the session. |

Voice parameters (fields of pld.data.vparams)
| Parameter | Type | Required | Value | Description |
|---|---|---|---|---|
| vid | String | No | | Voice ID to change avatar’s voice. Only used with “set-params”. Get valid IDs from Voice List API |
| vurl | String | No | | Custom voice model URL. Only used with “set-params”. Get valid URLs from Voice List API |
| speed | double | No | 1 | Controls the speed of the generated speech. Values range from 0.8 to 1.2, with 1.0 being the default speed. |
| pron_map | Object | No | | Pronunciation mapping for custom words. Example: “pron_map”: { “akool” : “ai ku er” } |
| stt_type | String | No | | Speech-to-text type. "openai_realtime" = OpenAI Realtime |
| turn_detection | Object | No | | Turn detection configuration. |

Turn detection parameters (fields of turn_detection)
| Parameter | Type | Required | Value | Description |
|---|---|---|---|---|
| type | String | No | "server_vad" | Turn detection type. "server_vad" = Server VAD, "semantic_vad" = Semantic VAD |
| threshold | Number | No | 0.5 | Activation threshold (0 to 1). A higher threshold will require louder audio to activate the model, and thus might perform better in noisy environments. Available when type is "server_vad". | 
| prefix_padding_ms | Number | No | 300 | Amount of audio (in milliseconds) to include before the VAD detected speech. Available when type is "server_vad". | 
| silence_duration_ms | Number | No | 500 | Duration of silence (in milliseconds) to detect speech stop. With shorter values turns will be detected more quickly. Available when type is "server_vad". | 
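As an illustration of the command fields above, the following sketch builds a “set-params” and an “interrupt” message. The helper name `build_command`, the UUID-based `mid`, and the placeholder voice ID are assumptions. Placing `turn_detection` inside `vparams` follows the order in which the tables list it, so treat that nesting as an assumption as well.

```python
import json
import uuid
from typing import Any, Optional


def build_command(cmd: str, data: Optional[dict[str, Any]] = None) -> str:
    """Return a JSON-encoded command message (hypothetical helper)."""
    if cmd not in ("set-params", "interrupt"):
        raise ValueError(f"unsupported command: {cmd}")
    if cmd == "set-params" and not data:
        # pld.data is documented as required for "set-params".
        raise ValueError("set-params requires a data object")
    pld: dict[str, Any] = {"cmd": cmd}
    if data:
        pld["data"] = data
    return json.dumps({
        "v": 2,
        "type": "command",
        "mid": f"cmd-{uuid.uuid4().hex}",
        "pld": pld,
    })


# "your-voice-id" is a placeholder; use an ID from the Voice List API.
set_params = build_command("set-params", {
    "lang": "en",
    "mode": 2,  # 2 = Dialogue
    "vparams": {
        "vid": "your-voice-id",
        "speed": 1.0,  # documented range: 0.8 to 1.2
        # Assumed nesting; the values shown are the documented defaults.
        "turn_detection": {
            "type": "server_vad",
            "threshold": 0.5,
            "prefix_padding_ms": 300,
            "silence_duration_ms": 500,
        },
    },
})
interrupt = build_command("interrupt")
```

The “interrupt” message carries no `data` object, which matches `pld.data` being optional for that command.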

Chat messages received from the avatar stream
| Parameter | Type | Value | Description |
|---|---|---|---|
| v | Number | 2 | Version of the message |
| type | String | chat | Message type for chat interactions |
| mid | String | | Unique message identifier for tracking conversation flow |
| idx | Number | | Sequential index of the message part |
| fin | Boolean | | Indicates if this is the final part of the response |
| pld | Object | | Container for message payload |
| pld.from | String | “bot” or “user” | Source of the message: “bot” for avatar responses, “user” for speech recognition input |
| pld.text | String | | Text content of the message |
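Because received chat messages can arrive in parts, a client typically needs to reassemble them. The sketch below is one possible approach, assuming that parts of the same message share a `mid` and that the part with `fin` set to true arrives last. The class name `ChatAssembler` is hypothetical, and parts are keyed by both `mid` and `pld.from` in case user and bot messages reuse an identifier.

```python
import json
from collections import defaultdict
from typing import Optional


class ChatAssembler:
    """Collects chat message parts and returns the full text once `fin` arrives."""

    def __init__(self) -> None:
        # (mid, sender) -> {idx: text}
        self._parts: dict[tuple[str, str], dict[int, str]] = defaultdict(dict)

    def feed(self, raw: str) -> Optional[tuple[str, str]]:
        """Return (sender, full_text) when a message completes, otherwise None."""
        msg = json.loads(raw)
        if msg.get("type") != "chat":
            return None  # command messages are handled elsewhere
        pld = msg["pld"]
        key = (msg["mid"], pld["from"])
        self._parts[key][msg["idx"]] = pld.get("text", "")
        if not msg["fin"]:
            return None
        parts = self._parts.pop(key)
        # Join the parts in index order, regardless of arrival order.
        text = "".join(parts[i] for i in sorted(parts))
        return pld["from"], text
```

A caller would pass every incoming stream message to `feed` and act only when it returns a tuple, for example to show the avatar's full reply or the recognized user speech.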

Command responses received from the server
| Parameter | Type | Value | Description |
|---|---|---|---|
| v | Number | 2 | Version of the message |
| type | String | command | Message type for system commands |
| mid | String | | Unique identifier for tracking related messages in a conversation |
| pld | Object | | Container for command payload |
| pld.cmd | String | “set-params”, “interrupt” | Command to execute: “set-params” to update avatar settings, “interrupt” to stop current response |
| pld.code | Number | 1000 | Response code from the server, 1000 indicates success |
| pld.msg | String | | Response message from the server |
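One way to use `mid` and `pld.code` together is to remember each command you send and match it against the response. This sketch assumes the response echoes the `mid` of the command it answers, which the tables suggest but do not state outright. The class name `CommandTracker` is hypothetical.

```python
import json
from typing import Optional


class CommandTracker:
    """Matches command responses to previously sent commands by `mid`."""

    def __init__(self) -> None:
        self._pending: dict[str, str] = {}  # mid -> cmd

    def sent(self, raw: str) -> None:
        """Record a command message just before it is sent."""
        msg = json.loads(raw)
        self._pending[msg["mid"]] = msg["pld"]["cmd"]

    def received(self, raw: str) -> Optional[tuple[str, bool, str]]:
        """Return (cmd, succeeded, server_message) for a known command, else None."""
        msg = json.loads(raw)
        if msg.get("type") != "command" or msg.get("mid") not in self._pending:
            return None
        cmd = self._pending.pop(msg["mid"])
        pld = msg["pld"]
        # 1000 is the documented success code.
        return cmd, pld.get("code") == 1000, pld.get("msg", "")
```

Responses with an unknown `mid` are ignored here; a stricter client might log them instead.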
Integrating Your Own LLM Service
To use your own LLM, make an HTTP request to your LLM service before dispatching a message to the WebSocket.
Response Code Description
If the response code is not 1000, the request has failed.
| Parameter | Value | Description | 
|---|---|---|
| code | 1000 | Success | 
| code | 1003 | Parameter error or parameter cannot be empty |
| code | 1008 | The requested content does not exist |
| code | 1009 | You do not have permission to operate |
| code | 1101 | Invalid authorization or the request token has expired |
| code | 1102 | Authorization cannot be empty |
| code | 1200 | The account has been banned |
| code | 1201 | Create audio error, please try again later |
| code | 1202 | The same video cannot be lip-sync translated into the same language more than once |
| code | 1203 | The video must contain audio |
| code | 1204 | Your video duration exceeds 60s |
| code | 1205 | Create video error, please try again later |
| code | 1207 | The video you are using exceeds the size limit allowed by the system (300M) |
| code | 1209 | Please upload a video in another encoding format |
| code | 1210 | The video you are using exceeds the frame rate allowed by the system (60fps) |
| code | 1211 | Create lipsync error, please try again later |
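The response codes above can be handled with a small check on each parsed API response. This is a sketch under assumptions: it expects the response body to be a JSON object with a top-level `code` field and, optionally, a `msg` field, and the names `ApiError` and `ensure_success` are hypothetical. Marking 1201, 1205, and 1211 as retryable reflects only that their descriptions say “please try again later”.

```python
from typing import Any

SUCCESS_CODE = 1000
# Codes whose descriptions say "please try again later".
RETRYABLE_CODES = {1201, 1205, 1211}


class ApiError(Exception):
    """Raised when a response code is anything other than 1000."""

    def __init__(self, code: Any, message: str) -> None:
        super().__init__(f"API error {code}: {message}")
        self.code = code
        self.retryable = code in RETRYABLE_CODES


def ensure_success(body: dict[str, Any]) -> dict[str, Any]:
    """Return `body` unchanged on success, otherwise raise ApiError."""
    code = body.get("code")
    if code == SUCCESS_CODE:
        return body
    # `msg` is an assumed field name for the server's error text.
    raise ApiError(code, str(body.get("msg", "unknown error")))
```

A caller can catch `ApiError` and retry with a delay only when `retryable` is true, while treating codes such as 1101 or 1102 as a signal to refresh credentials.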