Documentation Index Fetch the complete documentation index at: https://docs.akool.com/llms.txt
Use this file to discover all available pages before exploring further.
The resources (image, video, voice) generated by our API are valid for 7 days.
Please save the relevant resources as soon as possible to prevent expiration.
Knowledge Base Integration: You can enhance your streaming avatar with contextual AI responses by integrating a Knowledge Base . When creating a session, provide a knowledge_id parameter to enable the AI to use documents and URLs from your knowledge base for more accurate and relevant responses.
API Endpoints
Avatar Management
Session Management
Live Avatar Stream Message
IAgoraRTCClient . on ( event : "stream-message" , listener : ( uid : UID , pld : Uint8Array ) => void )
IAgoraRTCClient . sendStreamMessage ( msg : Uint8Array | string , flag : boolean ): Promise < void > ;
Send Data
Chat Type Parameters
Parameter Type Required Value Description v Number Yes 2 Version of the message type String Yes chat Message type for chat interactions mid String Yes Unique message identifier for conversation tracking idx Number Yes Sequential index of the message, start from 0 fin Boolean Yes Indicates if this is the final part of the message pld Object Yes Container for message payload pld.text String Yes Text content to send to avatar (e.g. “Hello”)
Command Type Parameters
Parameter Type Required Value Description v Number Yes 2 Protocol version number type String Yes command Specifies this is a system command message mid String Yes Unique ID to track and correlate command messages pld Object Yes Contains the command details and parameters pld.cmd String Yes Command action to execute. Valid values: “set-params” (update avatar settings), “interrupt” (stop current avatar response) pld.data Object No Parameters for the command (required for “set-params” ) pld.data.vid String No Deprecated. Use pld.data.vparams.vid instead. Voice ID to change avatar’s voice. Only used with “set-params” . Get valid IDs from Voice List API pld.data.vurl String No Deprecated. Use pld.data.vparams.vurl instead. Custom voice model URL. Only used with “set-params” . Get valid URLs from Voice List API pld.data.lang String No Language code for avatar responses (e.g. “en”, “es”). Only used with “set-params” . Get valid codes from Language List API pld.data.mode Number No Avatar interaction style. Only used with “set-params” . “1” = Retelling (avatar repeats content), “2” = Dialogue (avatar engages in conversation) pld.data.bgurl String No URL of background image/video for avatar scene. Only used with “set-params” pld.data.vparams Object No Voice parameters to use for the session.
Voice Parameters
Parameter Type Required Value Description vid String No Voice ID to change avatar’s voice. Only used with “set-params” . Get valid IDs from Voice List API vurl String No Custom voice model URL. Only used with “set-params” . Get valid URLs from Voice List API speed double No 1 Controls the speed of the generated speech. Values range from 0.8 to 1.2, with 1.0 being the default speed. stt_language String No Language code for speech-to-text recognition to improve accuracy by using language-specific models pron_map Object No Pronunciation mapping for custom words. Example: “pron_map”: { “akool” : “ai ku er” } stt_type String No Speech-to-text type. "openai_realtime" = OpenAI Realtime turn_detection Object No Turn detection configuration.
Turn Detection Configuration
Parameter Type Required Value Description type String No ”server_vad” Turn detection type. "server_vad" = Server VAD, "semantic_vad" = Semantic VAD threshold Number No 0.5 Activation threshold (0 to 1). A higher threshold will require louder audio to activate the model, and thus might perform better in noisy environments. Available when type is "server_vad". prefix_padding_ms Number No 300 Amount of audio (in milliseconds) to include before the VAD detected speech. Available when type is "server_vad". silence_duration_ms Number No 500 Duration of silence (in milliseconds) to detect speech stop. With shorter values turns will be detected more quickly. Available when type is "server_vad".
JSON Example
Chat Request
Set Avatar Params
Interrupt Response
Set Avatar Actions
{
"v" : 2 ,
"type" : "chat" ,
"mid" : "msg-1723629433573" ,
"idx" : 0 ,
"fin" : true ,
"pld" : {
"text" : "Hello"
},
}
Receive Data
Chat Type Parameters
Parameter Type Value Description v Number 2 Version of the message type String chat Message type for chat interactions mid String Unique message identifier for tracking conversation flow idx Number Sequential index of the message part fin Boolean Indicates if this is the final part of the response pld Object Container for message payload pld.from String ”bot” or “user” Source of the message - “bot” for avatar responses, “user” for speech recognition input pld.text String Text content of the message
Command Type Parameters
Parameter Type Value Description v Number 2 Version of the message type String command Message type for system commands mid String Unique identifier for tracking related messages in a conversation pld Object Container for command payload pld.cmd String ”set-params”, “interrupt” Command to execute: “set-params” to update avatar settings, “interrupt” to stop current response pld.code Number 1000 Response code from the server, 1000 indicates success pld.msg String Response message from the server
JSON Example
Chat Response
Command Response
{
"v" : 2 ,
"type" : "chat" ,
"mid" : "msg-1723629433573" ,
"idx" : 0 ,
"fin" : true ,
"pld" : {
"from" : "bot" ,
"text" : "Hello! How can I assist you today? "
}
}
Typescript Example
Create Client
Send Message
Set Avatar Params
Interrupt Response
const client : IAgoraRTCClient = AgoraRTC . createClient ({
mode: 'rtc' ,
codec: 'vp8' ,
});
client . join ( agora_app_id , agora_channel , agora_token , agora_uid );
client . on ( 'stream-message' , ( message : Uint8Array | string ) => {
console . log ( 'received: %s' , message );
});
Integrating Your Own LLM Service
Before dispatching a message to the WebSocket, consider executing an HTTP request to your LLM service.
const client : IAgoraRTCClient = AgoraRTC . createClient ({
mode: 'rtc' ,
codec: 'vp8' ,
});
client . join ( agora_app_id , agora_channel , agora_token , agora_uid );
client . on ( 'stream-message' , ( message : Uint8Array | string ) => {
console . log ( 'received: %s' , message );
});
let inputMessage = 'hello' ;
try {
const response = await fetch ( 'https://your-backend-host/api/llm/answer' , {
method: "POST" ,
headers: {
"Content-Type" : "application/json" ,
},
body: JSON . stringify ({
question: inputMessage ,
}),
});
if ( response . ok ) {
const result = await response . json ();
inputMessage = result . answer ;
} else {
console . error ( "Failed to fetch from backend" , response . statusText );
}
} catch ( error ) {
console . error ( "Error during fetch operation" , error );
}
const message = {
v: 2 ,
type: "chat" ,
mid: "msg-1723629433573" ,
idx: 0 ,
fin: true ,
pld: {
text: inputMessage ,
},
};
client . sendStreamMessage ( JSON . stringify ( message ), false );
Response Code Description
Please note that if the value of the response code is not equal to 1000, the request
is failed or wrong
Parameter Value Description code 1000 Success code 1003 Parameter error or Parameter can not beempty code 1008 The content you get does not exist code 1009 Youdo not have permission to operate code 1101 Invalid authorization or Therequest token has expired code 1102 Authorization cannot be empty code 1200 The account has been banned code 1201 create audio error, pleasetry again later code 1202 The same video cannot be translated lipSync inthe same language more than 1 times code 1203 video should be with audio code 1204 Your video duration is exceed 60s! code 1205 Create videoerror, please try again later code 1207 The video you are using exceeds thesize limit allowed by the system by 300M code 1209 Please upload a videoin another encoding format code 1210 The video you are using exceeds thevalue allowed by the system by 60fp code 1211 Create lipsync error, pleasetry again later