Detect faces in images and videos with high accuracy. Get bounding boxes, 6-point landmarks, cropped face images, and face tracking for video content.
API Endpoints
Face Detection Operations
- Detect Faces - Unified endpoint for face detection in images and videos (auto-detects media type)
- Analyze Frames - Multi-frame face analysis with person deduplication for face swap preparation
Getting Started
Basic Workflow
- For Image Face Detection:
  - Call the Detect Faces API with an image URL or base64-encoded image
  - Only the `url` or `img` parameter is required (no need for `num_frames`)
  - Receive bounding boxes and 6-point landmarks for all detected faces
  - Optionally get cropped face image URLs with `return_face_url=true`
  - Use the landmark data for downstream tasks (e.g., face swap, face recognition)
- For Video Face Detection:
  - Call the Detect Faces API with a video URL
  - Specify the `num_frames` parameter to control how many frames to analyze (default: 5)
  - Get face tracking data across frames, including the positions of faces that are no longer visible
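The two workflows differ only in the request parameters. The following is a minimal sketch of building the request body, assuming a JSON body whose keys match the parameter names documented on this page (`url`, `img`, `num_frames`, `return_face_url`, `single_face`). The helper name `build_detect_payload` is hypothetical, and the endpoint URL and authentication are left out because this page does not specify them.

```python
from typing import Optional


def build_detect_payload(
    url: Optional[str] = None,
    img: Optional[str] = None,
    num_frames: Optional[int] = None,
    return_face_url: bool = False,
    single_face: bool = False,
) -> dict:
    """Build a Detect Faces request body (hypothetical helper, assumed JSON keys)."""
    if not url and not img:
        raise ValueError("Either 'url' or 'img' parameter must be provided")
    # Exactly one input mode is sent: URL takes precedence over base64 data.
    payload: dict = {"url": url} if url else {"img": img}
    # num_frames only matters for videos, so it is sent only when given.
    if num_frames is not None:
        payload["num_frames"] = num_frames
    if return_face_url:
        payload["return_face_url"] = True
    if single_face:
        payload["single_face"] = True
    return payload


image_payload = build_detect_payload(
    url="https://example.com/portrait.jpg", return_face_url=True
)
video_payload = build_detect_payload(url="https://example.com/clip.mp4", num_frames=5)
```

The example URLs are placeholders. Omitting `num_frames` for images keeps the request minimal, since the parameter is not needed there.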
Response Code Description
Error code 0 indicates success. Any non-zero error code indicates a failure.
Check the error_msg field for detailed error information.
| Code | Description |
|---|---|
| 0 | Success |
| 1 | Error - Check error_msg for details |
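As a sketch of how a client might apply this rule, the snippet below treats any non-zero `error_code` as a failure and surfaces `error_msg`. The exception class and function name are hypothetical, and the response is assumed to be already parsed into a Python dictionary.

```python
class FaceDetectionError(Exception):
    """Raised when the API reports a non-zero error_code (hypothetical class)."""


def check_response(body: dict) -> dict:
    """Return the body unchanged on success, raise on any non-zero error_code."""
    code = body.get("error_code")
    if code != 0:
        msg = body.get("error_msg", "no error_msg provided")
        raise FaceDetectionError(f"error_code={code}: {msg}")
    return body
```

A missing `error_code` is treated as a failure here, which is a conservative choice rather than documented behavior.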
Features
Dual Input Modes
The API supports two ways to provide image input:
- URL Mode: Provide a publicly accessible URL to an image or video
- Base64 Mode: Provide base64-encoded image data (with or without data URI prefix)
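For Base64 Mode, the image bytes have to be encoded before they are placed in the `img` parameter. A minimal sketch using Python's standard `base64` module follows; the helper name `encode_image_bytes` is hypothetical, and the optional data URI prefix follows the usual `data:<mime>;base64,` form.

```python
import base64
from typing import Optional


def encode_image_bytes(data: bytes, mime_type: Optional[str] = None) -> str:
    """Base64-encode raw image bytes, optionally with a data URI prefix."""
    encoded = base64.b64encode(data).decode("ascii")
    if mime_type:
        return f"data:{mime_type};base64,{encoded}"
    return encoded


# Stand-in bytes; in practice these would come from open(path, "rb").read().
plain = encode_image_bytes(b"abc")
with_prefix = encode_image_bytes(b"abc", mime_type="image/png")
```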
6-Point Facial Landmarks
The API detects 6 key facial landmarks for each face:
- Left Eye - Center point of the left eye
- Right Eye - Center point of the right eye
- Nose Tip - Tip of the nose
- Mouth Center - Center point of the mouth (X-axis midpoint between mouth corners)
- Left Mouth Corner - Left corner of the mouth
- Right Mouth Corner - Right corner of the mouth
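To make the six points easier to work with, they can be given names. The sketch below assumes the landmark array is ordered exactly as in the list above, which this page does not state explicitly; verify the order against a real response before relying on it. The coordinates are made up for illustration.

```python
# Assumed order: matches the list above (not confirmed by the API reference).
LANDMARK_NAMES = (
    "left_eye",
    "right_eye",
    "nose_tip",
    "mouth_center",
    "left_mouth_corner",
    "right_mouth_corner",
)


def name_landmarks(points):
    """Map one face's six [x, y] pairs to a dict keyed by landmark name."""
    if len(points) != len(LANDMARK_NAMES):
        raise ValueError(f"expected {len(LANDMARK_NAMES)} points, got {len(points)}")
    return {name: (x, y) for name, (x, y) in zip(LANDMARK_NAMES, points)}


# Illustrative values only.
face = [[120, 80], [180, 82], [150, 120], [150, 150], [128, 148], [172, 150]]
named = name_landmarks(face)

# Mouth center X is described as the midpoint between the mouth corners.
corner_mid_x = (named["left_mouth_corner"][0] + named["right_mouth_corner"][0]) / 2
```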
Cropped Face Images
When `return_face_url=true`, the API returns:
- face_urls: URLs to cropped face images stored in cloud storage
- crop_region: The region coordinates used for cropping
- crop_landmarks: Landmarks relative to the cropped image
Single Face Mode
When `single_face=true`, the API returns only the largest face (by area) in each frame. This is useful for:
- Portrait photos where you only care about the main subject
- ID photos with a single person
- Reducing response size when multiple faces are detected
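The "largest face by area" rule can also be applied on the client side to a multi-face response. The following is a sketch of that selection, assuming each bounding box has the `[x, y, width, height]` format described for the `region` field; it illustrates the rule and is not the API's internal implementation.

```python
from typing import List, Optional


def largest_face_index(regions: List[List[float]]) -> Optional[int]:
    """Return the index of the box with the largest width * height, or None."""
    if not regions:
        return None
    return max(range(len(regions)), key=lambda i: regions[i][2] * regions[i][3])


# Illustrative boxes: [x, y, width, height]
regions = [[10, 10, 50, 60], [200, 40, 120, 130], [400, 80, 30, 30]]
main_face = largest_face_index(regions)
```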
Face Tracking for Videos
For video content, the API provides advanced face tracking:
- Persistent Face IDs - Tracks the same face across multiple frames
- Removed Faces - Identifies faces that were present in previous frames but are no longer visible
- Frame Timing - Provides timestamp information for each frame
Auto Media Type Detection
The API automatically detects whether the input is an image or video based on:
- File extension (`.jpg`, `.png`, `.mp4`, `.mov`, etc.)
- Content-Type header from the URL
- Fallback to content analysis if needed
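A client can mirror the first two checks to predict how a URL will be classified, for example to decide whether to send `num_frames`. The sketch below is an assumption about how such a check might look, not the server's actual logic. The extension sets come from the supported formats listed on this page, and the content-analysis fallback is not reproduced.

```python
import os
from typing import Optional
from urllib.parse import urlparse

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".webm"}


def guess_media_type(url: str, content_type: Optional[str] = None) -> Optional[str]:
    """Return "image", "video", or None if neither check is conclusive."""
    # 1. File extension, ignoring the query string and letter case.
    ext = os.path.splitext(urlparse(url).path)[1].lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    # 2. Content-Type header, e.g. "video/mp4" or "image/png; charset=binary".
    if content_type:
        main_type = content_type.split("/", 1)[0].strip().lower()
        if main_type in ("image", "video"):
            return main_type
    return None
```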
Best Practices
Image Requirements
- Quality: Use high-resolution images for better detection accuracy
- Face Visibility: Ensure faces are clearly visible and not obscured
- Lighting: Well-lit images produce better detection results
- Angle: Frontal or slight angle faces work best (±45 degrees)
- Size: Face size should be at least 80x80 pixels
Video Requirements
- Duration: Shorter videos process faster
- Frame Rate: Standard frame rates (24-30 fps) are optimal
- Resolution: 720p or higher recommended for best results
- Face Count: API can detect multiple faces per frame
- Encoding: Use standard encoding formats (H.264 recommended)
API Usage Tips
- Parameter Usage:
  - For Images: Only the `url` or `img` parameter is required. The `num_frames` parameter is NOT needed.
  - For Videos: Both `url` and `num_frames` parameters are recommended.
- Frame Selection (for videos only):
  - Short videos (< 10s): 5-10 frames
  - Medium videos (10-30s): 10-20 frames
  - Long videos (> 30s): 20-50 frames
- URL Accessibility: Ensure the media URL is publicly accessible
- Supported Formats:
  - Images: JPG, JPEG, PNG, BMP, WEBP
  - Videos: MP4, MOV, AVI, WEBM
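The frame selection guidance above can be turned into a small lookup. This sketch returns the suggested `num_frames` range for a given duration; the function name is hypothetical, and treating exactly 10 and 30 seconds as "medium" is an assumption about the boundary cases.

```python
from typing import Tuple


def suggested_frame_range(duration_seconds: float) -> Tuple[int, int]:
    """Return the (min, max) num_frames suggested for a video of this length."""
    if duration_seconds < 10:
        return (5, 10)    # short videos
    if duration_seconds <= 30:
        return (10, 20)   # medium videos
    return (20, 50)       # long videos
```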
Understanding the Response
Response Structure
Field Descriptions
- error_code: Status code (0 = success)
- error_msg: Status message or error description
- faces_obj: Dictionary keyed by frame index (as string)
  - landmarks: Array of 6-point landmarks for each detected face
    - Format: `[[x1, y1], [x2, y2], [x3, y3], [x4, y4], [x5, y5], [x6, y6]]`
  - landmarks_str: String format of first 4 landmarks for Face Swap API compatibility
    - Format: `"x1,y1:x2,y2:x3,y3:x4,y4"`
  - region: Bounding boxes for each detected face
    - Format: `[x, y, width, height]` where (x, y) is the top-left corner
  - removed: Bounding boxes of faces no longer visible (video only)
  - frame_time: Timestamp in seconds for this frame (video only, null for images)
  - face_urls: Cropped face image URLs (only when `return_face_url=true`)
  - crop_region: Cropping region in original image (only when `return_face_url=true`)
  - crop_landmarks: Landmarks relative to cropped image (only when `return_face_url=true`)
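The sketch below walks a parsed response using the fields described above and rebuilds the `landmarks_str` format from the first four landmarks. The sample dictionary is made up for illustration and is not captured API output. It assumes `landmarks` and `region` are parallel lists with one entry per detected face.

```python
def to_landmarks_str(landmarks):
    """Format the first 4 landmarks as "x1,y1:x2,y2:x3,y3:x4,y4"."""
    return ":".join(f"{x},{y}" for x, y in landmarks[:4])


def iter_faces(response):
    """Yield (frame_index, region, landmarks) for every face, in frame order."""
    faces_obj = response.get("faces_obj", {})
    # Keys are frame indices stored as strings, so sort them numerically.
    for key in sorted(faces_obj, key=int):
        frame = faces_obj[key]
        for landmarks, region in zip(frame.get("landmarks", []), frame.get("region", [])):
            yield int(key), region, landmarks


# Illustrative response for a single image with one face (values are invented).
sample = {
    "error_code": 0,
    "faces_obj": {
        "0": {
            "landmarks": [
                [[120, 80], [180, 82], [150, 120], [150, 150], [128, 148], [172, 150]]
            ],
            "region": [[90, 40, 120, 150]],
            "frame_time": None,
        }
    },
}

faces = list(iter_faces(sample))
first_str = to_landmarks_str(faces[0][2])
```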
Common Use Cases
Face Detection for Image Processing
For images, the `num_frames` parameter is not needed and will be ignored.
Get Cropped Face Images for Face Swap
Face Tracking in Video Content
Single Face Detection
Multiple Face Detection
The API automatically detects all faces in an image or video frame. No special configuration needed.
Integration with Face Swap
- Use Face Detection API to get face landmarks and optionally cropped face URLs
- Pass the `landmarks_str` value to Face Swap API as the `opts` parameter
- When using `return_face_url=true`, use `crop_landmarks` for the cropped face image
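The steps above can be sketched as a helper that picks the `opts` value for one face in one frame. Several things here are assumptions rather than documented behavior: that `landmarks_str`, `face_urls`, and `crop_landmarks` hold one entry per face, that `opts` for a cropped face is built from the first four `crop_landmarks` in the same `x,y:x,y` format, and the helper name `swap_inputs` itself. The Face Swap request is not shown because its format is outside this page.

```python
def swap_inputs(frame: dict, face_index: int = 0):
    """Return (cropped_face_url_or_None, opts) for one face in a frame entry."""
    face_urls = frame.get("face_urls")
    if face_urls:
        # Cropped image: landmarks must be relative to the crop, not the original.
        points = frame["crop_landmarks"][face_index]
        opts = ":".join(f"{x},{y}" for x, y in points[:4])
        return face_urls[face_index], opts
    landmarks_str = frame["landmarks_str"]
    # Tolerate either a single string or a per-face list (structure is assumed).
    if isinstance(landmarks_str, str):
        return None, landmarks_str
    return None, landmarks_str[face_index]


# Invented frame entries for illustration.
frame_plain = {"landmarks_str": ["120,80:180,82:150,120:150,150"]}
frame_crop = {
    "landmarks_str": ["120,80:180,82:150,120:150,150"],
    "face_urls": ["https://example.com/face0.jpg"],
    "crop_landmarks": [
        [[40, 50], [100, 52], [70, 90], [70, 120], [48, 118], [92, 120]]
    ],
}
```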
Error Handling
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| "Either 'url' or 'img' parameter must be provided" | Missing input | Provide either `url` or `img` parameter |
| "Invalid URL format" | Malformed URL provided | Ensure URL is properly formatted with protocol (http/https) |
| "Failed to download media" | URL inaccessible or invalid | Verify URL is publicly accessible |
| "No faces detected" | No faces found in media | Check image quality and face visibility |
| "Failed to process media" | Media format not supported | Use supported formats (JPG, PNG, MP4, etc.) |
| "Media type detection failed" | Unable to determine media type | Ensure file has proper extension or content-type |
Handling Failed Requests
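One way to handle a failed request is to map `error_msg` back to the suggested solution from the Common Errors table. The sketch below does this with case-insensitive substring matching, which is an assumption: the exact wording and quoting of real error messages may differ, so unmatched messages fall back to a generic hint. The function name is hypothetical.

```python
# Message fragments and solutions taken from the Common Errors table.
KNOWN_ERRORS = {
    "parameter must be provided": "Provide either url or img parameter",
    "Invalid URL format": "Ensure URL is properly formatted with protocol (http/https)",
    "Failed to download media": "Verify URL is publicly accessible",
    "No faces detected": "Check image quality and face visibility",
    "Failed to process media": "Use supported formats (JPG, PNG, MP4, etc.)",
    "Media type detection failed": "Ensure file has proper extension or content-type",
}


def suggest_fix(response: dict) -> str:
    """Return a suggested fix for a failed response, or "" on success."""
    if response.get("error_code") == 0:
        return ""
    error_msg = str(response.get("error_msg", ""))
    for fragment, solution in KNOWN_ERRORS.items():
        if fragment.lower() in error_msg.lower():
            return solution
    return "Check error_msg for details"
```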
Performance Considerations
Processing Time
- Images: Typically < 1 second
- Videos: Varies based on:
  - Number of frames requested
  - Video resolution
  - Number of faces per frame
Rate Limits
Optimization Tips
- Use appropriate `num_frames` value - more frames = longer processing time
- Use `single_face=true` when you only need one face
- Cache results when processing the same media multiple times
- Process videos in batches if analyzing many videos
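The caching tip can be sketched as a wrapper around whatever function performs the request. Everything here is illustrative: `make_cached_detector` is a hypothetical helper, `fake_detect` stands in for a real API call, and the in-memory dictionary would be replaced by a persistent store in a long-running service.

```python
from typing import Callable, Dict, Tuple


def make_cached_detector(detect: Callable[..., dict]) -> Callable[..., dict]:
    """Wrap a detect function so identical (url, params) calls hit the API once."""
    cache: Dict[Tuple, dict] = {}

    def cached(url: str, **params) -> dict:
        # Sort params so keyword order does not change the cache key.
        key = (url, tuple(sorted(params.items())))
        if key not in cache:
            cache[key] = detect(url, **params)
        return cache[key]

    return cached


# Stand-in for a real request function; it records each call it receives.
calls = []


def fake_detect(url: str, **params) -> dict:
    calls.append((url, params))
    return {"error_code": 0, "faces_obj": {}}


detect_cached = make_cached_detector(fake_detect)
detect_cached("https://example.com/clip.mp4", num_frames=10)
detect_cached("https://example.com/clip.mp4", num_frames=10)  # served from cache
detect_cached("https://example.com/clip.mp4", num_frames=20)  # different params
```

Including the parameters in the cache key matters because the same media analyzed with a different `num_frames` or `single_face` value yields a different response.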