# Face Detection API Overview

> Comprehensive guide to the Face Detection API

<Info>
  Detect faces in images and videos with high accuracy. Get bounding boxes, 6-point landmarks, cropped face images, and face tracking for video content.
</Info>

## API Endpoints

### Face Detection Operations

* [Detect Faces](/ai-tools-suite/face-detection/detect-faces) - Unified endpoint for face detection in images and videos (auto-detects media type)
* [Analyze Frames](/ai-tools-suite/face-detection/analyze-frames) - Multi-frame face analysis with person deduplication for face swap preparation

## Getting Started

### Basic Workflow

1. **For Image Face Detection**:
   * Call the [Detect Faces API](/ai-tools-suite/face-detection/detect-faces) with an image URL or base64-encoded image
   * Only the `url` or `img` parameter is required (no need for `num_frames`)
   * Receive bounding boxes and 6-point landmarks for all detected faces
   * Optionally get cropped face image URLs with `return_face_url=true`
   * Use the landmark data for downstream tasks (e.g., face swap, face recognition)

2. **For Video Face Detection**:
   * Call the [Detect Faces API](/ai-tools-suite/face-detection/detect-faces) with a video URL
   * Specify the `num_frames` parameter to control how many frames to analyze (default: 5)
   * Get face tracking data across frames, including the positions of faces that are no longer visible

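The two workflows differ only in the request body. The following is a minimal sketch of building that body; `build_detect_payload` is a hypothetical helper (not part of the API), and rejecting requests that set both `url` and `img` is a client-side assumption rather than documented API behavior.

```python
from typing import Any, Dict, Optional


def build_detect_payload(
    url: Optional[str] = None,
    img: Optional[str] = None,
    num_frames: Optional[int] = None,
    return_face_url: bool = False,
) -> Dict[str, Any]:
    """Build a Detect Faces request body (hypothetical helper)."""
    # Assumption: send exactly one input source per request.
    if (url is None) == (img is None):
        raise ValueError("Provide exactly one of 'url' or 'img'")
    payload: Dict[str, Any] = {"url": url} if url is not None else {"img": img}
    # num_frames only matters for videos; omit it for images.
    if num_frames is not None:
        payload["num_frames"] = num_frames
    if return_face_url:
        payload["return_face_url"] = True
    return payload


image_payload = build_detect_payload(url="https://example.com/image.jpg")
video_payload = build_detect_payload(url="https://example.com/video.mp4", num_frames=5)
```
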
### Response Code Description

<Note>
  Error code 0 indicates success. Any non-zero error code indicates a failure.
  Check the error\_msg field for detailed error information.
</Note>

| Code | Description                          |
| ---- | ------------------------------------ |
| 0    | Success                              |
| 1    | Error - Check error\_msg for details |

## Features

### Dual Input Modes

The API accepts media input in two ways:

1. **URL Mode**: Provide a publicly accessible URL to an image or video
2. **Base64 Mode**: Provide base64-encoded image data (with or without data URI prefix)

```json theme={null}
// URL mode
{ "url": "https://example.com/image.jpg" }

// Base64 mode
{ "img": "data:image/jpeg;base64,/9j/4AAQSkZJRg..." }
```

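For Base64 mode, the `img` value can be produced from raw image bytes with the standard library. This is a sketch only; `to_img_param` is a hypothetical helper name, and the four sample bytes stand in for a real JPEG file.

```python
import base64


def to_img_param(image_bytes: bytes, mime_type: str = "image/jpeg", with_prefix: bool = True) -> str:
    """Encode raw image bytes for the `img` parameter (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # The data URI prefix is optional according to the description above.
    return f"data:{mime_type};base64,{encoded}" if with_prefix else encoded


# JPEG data starts with the bytes FF D8 FF, which encode to "/9j/".
payload = {"img": to_img_param(b"\xff\xd8\xff\xe0")}
```

In practice the bytes would come from a file, for example `open("photo.jpg", "rb").read()`.
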
### 6-Point Facial Landmarks

The API detects 6 key facial landmarks for each face:

1. **Left Eye** - Center point of the left eye
2. **Right Eye** - Center point of the right eye
3. **Nose Tip** - Tip of the nose
4. **Mouth Center** - Center point of the mouth (X-axis midpoint between mouth corners)
5. **Left Mouth Corner** - Left corner of the mouth
6. **Right Mouth Corner** - Right corner of the mouth

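To make landmark arrays easier to work with, each point can be given a name. The sketch below assumes the points arrive in the order listed above; the key names and the `label_landmarks` helper are illustrative, not part of the API.

```python
from typing import Dict, List, Sequence

# Assumption: the response lists points in the same order as the list above.
LANDMARK_NAMES = [
    "left_eye",
    "right_eye",
    "nose_tip",
    "mouth_center",
    "left_mouth_corner",
    "right_mouth_corner",
]


def label_landmarks(points: Sequence[Sequence[float]]) -> Dict[str, List[float]]:
    """Map a 6-point landmark array to named points (hypothetical helper)."""
    if len(points) != len(LANDMARK_NAMES):
        raise ValueError(f"Expected {len(LANDMARK_NAMES)} points, got {len(points)}")
    return {name: list(point) for name, point in zip(LANDMARK_NAMES, points)}


face = label_landmarks([[100, 120], [150, 120], [125, 150], [125, 180], [110, 180], [140, 180]])
# The mouth center's x value is the midpoint of the two mouth corners' x values.
corner_midpoint_x = (face["left_mouth_corner"][0] + face["right_mouth_corner"][0]) / 2
```
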
### Cropped Face Images

When `return_face_url=true`, the API returns:

* **face\_urls**: URLs to cropped face images stored in cloud storage
* **crop\_region**: The region coordinates used for cropping
* **crop\_landmarks**: Landmarks relative to the cropped image

This is particularly useful for face swap operations where you need both the face image and its landmarks.

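A possible way to consume these fields is to pair each cropped face URL with its region and landmarks. This sketch assumes the three fields are parallel lists with one entry per face, which is not stated explicitly here; `cropped_faces` is a hypothetical helper.

```python
from typing import Any, Dict, List


def cropped_faces(frame: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each cropped face URL with its crop region and landmarks."""
    # The fields are null unless return_face_url=true, so fall back to [].
    urls = frame.get("face_urls") or []
    regions = frame.get("crop_region") or []
    landmarks = frame.get("crop_landmarks") or []
    # Assumption: the three lists are index-aligned, one entry per face.
    return [
        {"face_url": url, "crop_region": region, "crop_landmarks": points}
        for url, region, points in zip(urls, regions, landmarks)
    ]


# A frame from a request without return_face_url yields nothing to pair.
no_crops = cropped_faces({"face_urls": None, "crop_region": None, "crop_landmarks": None})
```
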
### Single Face Mode

When `single_face=true`, the API returns only the largest face (by area) in each frame. This is useful for:

* Portrait photos where you only care about the main subject
* ID photos with a single person
* Reducing response size when multiple faces are detected

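"Largest by area" can also be reproduced on the client from a multi-face response. The sketch below uses the `[x, y, width, height]` region format documented on this page; `largest_face_index` is a hypothetical helper and may not match the server's tie-breaking.

```python
from typing import List, Optional, Sequence


def largest_face_index(regions: List[Sequence[float]]) -> Optional[int]:
    """Return the index of the face with the largest bounding-box area."""
    if not regions:
        return None
    # Each region is [x, y, width, height]; area is width * height.
    return max(range(len(regions)), key=lambda i: regions[i][2] * regions[i][3])


main_face = largest_face_index([[80, 100, 100, 120], [10, 10, 200, 200], [5, 5, 50, 50]])
```
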
### Face Tracking for Videos

For video content, the API provides advanced face tracking:

* **Persistent Face IDs** - Tracks the same face across multiple frames
* **Removed Faces** - Identifies faces that were present in previous frames but are no longer visible
* **Frame Timing** - Provides timestamp information for each frame

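A video response can be summarized frame by frame. This sketch relies only on the `region`, `removed`, and `frame_time` fields from the response structure documented later on this page; `summarize_frames` is a hypothetical helper, and the frame keys are sorted numerically because they are strings.

```python
from typing import Any, Dict, List


def summarize_frames(faces_obj: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """List visible and removed face counts per frame, in frame order."""
    summary = []
    # Keys are frame indices as strings, so sort by their integer value.
    for key in sorted(faces_obj, key=int):
        frame = faces_obj[key]
        summary.append(
            {
                "frame": int(key),
                "time": frame.get("frame_time"),
                "faces": len(frame.get("region") or []),
                "removed": len(frame.get("removed") or []),
            }
        )
    return summary
```
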
### Auto Media Type Detection

The API automatically detects whether the input is an image or video based on:

* File extension (`.jpg`, `.png`, `.mp4`, `.mov`, etc.)
* Content-Type header from the URL
* Fallback to content analysis if needed

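A client can apply the same extension check before sending a request, for example to decide whether to include `num_frames`. This is a rough client-side approximation of the first step only, not the server's actual logic; the extension sets come from the supported formats listed on this page, and `guess_media_type` is a hypothetical helper.

```python
import os
from typing import Optional
from urllib.parse import urlparse

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".webm"}


def guess_media_type(url: str) -> Optional[str]:
    """Guess 'image' or 'video' from the URL path's file extension."""
    # Use only the path so query strings such as ?token=... are ignored.
    ext = os.path.splitext(urlparse(url).path)[1].lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    return None  # Unknown: the API falls back to other detection methods.
```
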
## Best Practices

### Image Requirements

* **Quality**: Use high-resolution images for better detection accuracy
* **Face Visibility**: Ensure faces are clearly visible and not obscured
* **Lighting**: Well-lit images produce better detection results
* **Angle**: Frontal or slightly angled faces work best (within ±45 degrees)
* **Size**: Face size should be at least 80x80 pixels

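The minimum face size can be checked against the returned bounding boxes. The sketch below flags faces smaller than 80x80 pixels using the `[x, y, width, height]` region format; `faces_meeting_min_size` is a hypothetical helper.

```python
from typing import List, Sequence


def faces_meeting_min_size(regions: List[Sequence[float]], min_size: int = 80) -> List[int]:
    """Return indices of faces whose width and height are both >= min_size."""
    return [
        index
        for index, (_x, _y, width, height) in enumerate(regions)
        if width >= min_size and height >= min_size
    ]


usable = faces_meeting_min_size([[80, 100, 100, 120], [0, 0, 60, 90]])
```
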
### Video Requirements

* **Duration**: Shorter videos process faster
* **Frame Rate**: Standard frame rates (24-30 fps) are optimal
* **Resolution**: 720p or higher recommended for best results
* **Face Count**: The API can detect multiple faces per frame
* **Encoding**: Use standard encoding formats (H.264 recommended)

### API Usage Tips

* **Parameter Usage**:
  * **For Images**: Only the `url` or `img` parameter is required. The `num_frames` parameter is NOT needed.
  * **For Videos**: `url` is required; `num_frames` is recommended (default: 5).
* **Frame Selection** (for videos only):
  * Short videos (\< 10s): 5-10 frames
  * Medium videos (10-30s): 10-20 frames
  * Long videos (> 30s): 20-50 frames
* **URL Accessibility**: Ensure the media URL is publicly accessible
* **Supported Formats**:
  * Images: JPG, JPEG, PNG, BMP, WEBP
  * Videos: MP4, MOV, AVI, WEBM

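The frame selection guidance can be encoded as a small lookup. In this sketch, `suggested_num_frames` is a hypothetical helper, and treating exactly 10s and 30s as "medium" is an assumption about how the boundaries are meant to be read.

```python
from typing import Tuple


def suggested_num_frames(duration_seconds: float) -> Tuple[int, int]:
    """Return the suggested (min, max) num_frames range for a video length."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    if duration_seconds < 10:
        return (5, 10)    # Short videos
    if duration_seconds <= 30:
        return (10, 20)   # Medium videos (boundaries assumed inclusive)
    return (20, 50)       # Long videos


low, high = suggested_num_frames(12.5)
```
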
## Understanding the Response

### Response Structure

```json theme={null}
{
  "error_code": 0,
  "error_msg": "SUCCESS",
  "faces_obj": {
    "0": {
      "landmarks": [
        [[100, 120], [150, 120], [125, 150], [125, 180], [110, 180], [140, 180]]
      ],
      "landmarks_str": [
        "100,120:150,120:125,150:125,180"
      ],
      "region": [[80, 100, 100, 120]],
      "removed": [],
      "frame_time": null,
      "face_urls": null,
      "crop_region": null,
      "crop_landmarks": null
    }
  }
}
```

### Field Descriptions

* **error\_code**: Status code (0 = success)
* **error\_msg**: Status message or error description
* **faces\_obj**: Dictionary keyed by frame index (as string)
  * **landmarks**: Array of 6-point landmarks for each detected face
    * Format: `[[x1, y1], [x2, y2], [x3, y3], [x4, y4], [x5, y5], [x6, y6]]`
  * **landmarks\_str**: String encoding of the first 4 landmarks, for Face Swap API compatibility
    * Format: `"x1,y1:x2,y2:x3,y3:x4,y4"`
  * **region**: Bounding boxes for each detected face
    * Format: `[x, y, width, height]` where (x, y) is the top-left corner
  * **removed**: Bounding boxes of faces no longer visible (video only)
  * **frame\_time**: Timestamp in seconds for this frame (video only, null for images)
  * **face\_urls**: Cropped face image URLs (only when `return_face_url=true`)
  * **crop\_region**: Cropping region in original image (only when `return_face_url=true`)
  * **crop\_landmarks**: Landmarks relative to cropped image (only when `return_face_url=true`)

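Two small conversions illustrate these formats: turning a `region` into corner coordinates, and rebuilding the `landmarks_str` encoding from a `landmarks` entry. Both helper names are hypothetical; the sample values are taken from the response structure above.

```python
from typing import Sequence, Tuple


def region_to_corners(region: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert [x, y, width, height] to (left, top, right, bottom)."""
    x, y, width, height = region
    return (x, y, x + width, y + height)


def landmarks_to_str(landmarks: Sequence[Sequence[float]]) -> str:
    """Encode the first 4 landmarks as "x1,y1:x2,y2:x3,y3:x4,y4"."""
    return ":".join(f"{x},{y}" for x, y in landmarks[:4])


corners = region_to_corners([80, 100, 100, 120])
encoded = landmarks_to_str([[100, 120], [150, 120], [125, 150], [125, 180], [110, 180], [140, 180]])
```

With integer coordinates, `encoded` matches the `landmarks_str` value in the sample response.
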
## Common Use Cases

### Face Detection for Image Processing

```json theme={null}
{
  "url": "https://example.com/portrait.jpg"
}
```

**Use case**: Detect faces in a portrait photo for face alignment, face recognition, or face swap preprocessing.

<Note>
  For images, the `num_frames` parameter is not needed and will be ignored.
</Note>

### Get Cropped Face Images for Face Swap

```json theme={null}
{
  "url": "https://example.com/photo.jpg",
  "return_face_url": true
}
```

**Use case**: Get cropped face images with their landmarks for direct use in the Face Swap API.

### Face Tracking in Video Content

```json theme={null}
{
  "url": "https://example.com/video.mp4",
  "num_frames": 15
}
```

**Use case**: Track faces across video frames for video editing, face swap in videos, or facial animation.

### Single Face Detection

```json theme={null}
{
  "url": "https://example.com/group_photo.jpg",
  "single_face": true
}
```

**Use case**: Get only the main/largest face from a group photo.

### Multiple Face Detection

The API automatically detects all faces in an image or video frame. No special configuration needed.

### Integration with Face Swap

1. Use the Face Detection API to get face landmarks and, optionally, cropped face URLs
2. Pass the `landmarks_str` value to the Face Swap API as the `opts` parameter
3. When using `return_face_url=true`, use `crop_landmarks` for the cropped face image

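Step 2 can be sketched as a lookup into the detection result. `opts_for_face` is a hypothetical helper; it only extracts the value to pass as `opts` and does not show the Face Swap request itself, which is documented separately.

```python
from typing import Any, Dict


def opts_for_face(result: Dict[str, Any], frame: str = "0", face_index: int = 0) -> str:
    """Return the landmarks_str of one detected face, for use as `opts`."""
    if result.get("error_code") != 0:
        raise RuntimeError(f"Detection failed: {result.get('error_msg')}")
    # faces_obj is keyed by frame index as a string; images use frame "0".
    return result["faces_obj"][frame]["landmarks_str"][face_index]


detection = {
    "error_code": 0,
    "error_msg": "SUCCESS",
    "faces_obj": {"0": {"landmarks_str": ["100,120:150,120:125,150:125,180"]}},
}
opts = opts_for_face(detection)
```
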
## Error Handling

### Common Errors

| Error Message                                      | Cause                          | Solution                                                    |
| -------------------------------------------------- | ------------------------------ | ----------------------------------------------------------- |
| "Either 'url' or 'img' parameter must be provided" | Missing input                  | Provide either `url` or `img` parameter                     |
| "Invalid URL format"                               | Malformed URL provided         | Ensure URL is properly formatted with protocol (http/https) |
| "Failed to download media"                         | URL inaccessible or invalid    | Verify URL is publicly accessible                           |
| "No faces detected"                                | No faces found in media        | Check image quality and face visibility                     |
| "Failed to process media"                          | Media format not supported     | Use supported formats (JPG, PNG, MP4, etc.)                 |
| "Media type detection failed"                      | Unable to determine media type | Ensure file has proper extension or content-type            |

### Handling Failed Requests

```python theme={null}
# Example error handling in Python
import requests

# For image detection (no num_frames needed)
response = requests.post(
    "https://openapi.akool.com/interface/detect-api/detect_faces",
    json={"url": "https://example.com/image.jpg"},
    headers={"x-api-key": "YOUR_API_KEY"}
)

# For video detection (num_frames recommended)
# response = requests.post(
#     "https://openapi.akool.com/interface/detect-api/detect_faces",
#     json={"url": "https://example.com/video.mp4", "num_frames": 10},
#     headers={"x-api-key": "YOUR_API_KEY"}
# )

result = response.json()
if result["error_code"] != 0:
    print(f"Error: {result['error_msg']}")
else:
    faces = result["faces_obj"]
    print(f"Detected {len(faces['0']['landmarks'])} faces")
```

## Performance Considerations

### Processing Time

* **Images**: Typically \< 1 second
* **Videos**: Varies based on:
  * Number of frames requested
  * Video resolution
  * Number of faces per frame

### Rate Limits

<Warning>
  Rate limits apply to all API endpoints. Please refer to your account settings for specific limits.
</Warning>

### Optimization Tips

* Choose an appropriate `num_frames` value: more frames mean longer processing time
* Use `single_face=true` when you only need one face
* Cache results when processing the same media multiple times
* Process videos in batches if analyzing many videos

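Caching can be as simple as keying results by the request body. The sketch below wraps any function that performs the detection call; `make_cached_detector` is a hypothetical helper, and the in-memory dictionary is an assumption suitable only for a single process.

```python
import json
from typing import Any, Callable, Dict


def make_cached_detector(
    detect: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Wrap a detection function so identical payloads are requested once."""
    cache: Dict[str, Dict[str, Any]] = {}

    def cached_detect(payload: Dict[str, Any]) -> Dict[str, Any]:
        # Sort keys so equivalent payloads produce the same cache key.
        key = json.dumps(payload, sort_keys=True)
        if key not in cache:
            cache[key] = detect(payload)
        return cache[key]

    return cached_detect
```

Here `detect` would be a function that posts the payload to the Detect Faces endpoint and returns the parsed JSON. Failed responses are cached too in this sketch, so a real implementation may want to store only results with `error_code` 0.
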
## Support

For additional help and examples, check out:

* [Authentication Guide](/authentication/usage)
* [API Error Codes](/ai-tools-suite/error-code)
* [Face Swap Integration](/ai-tools-suite/faceswap)

Need help? Contact us at [info@akool.com](mailto:info@akool.com)
