# Detect Faces

> Unified endpoint that detects faces in an image or video supplied by URL, or in a base64-encoded image

<Info>
  Detect faces in images and videos with 6-point landmarks. Supports URL and base64 input, with optional cropped face URLs.
</Info>

## Request Parameters

| Parameter         | Type    | Required | Default | Description                                                      |
| ----------------- | ------- | -------- | ------- | ---------------------------------------------------------------- |
| `url`             | string  | Yes\*    | -       | Publicly accessible image or video URL                           |
| `img`             | string  | Yes\*    | -       | Base64-encoded image data (plain string or data URI)             |
| `num_frames`      | integer | No       | 5       | Number of video frames to analyze, 1 to 100 (ignored for images) |
| `return_face_url` | boolean | No       | false   | Return cropped face image URLs                                   |
| `single_face`     | boolean | No       | false   | Return only the largest face (by area) in each frame             |

<Note>
  Either `url` or `img` must be provided. If both are provided, `url` takes priority.
</Note>
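
A minimal sketch of choosing between the two input modes when building the request body. The helper name `build_payload` is hypothetical and not part of the API; it uses only the parameters from the table above and sends either `url` or `img`, so the `url`-takes-priority rule never comes into play.

```python
import base64


def build_payload(url=None, image_bytes=None, **options):
    """Build a detect_faces request body from a URL or raw image bytes.

    `build_payload` is a hypothetical helper, not part of the API.
    """
    if url is None and image_bytes is None:
        raise ValueError("Either 'url' or 'img' must be provided")
    if url is not None:
        # `url` takes priority on the server, so send it alone.
        payload = {"url": url}
    else:
        # Plain base64 string; the schema also accepts a data URI.
        payload = {"img": base64.b64encode(image_bytes).decode("ascii")}
    payload.update(options)  # e.g. num_frames, return_face_url, single_face
    return payload


print(build_payload(url="https://example.com/photo.jpg", single_face=True))
```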

## Response Format

```json theme={null}
{
  "error_code": 0,
  "error_msg": "SUCCESS",
  "faces_obj": {
    "0": {
      "landmarks": [[[120, 85], [180, 88], [150, 130], [150, 165], [125, 165], [175, 168]]],
      "landmarks_str": ["120,85:180,88:150,130:150,165"],
      "region": [[80, 50, 150, 180]],
      "removed": [],
      "frame_time": null,
      "face_urls": null,
      "crop_region": null,
      "crop_landmarks": null
    }
  }
}
```

### Response Fields

| Field        | Type    | Description                                     |
| ------------ | ------- | ----------------------------------------------- |
| `error_code` | integer | 0 = success, 1 = error                          |
| `error_msg`  | string  | Status message                                  |
| `faces_obj`  | object  | Face data keyed by frame index ("0", "1", etc.) |

### Face Frame Data

| Field            | Type        | Description                                                                                                     |
| ---------------- | ----------- | --------------------------------------------------------------------------------------------------------------- |
| `landmarks`      | array       | 6-point landmarks for each face: Left Eye, Right Eye, Nose, Mouth Center, Left Mouth Corner, Right Mouth Corner |
| `landmarks_str`  | array       | First 4 landmarks as a string, `"x1,y1:x2,y2:x3,y3:x4,y4"`; use this as the Face Swap `opts` parameter          |
| `region`         | array       | Bounding box per face as `[x, y, width, height]`, where (x, y) is the top-left corner                           |
| `removed`        | array       | Bounding boxes of faces from earlier frames that are no longer present (video only)                             |
| `frame_time`     | number/null | Timestamp in seconds (video only, null for images)                                                              |
| `face_urls`      | array/null  | Cropped face URLs (when `return_face_url=true`)                                                                 |
| `crop_region`    | array/null  | Crop region in original image coordinates                                                                       |
| `crop_landmarks` | array/null  | Landmarks relative to cropped image                                                                             |
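
The per-face arrays are parallel: entry `i` of `landmarks` and entry `i` of `region` describe the same face. As a rough sketch of reading one frame, the snippet below names each landmark using the order given in the table and converts each bounding box to corner coordinates. `iter_faces` and `LANDMARK_NAMES` are hypothetical names introduced for illustration.

```python
LANDMARK_NAMES = [
    "left_eye", "right_eye", "nose",
    "mouth_center", "left_mouth_corner", "right_mouth_corner",
]


def iter_faces(frame_data):
    """Yield one dict per face from a single frame entry of `faces_obj`."""
    for points, box in zip(frame_data["landmarks"], frame_data["region"]):
        x, y, width, height = box
        yield {
            # Name each [x, y] point using the documented landmark order.
            "landmarks": dict(zip(LANDMARK_NAMES, (tuple(p) for p in points))),
            # Convert [x, y, width, height] to corner coordinates.
            "top_left": (x, y),
            "bottom_right": (x + width, y + height),
        }


frame = {
    "landmarks": [[[120, 85], [180, 88], [150, 130], [150, 165], [125, 165], [175, 168]]],
    "region": [[80, 50, 150, 180]],
}
faces = list(iter_faces(frame))
print(faces[0]["landmarks"]["nose"], faces[0]["bottom_right"])
```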

## Examples

### Example 1: Basic Image Detection

**Request:**

```json theme={null}
{
  "url": "https://example.com/photo.jpg"
}
```

**Response:**

```json theme={null}
{
  "error_code": 0,
  "error_msg": "SUCCESS",
  "faces_obj": {
    "0": {
      "landmarks": [[[320, 240], [420, 240], [370, 300], [370, 350], [340, 350], [400, 350]]],
      "landmarks_str": ["320,240:420,240:370,300:370,350"],
      "region": [[300, 200, 150, 180]],
      "removed": [],
      "frame_time": null
    }
  }
}
```

### Example 2: Video Detection

**Request:**

```json theme={null}
{
  "url": "https://example.com/video.mp4",
  "num_frames": 10
}
```
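
A video response contains one `faces_obj` entry per analyzed frame. As a rough sketch of reading such a result, the snippet below walks the frames in numeric order. The sample data is abridged from the `video_success` example in the OpenAPI spec on this page, and `frames_in_order` is a hypothetical helper name.

```python
def frames_in_order(faces_obj):
    """Return (frame_index, frame_data) pairs sorted by numeric frame index."""
    # Keys are strings ("0", "5", ...), so sort numerically: "10" must follow "5".
    return sorted(((int(k), v) for k, v in faces_obj.items()), key=lambda kv: kv[0])


faces_obj = {
    "5": {"region": [[305, 205, 150, 180]], "removed": [[300, 200, 150, 180]], "frame_time": 0.2},
    "0": {"region": [[300, 200, 150, 180]], "removed": [], "frame_time": 0},
}

for index, frame in frames_in_order(faces_obj):
    print(index, frame["frame_time"], len(frame["region"]), "face(s),",
          len(frame["removed"]), "removed")
```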

### Example 3: Get Cropped Face URLs

**Request:**

```json theme={null}
{
  "url": "https://example.com/photo.jpg",
  "return_face_url": true
}
```

**Response:**

```json theme={null}
{
  "error_code": 0,
  "error_msg": "SUCCESS",
  "faces_obj": {
    "0": {
      "landmarks": [[[120, 85], [180, 88], [150, 130], [150, 165], [125, 165], [175, 168]]],
      "landmarks_str": ["120,85:180,88:150,130:150,165"],
      "region": [[80, 50, 150, 180]],
      "removed": [],
      "frame_time": null,
      "face_urls": ["https://s3.example.com/faces/face_detect_0.jpg"],
      "crop_region": [[30, 0, 250, 280]],
      "crop_landmarks": ["90,85:150,88:120,130:120,165"]
    }
  }
}
```
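
In the response above, each `crop_landmarks` point equals the matching `landmarks_str` point shifted by the top-left corner of `crop_region` (for instance, 120 - 30 = 90). The sketch below reproduces that arithmetic. Whether the API always derives `crop_landmarks` this way is an assumption based on this one example, and `to_crop_coords` is a hypothetical helper.

```python
def to_crop_coords(landmarks_str, crop_region):
    """Shift an "x,y:x,y:..." landmark string into cropped-image coordinates."""
    crop_x, crop_y = crop_region[0], crop_region[1]
    shifted = []
    for point in landmarks_str.split(":"):
        x, y = (int(v) for v in point.split(","))
        shifted.append(f"{x - crop_x},{y - crop_y}")
    return ":".join(shifted)


result = to_crop_coords("120,85:180,88:150,130:150,165", [30, 0, 250, 280])
print(result)  # 90,85:150,88:120,130:120,165
```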

### Example 4: Single Face Mode

**Request:**

```json theme={null}
{
  "url": "https://example.com/group_photo.jpg",
  "single_face": true
}
```
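
The `single_face` schema description on this page says the largest face is chosen by area. For comparison, here is a client-side sketch of the same selection applied to a full response. `largest_face_index` is a hypothetical helper, and the sample regions are made up for illustration.

```python
def largest_face_index(regions):
    """Return the index of the largest [x, y, width, height] box by area."""
    if not regions:
        return None  # no faces detected in this frame
    return max(range(len(regions)), key=lambda i: regions[i][2] * regions[i][3])


# Illustrative bounding boxes for three faces in one frame.
regions = [[10, 10, 40, 50], [200, 80, 150, 180], [400, 90, 60, 70]]
print(largest_face_index(regions))  # 1
```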

## Error Responses

| error\_code | error\_msg                                       | Description     |
| ----------- | ------------------------------------------------ | --------------- |
| 0           | SUCCESS                                          | Success         |
| 1           | Either 'url' or 'img' parameter must be provided | Missing input   |
| 1           | Invalid URL format                               | Bad URL         |
| 1           | Failed to download media from URL                | Download failed |

## Integration with Face Swap

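Use `landmarks_str` directly as the `opts` parameter in the Face Swap API when you pass the original image. When you pass a cropped face URL from `face_urls`, use `crop_landmarks` instead, as in this example: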

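```python theme={null}
import requests

# Detect faces and request cropped face images
response = requests.post(
    "https://openapi.akool.com/interface/detect-api/detect_faces",
    json={"url": "https://example.com/target.jpg", "return_face_url": True},
    headers={"x-api-key": "YOUR_API_KEY"},
    timeout=30,  # avoid waiting indefinitely on a slow request
)

result = response.json()
if result["error_code"] != 0:
    raise RuntimeError(f"Face detection failed: {result['error_msg']}")

# For images, results are keyed under frame "0"
frame_data = result["faces_obj"].get("0")
if frame_data and frame_data.get("face_urls"):
    # Pair the first cropped face with landmarks relative to that crop
    target_image = {
        "path": frame_data["face_urls"][0],
        "opts": frame_data["crop_landmarks"][0],
    }
```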
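
When cropped faces are not requested, `face_urls` and `crop_landmarks` are `null`, so the original image URL is paired with `landmarks_str` instead. A sketch of picking the right pair follows; `face_swap_target` is a hypothetical helper, and the `path`/`opts` keys simply mirror the snippet above.

```python
def face_swap_target(frame_data, original_url, face_index=0):
    """Pick the image path and landmark string for one detected face."""
    if frame_data.get("face_urls"):
        # Cropped face: landmarks must be relative to the cropped image.
        return {
            "path": frame_data["face_urls"][face_index],
            "opts": frame_data["crop_landmarks"][face_index],
        }
    # Original image: landmarks_str is in original image coordinates.
    return {"path": original_url, "opts": frame_data["landmarks_str"][face_index]}


frame = {
    "landmarks_str": ["120,85:180,88:150,130:150,165"],
    "face_urls": None,
    "crop_landmarks": None,
}
print(face_swap_target(frame, "https://example.com/target.jpg"))
```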

## Best Practices

* **URL Requirements**: Use an HTTPS URL and make sure it is publicly accessible
* **Video `num_frames`**: Use 5-10 for short videos, 10-20 for medium-length videos, and 20-50 for long videos
* **Performance**: Set `single_face=true` when only one face is needed
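
Some of these checks can be applied client-side before sending a request. The following is a minimal sketch, assuming the 1 to 100 `num_frames` range from the OpenAPI schema on this page. `validate_request` is a hypothetical helper, and it cannot confirm that a URL is actually reachable.

```python
from urllib.parse import urlparse


def validate_request(payload):
    """Return a list of problems found in a detect_faces request body."""
    problems = []
    if "url" not in payload and "img" not in payload:
        problems.append("Either 'url' or 'img' parameter must be provided")
    if "url" in payload:
        parsed = urlparse(payload["url"])
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append("url should be an absolute HTTPS URL")
    num_frames = payload.get("num_frames", 5)  # documented default is 5
    if not 1 <= num_frames <= 100:
        problems.append("num_frames must be between 1 and 100")
    return problems


print(validate_request({"url": "http://example.com/video.mp4", "num_frames": 500}))
```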


## OpenAPI

````yaml POST /detect_faces
openapi: 3.0.3
info:
  title: Face Detection API
  description: >
    API for detecting faces in images and videos with bounding boxes and 6-point
    landmarks.


    This API provides:

    - Unified face detection for both images and videos

    - Face tracking across video frames

    - 6-point facial landmarks detection

    - Bounding box coordinates for each detected face

    - Cropped face image URLs with landmarks (optional)

    - Single face mode for returning only the largest face

    - Multi-frame face analysis with person deduplication
  version: 1.0.0
servers:
  - url: https://openapi.akool.com/interface/detect-api
    description: Face detection server
security:
  - ApiKeyAuth: []
  - BearerAuth: []
paths:
  /detect_faces:
    post:
      tags:
        - Face Detection
      summary: Detect Faces in Video or Image
      description: >
        Unified endpoint to detect faces in either video or image from URL or
        base64-encoded image data.


        This endpoint:

        1. Auto-detects media type (video/image) based on URL

        2. Downloads media from the provided URL asynchronously (or decodes
        base64 image)

        3. Processes media (extracts frames for video, loads image for image)

        4. Detects faces using InsightFace with face tracking for videos

        5. Returns bounding boxes and 6-point landmarks for each detected face

        6. For videos, tracks faces across frames and marks previous positions
        as removed

        7. Optionally returns cropped face image URLs (when
        return_face_url=true)

        8. Optionally returns only the largest face (when single_face=true)


        **Input Modes**:

        - URL mode: provide `url` parameter with image/video URL

        - Base64 mode: provide `img` parameter with base64-encoded image data

        - If both are provided, `url` takes priority
      operationId: detectFaces
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/UnifiedFaceDetectionRequest'
            examples:
              image_detection:
                summary: Image Face Detection (num_frames not required)
                value:
                  url: https://example.com/image.jpg
              image_base64:
                summary: Image Face Detection with Base64
                value:
                  img: data:image/jpeg;base64,/9j/4AAQSkZJRg...
              video_detection:
                summary: Video Face Detection (num_frames recommended)
                value:
                  url: https://example.com/video.mp4
                  num_frames: 10
              with_face_url:
                summary: Get Cropped Face Images
                value:
                  url: https://example.com/image.jpg
                  return_face_url: true
              single_face_mode:
                summary: Return Only Largest Face
                value:
                  url: https://example.com/group_photo.jpg
                  single_face: true
      responses:
        '200':
          description: Face detection completed successfully
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/FaceDetectionResponse'
              examples:
                image_success:
                  summary: Successful Image Detection
                  value:
                    error_code: 0
                    error_msg: SUCCESS
                    faces_obj:
                      '0':
                        landmarks:
                          - - - 120
                              - 85
                            - - 180
                              - 88
                            - - 150
                              - 130
                            - - 150
                              - 165
                            - - 125
                              - 165
                            - - 175
                              - 168
                        landmarks_str:
                          - 120,85:180,88:150,130:150,165
                        region:
                          - - 80
                            - 50
                            - 150
                            - 180
                        removed: []
                        frame_time: null
                        face_urls: null
                        crop_region: null
                        crop_landmarks: null
                video_success:
                  summary: Successful Video Detection
                  value:
                    error_code: 0
                    error_msg: SUCCESS
                    faces_obj:
                      '0':
                        landmarks:
                          - - - 320
                              - 240
                            - - 420
                              - 240
                            - - 370
                              - 300
                            - - 370
                              - 350
                            - - 340
                              - 350
                            - - 400
                              - 350
                        landmarks_str:
                          - 320,240:420,240:370,300:370,350
                        region:
                          - - 300
                            - 200
                            - 150
                            - 180
                        removed: []
                        frame_time: 0
                        face_urls: null
                        crop_region: null
                        crop_landmarks: null
                      '5':
                        landmarks:
                          - - - 325
                              - 245
                            - - 425
                              - 245
                            - - 375
                              - 305
                            - - 375
                              - 355
                            - - 345
                              - 355
                            - - 405
                              - 355
                        landmarks_str:
                          - 325,245:425,245:375,305:375,355
                        region:
                          - - 305
                            - 205
                            - 150
                            - 180
                        removed:
                          - - 300
                            - 200
                            - 150
                            - 180
                        frame_time: 0.2
                        face_urls: null
                        crop_region: null
                        crop_landmarks: null
                with_face_urls:
                  summary: Response with Cropped Face URLs
                  value:
                    error_code: 0
                    error_msg: SUCCESS
                    faces_obj:
                      '0':
                        landmarks:
                          - - - 120
                              - 85
                            - - 180
                              - 88
                            - - 150
                              - 130
                            - - 150
                              - 165
                            - - 125
                              - 165
                            - - 175
                              - 168
                        landmarks_str:
                          - 120,85:180,88:150,130:150,165
                        region:
                          - - 80
                            - 50
                            - 150
                            - 180
                        removed: []
                        frame_time: null
                        face_urls:
                          - https://s3.example.com/faces/face_detect_0.jpg
                        crop_region:
                          - - 30
                            - 0
                            - 250
                            - 280
                        crop_landmarks:
                          - 90,85:150,88:120,130:120,165
        '400':
          description: Bad request - Invalid input parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/FaceDetectionResponse'
              example:
                error_code: 1
                error_msg: Either 'url' or 'img' parameter must be provided
                faces_obj: {}
        '500':
          description: Internal server error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/FaceDetectionResponse'
              example:
                error_code: 1
                error_msg: Failed to process media
                faces_obj: {}
components:
  schemas:
    UnifiedFaceDetectionRequest:
      type: object
      properties:
        url:
          type: string
          format: uri
          description: >
            URL of the video or image to process. The media type will be
            auto-detected based on the file extension.

            Either `url` or `img` must be provided. If both are provided, `url`
            takes priority.
          example: https://example.com/media.mp4
        img:
          type: string
          description: >
            Base64-encoded image data. Supports both plain base64 string and
            data URI format (e.g., "data:image/jpeg;base64,...").

            Either `url` or `img` must be provided. If both are provided, `url`
            takes priority.
          example: data:image/jpeg;base64,/9j/4AAQSkZJRg...
        num_frames:
          type: integer
          minimum: 1
          maximum: 100
          default: 5
          description: >-
            Number of frames to extract and analyze (only used for videos,
            ignored for images)
          example: 5
        return_face_url:
          type: boolean
          default: false
          description: >
            Whether to return cropped face image URLs. When set to `true`, the
            response will include:

            - `face_urls`: URLs of cropped face images

            - `crop_region`: The region used for cropping in original image
            coordinates

            - `crop_landmarks`: Landmarks relative to the cropped image
          example: false
        single_face:
          type: boolean
          default: false
          description: >
            When set to `true`, only returns the largest face (by area) in each
            frame.

            Useful when you only need the main/primary face in the image or
            video.
          example: false
    FaceDetectionResponse:
      type: object
      required:
        - error_code
        - error_msg
        - faces_obj
      properties:
        error_code:
          type: integer
          description: 'Error code (0: success, 1: error)'
          example: 0
        error_msg:
          type: string
          description: Error message or success message
          example: SUCCESS
        faces_obj:
          type: object
          description: >
            Dictionary of face detection results keyed by frame index (as
            string).

            For images, only frame "0" will be present.

            For videos, multiple frames will be present (e.g., "0", "5", "10",
            etc.)
          additionalProperties:
            $ref: '#/components/schemas/FaceFrameData'
          example:
            '0':
              landmarks:
                - - - 120
                    - 85
                  - - 180
                    - 88
                  - - 150
                    - 130
                  - - 150
                    - 165
                  - - 125
                    - 165
                  - - 175
                    - 168
              landmarks_str:
                - 120,85:180,88:150,130:150,165
              region:
                - - 80
                  - 50
                  - 150
                  - 180
              removed: []
              frame_time: null
              face_urls: null
              crop_region: null
              crop_landmarks: null
    FaceFrameData:
      type: object
      required:
        - landmarks
        - region
        - removed
      properties:
        landmarks:
          type: array
          description: >
            List of 6-point facial landmarks for each detected face.

            Each face has 6 landmark points: Left Eye, Right Eye, Nose, Mouth
            Center, Left Mouth Corner, Right Mouth Corner.

            Format: [[[x1, y1], [x2, y2], [x3, y3], [x4, y4], [x5, y5], [x6,
            y6]], ...]
          items:
            type: array
            items:
              type: array
              items:
                type: integer
          example:
            - - - 120
                - 85
              - - 180
                - 88
              - - 150
                - 130
              - - 150
                - 165
              - - 125
                - 165
              - - 175
                - 168
        landmarks_str:
          type: array
          description: >
            String representation of the first 4 landmarks (Left Eye, Right Eye,
            Nose, Mouth Center).

            Format: ["x1,y1:x2,y2:x3,y3:x4,y4"]

            This format is directly compatible with the Face Swap API `opts`
            parameter.
          items:
            type: string
          example:
            - 120,85:180,88:150,130:150,165
        region:
          type: array
          description: |
            List of bounding boxes for each detected face.
            Format: [[x, y, width, height], ...]
            Where (x, y) is the top-left corner of the bounding box
          items:
            type: array
            items:
              type: integer
          example:
            - - 80
              - 50
              - 150
              - 180
        removed:
          type: array
          description: >
            List of face regions that were detected in previous frames but are
            no longer present (for video tracking).

            Format: [[x, y, width, height], ...]

            Empty array for images or first frame of videos
          items:
            type: array
            items:
              type: integer
          example: []
        frame_time:
          type: number
          format: float
          nullable: true
          description: Time in seconds for this frame in the video. Null for images.
          example: null
        face_urls:
          type: array
          nullable: true
          description: >
            URLs of cropped face images. Only returned when
            `return_face_url=true`.

            Each URL corresponds to a detected face in the same order as
            `region` and `landmarks`.
          items:
            type: string
            nullable: true
          example:
            - https://s3.example.com/faces/face_detect_0.jpg
        crop_region:
          type: array
          nullable: true
          description: >
            The cropped image region in original image coordinates. Only
            returned when `return_face_url=true`.

            Format: [[x, y, width, height], ...]

            This is the expanded region used for cropping (larger than `region`
            for better context).
          items:
            type: array
            items:
              type: integer
            nullable: true
          example:
            - - 30
              - 0
              - 250
              - 280
        crop_landmarks:
          type: array
          nullable: true
          description: >
            Landmarks relative to the cropped image. Only returned when
            `return_face_url=true`.

            Format: ["x1,y1:x2,y2:x3,y3:x4,y4", ...]

            Use this when working with the cropped face image.
          items:
            type: string
            nullable: true
          example:
            - 90,85:150,88:120,130:120,165
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: x-api-key
      description: >-
        Your API Key used for request authorization. If both Authorization and
        x-api-key have values, Authorization will be used first and x-api-key
        will be discarded.
    BearerAuth:
      type: http
      scheme: bearer
      description: >-
        Your API Key used for request authorization. Get Token from
        authentication/usage#get-the-token

````