Requesting Auto Captions

Introduction

Brightcove Auto Captioning is a platform-level service that allows you to automatically generate captions for new or existing videos in 31 different languages (provided you have an audio track for the language specified). Like all speech-to-text services, Auto Captioning is not 100% accurate, but it provides a quick and easy way to generate captions right in Video Cloud.

Video Cloud uses the following process to determine the source that will be used to generate the captions.

- If the video has a default audio track, that will be used as the captions source file (supported by default in the Media module).
- If the video has no default audio track but a master/mezzanine file exists, that will be used as the source file (supported by default in the Media module).
- If the video has no default audio track or master/mezzanine files, but audio tracks are specified in the Dynamic Ingest call, the specified audio track will be used (not yet supported in the Media module).
- If the video has no default audio track, no master/mezzanine files, and no audio track is specified, captions cannot be generated.
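
The order of precedence above can be sketched as a small function. This is only an illustration of the decision order, not Video Cloud's actual implementation; the function name, its parameters, and the returned labels are hypothetical.

```python
from typing import Optional


def pick_caption_source(
    has_default_audio_track: bool,
    has_master: bool,
    requested_audio_track: Optional[str] = None,
) -> Optional[str]:
    """Return a label for the caption source, or None if captions cannot be generated."""
    # 1. A default audio track always wins.
    if has_default_audio_track:
        return "default audio track"
    # 2. Otherwise fall back to the master/mezzanine file.
    if has_master:
        return "master/mezzanine file"
    # 3. Otherwise use the audio track named in the Dynamic Ingest call, if any.
    if requested_audio_track is not None:
        return f"audio track from ingest request ({requested_audio_track})"
    # 4. Nothing usable: captions cannot be generated.
    return None
```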

Notes:

Auto captions are available to all Video Cloud accounts. Auto captions for a number of minutes of audio are provided free of charge; there is a charge for additional minutes if you need them. Contact your Customer Success Manager for details and to have auto-captioning enabled for your account.
Auto captions can also be requested in the Video Cloud upload module - see Automatically Generating Captions for Videos.
Auto captions cannot be generated for existing videos unless the videos have a digital master.
The minimum length of a video to generate auto captions is:
- 5 seconds, if the language is specified
- 15 seconds, if the language is auto-detected
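
A pre-flight check for these minimum lengths might look like the sketch below. The function and constant names are hypothetical, and treating the minimums as inclusive (a video of exactly 5 or 15 seconds qualifies) is an assumption.

```python
MIN_SECONDS_LANGUAGE_SPECIFIED = 5
MIN_SECONDS_LANGUAGE_AUTODETECTED = 15


def long_enough_for_auto_captions(duration_seconds: float, autodetect: bool) -> bool:
    """Check a video's duration against the minimum for the chosen language mode."""
    if autodetect:
        minimum = MIN_SECONDS_LANGUAGE_AUTODETECTED
    else:
        minimum = MIN_SECONDS_LANGUAGE_SPECIFIED
    # Assumption: the documented minimum is inclusive.
    return duration_seconds >= minimum
```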

Setup

The setup for Dynamic Ingest requests is the same, whether you are ingesting a video, images, audio tracks, WebVTT files, requesting auto captions, or all of these:

Request URL

https://ingest.api.brightcove.com/v1/accounts/{account_id}/videos/{video_id}/ingest-requests

Authentication

Authentication requires an access token passed as a Bearer token in an Authorization header:

          Authorization: Bearer {access_token}

To get access tokens, you will need client credentials (see below). For the process of obtaining an access token, see Get Access Tokens.
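
As a rough illustration, the request URL and the Authorization header could be assembled with Python's standard library as follows. The helper name `build_ingest_request` is hypothetical, and the use of POST with a JSON `Content-Type` header is an assumption about how the request body is submitted. The sketch only builds the request; it does not send it.

```python
import json
import urllib.request

INGEST_URL = (
    "https://ingest.api.brightcove.com/v1/accounts/{account_id}"
    "/videos/{video_id}/ingest-requests"
)


def build_ingest_request(account_id, video_id, access_token, body):
    """Build (but do not send) a Dynamic Ingest request carrying a Bearer token."""
    url = INGEST_URL.format(account_id=account_id, video_id=video_id)
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            # Assumption: the request body is sent as JSON.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; any of the request bodies shown under Use cases can be supplied as `body`.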

Note on S3

If your source files will be pulled from a protected S3 bucket, you will need to set a bucket policy to allow Video Cloud to access the files. See Using Dynamic Ingest with S3 for details.

Getting Credentials

To get a client_id and client_secret, you will need to go to the OAuth UI and register this app:

Managing API Authentication Credentials

You can also get your credentials via cURL, Postman, or our online app.

These are the permissions you will need. If you are getting credentials directly from the API, specify them as follows:

[
  "video-cloud/video/all",
  "video-cloud/ingest-profiles/profile/read",
  "video-cloud/ingest-profiles/account/read",
  "video-cloud/upload-urls/read"
]
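
Once you have a client_id and client_secret, exchanging them for an access token could look roughly like the sketch below. The token URL and the client-credentials flow with HTTP Basic authentication are assumptions not stated on this page; see Get Access Tokens for the authoritative process. The helper name `build_token_request` is hypothetical, and the sketch only builds the request without sending it.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: the OAuth API token endpoint; confirm against Get Access Tokens.
TOKEN_URL = "https://oauth.brightcove.com/v4/access_token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request."""
    # HTTP Basic auth: base64 of "client_id:client_secret".
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    form = urllib.parse.urlencode({"grant_type": "client_credentials"})
    return urllib.request.Request(
        TOKEN_URL,
        data=form.encode("ascii"),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```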

Use cases

Here are some typical use cases.

Create auto-captions for new ingestions or retranscoding

Request body

{
  "master": {
    "use_archived_master": true
  },
  "profile": "multi-platform-standard-static-with-mp4",
  "transcriptions": [
    {
      "srclang": "en-US",
      "kind": "captions",
      "label": "English",
      "status": "published",
      "default": true
    }
  ],
  "priority": "normal"
}

Create auto-captions when ingesting an audio track

Request body

{
  "audio_tracks": {
    "merge_with_existing": true,
    "masters": [
      {
        "language": "fr-FR",
        "variant": "alternate",
        "url": "https://support.brightcove.com/test-assets//audio/celtic_lullaby.m4a"
      }
    ]
  },
  "transcriptions": [
    {
      "srclang": "fr-FR",
      "kind": "captions",
      "label": "french-FR",
      "status": "published",
      "default": false,
      "input_audio_track": {
        "language": "fr-FR",
        "variant": "alternate"
      }
    }
  ]
}

Create auto-captions for an existing video using the digital master

Request body

{
  "transcriptions": [
    {
      "srclang": "fr-FR",
      "kind": "captions",
      "label": "french-FR",
      "default": false
    }
  ]
}

Create auto-captions for an existing video defining the audio tracks

Note: to get information about the video's audio tracks, you can either:

- Look at the video properties in Studio
- Use the CMS API to retrieve the audio tracks using this endpoint:

https://cms.api.brightcove.com/v1/accounts/{account_id}/videos/{video_id}/audio_tracks

Request body

{
  "transcriptions": [
    {
      "srclang": "en-US",
      "kind": "captions",
      "label": "english-EN",
      "default": false,
      "input_audio_track": {
        "language": "en-US",
        "variant": "main"
      }
    },
    {
      "srclang": "fr-FR",
      "kind": "captions",
      "label": "french-FR",
      "default": false,
      "input_audio_track": {
        "language": "fr-FR",
        "variant": "alternate"
      }
    }
  ]
}
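
A request body like the one above could be generated from a list of audio tracks, for example the tracks returned by the CMS API. In this sketch the function name is hypothetical, and it assumes each track is a dictionary with at least `language` and `variant` keys; check the actual CMS API response before relying on that shape.

```python
def transcriptions_for_tracks(audio_tracks, kind="captions"):
    """Build a request body with one `transcriptions` entry per audio track."""
    entries = []
    for track in audio_tracks:
        language = track["language"]
        variant = track["variant"]
        entries.append(
            {
                "srclang": language,
                "kind": kind,
                # The label is free text; the language code is used here for simplicity.
                "label": language,
                "default": False,
                "input_audio_track": {"language": language, "variant": variant},
            }
        )
    return {"transcriptions": entries}
```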

Request body fields for auto captions

The table below shows the request body fields for auto captions.

Fields for Auto Captions and Transcripts
Field	Type	Required	Description
`autodetect`	boolean	no	`true` to auto-detect language from audio source. `false` to use `srclang` specifying the audio language. Notes: If `autodetect` is `true`, `srclang` must not be included. If `autodetect` is `false`, and `srclang` is not included, the request will fail.
`default`	boolean	no	If `true`, `srclang` will be ignored, and the main audio track will be used - language will be auto-detected.
`input_audio_track`	object	no	For videos with multiple audio tracks, defines the audio track to generate the captions from. It consists of `language` and `variant` (both required).
`kind`	string	no	The kind of output to generate. Allowed values: `captions`, `transcripts`. Notes: If the `kind` is `transcripts` and the `url` for the transcripts file is included, a transcript file will be ingested, and no auto captions will be generated. See Ingesting Transcriptions for more details. If the `kind` is `transcripts` and the `url` for the transcripts file is not included, a transcript file and captions will be generated. If the `kind` is `captions`, captions will be generated, but not a transcript file.
`label`	string	no	Human readable label. Defaults to the BCP-47 style language code.
`srclang`	string	no	BCP-47 style language code for the text tracks (en-US, fr-FR, es-ES, etc.); see supported languages
`status`	string	no	The status of the captions: `published` or `draft`.
`url`	string	no	The URL where a transcript file is located. Must be included if the `kind` is `transcripts`. Must not be included if the `kind` is `captions`.
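
Some of the rules in the table can be checked on the client before a request is sent. The sketch below is a hypothetical helper, not part of any Brightcove SDK. It covers only the rules the table states unambiguously and deliberately does not enforce whether `url` is required for `transcripts`, since the `kind` and `url` rows describe that case differently.

```python
def transcription_problems(entry: dict) -> list:
    """Return a list of rule violations for one `transcriptions` entry."""
    problems = []
    # autodetect and srclang are mutually dependent.
    if "autodetect" in entry:
        if entry["autodetect"] and "srclang" in entry:
            problems.append("srclang must not be included when autodetect is true")
        if not entry["autodetect"] and "srclang" not in entry:
            problems.append("srclang is required when autodetect is false")
    kind = entry.get("kind")
    if kind is not None and kind not in ("captions", "transcripts"):
        problems.append("kind must be captions or transcripts")
    if kind == "captions" and "url" in entry:
        problems.append("url must not be included when kind is captions")
    status = entry.get("status")
    if status is not None and status not in ("published", "draft"):
        problems.append("status must be published or draft")
    # input_audio_track needs both of its fields.
    track = entry.get("input_audio_track")
    if track is not None:
        for key in ("language", "variant"):
            if key not in track:
                problems.append(f"input_audio_track.{key} is required")
    return problems
```

An empty list means none of the checked rules were violated; the API may still reject the request for other reasons.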

`input_audio_track` fields

input_audio_track Fields
Field	Type	Required	Description
`language`	string	yes	BCP-47 style language code for the audio track (en-US, fr-FR, es-ES, etc.); see supported languages
`variant`	string	yes	Specifies the variant to use: `main`, `alternate`, `dub`, `commentary`, or `descriptive`

Supported languages

Currently, auto captions are limited to the following languages:

Australian English (en-AU)
Afrikaans (af-ZA)
Brazilian Portuguese (pt-BR)
British English (en-GB)
Canadian French (fr-CA)
Danish (da-DK)
Dutch (nl-NL)
Farsi Persian (fa-IR)
French (fr-FR)
German (de-DE)
Gulf Arabic (ar-AE)
Hebrew (he-IL)
Indian English (en-IN)
Indian Hindi (hi-IN)
Indonesian (id-ID)
Irish English (en-IE)
Italian (it-IT)
Japanese (ja-JP)
Korean (ko-KR)
Malay (ms-MY)
Mandarin Chinese, Mainland (zh-CN)
Mandarin Chinese, Taiwan (zh-TW)
Modern Standard Arabic (ar-SA)
New Zealand English (en-NZ)
Portuguese (pt-PT)
Russian (ru-RU)
Scottish English (en-AB)
South African English (en-ZA)
Spanish (es-ES)
Swiss German (de-CH)
Tamil (ta-IN)
Telugu (te-IN)
Thai (th-TH)
Turkish (tr-TR)
US English (en-US)
US Spanish (es-US)
Welsh English (en-WL)
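
Because language codes are easy to mistype (for example, `EN-us` instead of `en-US`), a client might normalize a code against the list above before sending it. The sketch below copies the codes exactly as listed on this page; the function name is hypothetical, and whether the API itself accepts differently cased codes is not stated here.

```python
from typing import Optional

# Codes copied from the list above, including en-AB and en-WL as given.
SUPPORTED_LANGUAGES = (
    "en-AU", "af-ZA", "pt-BR", "en-GB", "fr-CA", "da-DK", "nl-NL", "fa-IR",
    "fr-FR", "de-DE", "ar-AE", "he-IL", "en-IN", "hi-IN", "id-ID", "en-IE",
    "it-IT", "ja-JP", "ko-KR", "ms-MY", "zh-CN", "zh-TW", "ar-SA", "en-NZ",
    "pt-PT", "ru-RU", "en-AB", "en-ZA", "es-ES", "de-CH", "ta-IN", "te-IN",
    "th-TH", "tr-TR", "en-US", "es-US", "en-WL",
)

_BY_LOWERCASE = {code.lower(): code for code in SUPPORTED_LANGUAGES}


def normalize_language_code(code: str) -> Optional[str]:
    """Return the listed spelling of a supported code, or None if it is not listed."""
    return _BY_LOWERCASE.get(code.strip().lower())
```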
