Last Updated on November 23, 2020
WebVTT (Web Video Text Tracks format) lets you associate a text track with an HTML5 video. You can use a WebVTT file to display captions or subtitles as a text overlay on the video using the HTML track element. The text track is an external file that displays text overlays at the time points you specify. This article includes a simple example of a video using a text track to display closed captions.
Adding a text track helps address accessibility issues involved with delivering video online. Text tracks add important details for people who are hard of hearing, and can describe sound elements that are otherwise inaudible. Text tracks can also improve how learners access and understand video. Many text track file formats are available for video. Choose the text track file format that best suits the video delivery method in use. Video in HTML5 supports WebVTT, which uses the .vtt file format.
There are two types of captions: open and closed. Closed captions are text overlays that users enable and disable on the video. Open captions are text overlays added to the video itself that users cannot turn on or off. We use closed captions in this article by adding an external text track file alongside a video file. You can upload .vtt files for each language your video supports and assign these subtitles in your code.
This article shows you how to add a text track for a video element, add cue settings, and adjust and style it using CSS. You will produce the following styled text track for a video.
Understanding the WebVTT file format structure
The file format structure of a WebVTT text track is simple. The primary components are as follows:
WEBVTT
Add this at the top of your file. You can opt to add a space after WEBVTT and add a name or other descriptive element on the same line.
Specify time intervals using cues. Each cue has a timestamp written using this format:
00:00:13.540 --> 00:00:21.280
Each timestamp is written as hours:minutes:seconds.milliseconds, and the start and end timestamps are separated by -->.
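To make the timestamp format concrete, here is a minimal JavaScript sketch that converts a timestamp into milliseconds. The function names timestampToMs and parseTimingLine are hypothetical helpers written for this article, not part of any browser API. The sketch assumes the hours part may be omitted, which the WebVTT format allows.

```javascript
// Convert a WebVTT timestamp such as "00:00:13.540" (or "00:13.540",
// because the hours part is optional) into whole milliseconds.
function timestampToMs(timestamp) {
  const match = /^(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})$/.exec(timestamp.trim());
  if (match === null) {
    throw new Error(`Not a WebVTT timestamp: ${timestamp}`);
  }
  const [, hours = "0", minutes, seconds, millis] = match;
  const totalSeconds = (Number(hours) * 60 + Number(minutes)) * 60 + Number(seconds);
  return totalSeconds * 1000 + Number(millis);
}

// Split a cue timing line into its start and end times.
// Anything after the end timestamp (cue settings) is ignored here.
function parseTimingLine(line) {
  const parts = line.split("-->");
  if (parts.length !== 2) {
    throw new Error(`Not a cue timing line: ${line}`);
  }
  const endToken = parts[1].trim().split(/\s+/)[0];
  return { startMs: timestampToMs(parts[0]), endMs: timestampToMs(endToken) };
}
```

For the cue above, parseTimingLine("00:00:13.540 --> 00:00:21.280") gives a start of 13540 ms and an end of 21280 ms, so the cue stays on screen for 7.74 seconds.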
Add the cue text on the line after the timestamps. This is the text that overlays the video during the specified time.
00:00:13.540 --> 00:00:21.280
This is some text that displays over the video.
You can place an optional sequence number or name above a cue.
1
00:00:13.540 --> 00:00:21.280
This is some text that displays over the video.
You need to add a blank line after each caption because the blank line signifies the end of a cue in a .vtt file.
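The role of the blank line is easiest to see from the reading side. The following JavaScript sketch splits the text of a .vtt file into cues wherever a blank line appears. splitCues is a hypothetical helper, and it is a simplification under stated assumptions: it skips the header and any NOTE or STYLE blocks, and does not validate timestamps.

```javascript
// Split the text of a .vtt file into cue objects, using blank lines as the
// cue separator. Header, NOTE, and STYLE blocks are skipped.
function splitCues(vttText) {
  const blocks = vttText.trim().split(/\r?\n(?:[ \t]*\r?\n)+/);
  const cues = [];
  for (const block of blocks) {
    const lines = block.split(/\r?\n/);
    if (/^(WEBVTT|NOTE|STYLE)\b/.test(lines[0])) continue;
    // An optional sequence number or name sits above the timing line.
    const id = lines[0].includes("-->") ? null : lines.shift();
    const timing = lines.shift();
    if (timing === undefined || !timing.includes("-->")) continue;
    cues.push({ id, timing, text: lines.join("\n") });
  }
  return cues;
}
```

If two cues are not separated by a blank line, this function treats them as a single cue, which mirrors why the blank line matters in the file itself.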
Putting it all together produces a simple WebVTT file. Here is an example with three cues:
WEBVTT

1
00:00:00.000 --> 00:00:01.500
This is the very first text caption.

2
00:00:01.500 --> 00:00:05.222
It is a simple text track that goes along with a video,

3
00:00:05.222 --> 00:00:18.340
which makes it easier to understand without audio.
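A file with this structure can also be generated from data rather than typed by hand. The sketch below is one possible approach in JavaScript. The helper names msToTimestamp and buildVtt, and the cue object shape with startMs, endMs, and text, are assumptions made for illustration.

```javascript
// Format a millisecond count as an hh:mm:ss.ttt timestamp.
function msToTimestamp(totalMs) {
  const pad = (value, width) => String(value).padStart(width, "0");
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const seconds = totalSeconds % 60;
  const minutes = Math.floor(totalSeconds / 60) % 60;
  const hours = Math.floor(totalSeconds / 3600);
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(ms, 3)}`;
}

// Build the text of a .vtt file from an array of { startMs, endMs, text } cues.
function buildVtt(cues) {
  const blocks = cues.map((cue, index) => {
    const timing = `${msToTimestamp(cue.startMs)} --> ${msToTimestamp(cue.endMs)}`;
    // Sequence number, timing line, then the cue text.
    return [String(index + 1), timing, cue.text].join("\n");
  });
  // The WEBVTT header comes first; a blank line follows it and every cue.
  return ["WEBVTT", ...blocks].join("\n\n") + "\n";
}

const exampleVtt = buildVtt([
  { startMs: 0, endMs: 1500, text: "This is the very first text caption." },
  { startMs: 1500, endMs: 5222, text: "It is a simple text track that goes along with a video," },
  { startMs: 5222, endMs: 18340, text: "which makes it easier to understand without audio." },
]);
console.log(exampleVtt);
```

Storing times as whole milliseconds avoids rounding surprises when formatting the three-digit fraction.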
You can also add notes and styles, and control the text track using other cue settings.
The W3C WebVTT specification is still in progress, so styling does not work consistently in every browser. Research and test browser compatibility, for example by checking caniuse.com for current browser support.
The following examples describe styles that work in major browsers.
Creating or downloading a text track
To follow along with our examples, create or open a new .vtt file using a code editor.
You can:
- Use the sample described in Understanding the WebVTT file format structure
- Download a sample file
- Refer to the WebVTT spec to create your own file
- Use a tool that creates a text track for you
For example, upload a video to YouTube and use the platform’s auto-captioning feature. Then, download the text track using the YouTube admin and adjust the formatting. For steps to download captions, see this article.
You can choose from many existing sample files to follow along with the steps too. Find a public domain video and captions here, or search for an alternative.
Styling your video captions
This article covers two ways to style your WebVTT text track files. You will learn how to style:
- words within a text track
- the entire text track for a video
To get started, embed the video and the caption file in a web page using the following HTML markup. Adjust the code for your video size, file location, and text track type (subtitles or captions).
<video id="the-video" preload="metadata" controls="controls" width="1280" height="530">
  <source src="the-video.mp4" type="video/mp4" />
  <track label="English" kind="captions" srclang="en" src="captions.vtt" default="" />
</video>
Aligning and positioning cues on a video
You can adjust position and alignment of a cue by adding cue settings within the .vtt file. Add the following settings after the timestamp to adjust the text overlay.
position:55%,line-right align:center size:25%
Placed after the timestamp in the .vtt file, these settings affect the cue text as shown in Figure 1.
4
00:00:13.540 --> 00:00:21.280 position:55%,line-right align:center size:25%
This animation shows the tides as a complex system of rotating and trapped waves with a mixture of frequencies.
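Cue settings are space-separated name:value pairs, and a value such as 55%,line-right carries a second part after the comma. The JavaScript sketch below reads a settings string into an object so that each part is easy to inspect. parseCueSettings is a hypothetical helper, and the returned object shape is an assumption chosen for illustration. It does not check that setting names or values are valid.

```javascript
// Parse the cue settings that follow the end timestamp on a timing line,
// for example "position:55%,line-right align:center size:25%".
function parseCueSettings(settingsText) {
  const settings = {};
  for (const token of settingsText.trim().split(/\s+/)) {
    const colon = token.indexOf(":");
    if (colon <= 0) continue; // skip anything that is not name:value
    const name = token.slice(0, colon);
    // A value may carry an alignment after a comma, as in "55%,line-right".
    const [value, alignment = null] = token.slice(colon + 1).split(",");
    settings[name] = { value, alignment };
  }
  return settings;
}

const exampleSettings = parseCueSettings("position:55%,line-right align:center size:25%");
console.log(exampleSettings.position.value, exampleSettings.position.alignment);
```

For the settings used in this article, the result has three entries: position (55%, with line-right alignment), align (center), and size (25%).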
Adding inline styles to captions in a WebVTT file
Use the inline bold, italic, and underline tags within a text track file to change the appearance of individual words. These styles can help convey inflection or provide emphasis.
You can find an example of this within the sample .vtt file, as seen in the following code snippet:
WEBVTT

1
00:00:00.000 --> 00:00:03.520
Ocean tides are <i>not</i> simple.

2
00:00:03.720 --> 00:00:06.120
If our <b>planet</b> had no continents,

3
00:00:06.260 --> 00:00:13.340
tides would be hemispheric-sized <u>bulges</u> of water moving westward with the moon and sun.
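If you reuse the same cue text somewhere that does not render these tags, such as a plain-text transcript (an assumed use case, not part of this article's example), the tags need to be removed or inspected. The JavaScript sketch below shows one simple way to do both. stripCueTags and findEmphasis are hypothetical helpers, and they rely on regular expressions rather than a full WebVTT cue text parser.

```javascript
// Remove every inline tag from cue text, leaving only the words.
function stripCueTags(cueText) {
  return cueText.replace(/<\/?[^>]+>/g, "");
}

// List the text wrapped in <b>, <i>, or <u> tags, along with the tag name.
function findEmphasis(cueText) {
  const found = [];
  const pattern = /<(b|i|u)>(.*?)<\/\1>/g;
  for (const match of cueText.matchAll(pattern)) {
    found.push({ tag: match[1], text: match[2] });
  }
  return found;
}

console.log(stripCueTags("Ocean tides are <i>not</i> simple."));
console.log(findEmphasis("If our <b>planet</b> had no continents,"));
```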
Tags are useful to style single words (see Figure 2). However, you can also write CSS to target and style the entire text track.
Adding CSS to change style attributes for a target video
Add the following style to your website CSS. The code changes the caption background to a gradient that runs from transparent to solid red, and sets the text in an “attractive” yellow 22px Courier font.
#the-video::cue {
  background: linear-gradient(to right, rgba(255,0,0,0), rgba(255,0,0,1));
  color: #ffff00;
  font-size: 22px;
  font-family: "courier";
}
You can see what the default track-wide styling looks like in Figures 1 and 2. After we apply the CSS code to the site, the entire track styling changes (see Figure 3).
The CSS restyles the entire text track for the video while retaining the bold, italic, and underline formatting from the .vtt file. See the finished video with all styles at the beginning of this article.
Video:
Public domain video and captions by NASA’s Scientific Visualization Studio. No affiliation with the creators of this video animation.
About the video:
This animation shows the barotropic global ocean tides as a complex system of rotating and trapped waves with a mixture of frequencies.