Table of contents
      1. 4.8.2 The iframe element
      2. 4.8.3 The embed element
      3. 4.8.4 The object element
      4. 4.8.5 The param element
      5. 4.8.6 The video element
      6. 4.8.7 The audio element
      7. 4.8.8 The source element
      8. 4.8.9 The track element
      9. 4.8.10 Media elements
        1. 4.8.10.1 Error codes
        2. 4.8.10.2 Location of the media resource
        3. 4.8.10.3 MIME types
        4. 4.8.10.4 Network states
        5. 4.8.10.5 Loading the media resource
        6. 4.8.10.6 Offsets into the media resource
        7. 4.8.10.7 The ready states
        8. 4.8.10.8 Playing the media resource
        9. 4.8.10.9 Seeking
        10. 4.8.10.10 Media resources with multiple media tracks
          1. 4.8.10.10.1 TrackList objects
          2. 4.8.10.10.2 Selecting specific audio and video tracks declaratively
        11. 4.8.10.11 Synchronising multiple media elements
          1. 4.8.10.11.1 Introduction
          2. 4.8.10.11.2 Media controllers
          3. 4.8.10.11.3 Assigning a media controller declaratively
        12. 4.8.10.12 Timed text tracks
          1. 4.8.10.12.1 Text track model
          2. 4.8.10.12.2 Sourcing in-band text tracks
          3. 4.8.10.12.3 Sourcing out-of-band text tracks
          4. 4.8.10.12.4 Text track API
          5. 4.8.10.12.5 Event definitions
        13. 4.8.10.13 User interface
        14. 4.8.10.14 Time ranges
        15. 4.8.10.15 Event summary
        16. 4.8.10.16 Security and privacy considerations
        17. 4.8.10.17 Best practices for authors using media elements
        18. 4.8.10.18 Best practices for implementors of media elements

4.8.2 The iframe element

Categories
Flow content.
Phrasing content.
Embedded content.
Interactive content.
Contexts in which this element can be used:
Where embedded content is expected.
Content model:
Text that conforms to the requirements given in the prose.
Content attributes:
Global attributes
src
srcdoc
name
sandbox
seamless
width
height
DOM interface:
interface HTMLIFrameElement : HTMLElement {
           attribute DOMString src;
           attribute DOMString srcdoc;
           attribute DOMString name;
  [PutForwards=value] readonly attribute DOMSettableTokenList sandbox;
           attribute boolean seamless;
           attribute DOMString width;
           attribute DOMString height;
  readonly attribute Document contentDocument;
  readonly attribute WindowProxy contentWindow;
};

The iframe element represents a nested browsing context.

The src attribute gives the address of a page that the nested browsing context is to contain. The attribute, if present, must be a valid non-empty URL potentially surrounded by spaces.

The srcdoc attribute gives the content of the page that the nested browsing context is to contain. The value of the attribute is the source of an iframe srcdoc document.

For iframe elements in HTML documents, the attribute, if present, must have a value using the HTML syntax that consists of the following syntactic components, in the given order:

  1. Any number of comments and space characters.
  2. Optionally, a DOCTYPE.
  3. Any number of comments and space characters.
  4. The root element, in the form of an html element.
  5. Any number of comments and space characters.

For iframe elements in XML documents, the attribute, if present, must have a value that matches the production labeled document in the XML specification. [XML]

If the src attribute and the srcdoc attribute are both specified together, the srcdoc attribute takes priority. This allows authors to provide a fallback URL for legacy user agents that do not support the srcdoc attribute.

When an iframe element is first inserted into a document, the user agent must create a nested browsing context, and then process the iframe attributes for the first time.

Whenever an iframe element with a nested browsing context has its srcdoc attribute set, changed, or removed, the user agent must process the iframe attributes.

Similarly, whenever an iframe element with a nested browsing context but with no srcdoc attribute specified has its src attribute set, changed, or removed, the user agent must process the iframe attributes.

When the user agent is to process the iframe attributes, it must run the first appropriate steps from the following list:

If the srcdoc attribute is specified

Navigate the element's browsing context to a resource whose Content-Type is text/html, whose URL is about:srcdoc, and whose data consists of the value of the attribute. The resulting Document must be considered an iframe srcdoc document.

If the src attribute is specified but the srcdoc attribute is not
  1. If the value of the src attribute is the empty string, jump to the empty step below.

  2. Resolve the value of the src attribute, relative to the iframe element.

  3. If that is not successful, then jump to the empty step below.

  4. If the resulting absolute URL is an ASCII case-insensitive match for the string "about:blank", and the user agent is processing this iframe's attributes for the first time, then jump to the empty step below. (In cases other than the first time, about:blank is loaded normally.)

  5. Navigate the element's browsing context to the resulting absolute URL.

Empty: When the steps above require the user agent to jump to the empty step, if the user agent is processing this iframe's attributes for the first time, then the user agent must queue a task to fire a simple event named load at the iframe element. (After jumping to this step, the above steps are not resumed.)

Otherwise

Queue a task to fire a simple event named load at the iframe element.

Any navigation required of the user agent in the process the iframe attributes algorithm must be completed with the iframe element's document's browsing context as the source browsing context.

Furthermore, if the browsing context's session history contained only one Document when the process the iframe attributes algorithm was invoked, and that was the about:blank Document created when the browsing context was created, then any navigation required of the user agent in that algorithm must be completed with replacement enabled.

If, when the element is created, the srcdoc attribute is not set, and the src attribute is either also not set or set but its value cannot be resolved, the browsing context will remain at the initial about:blank page.

If the user navigates away from this page, the iframe's corresponding WindowProxy object will proxy new Window objects for new Document objects, but the src attribute will not change.

Removing an iframe from a Document does not cause its browsing context to be discarded. Indeed, an iframe's browsing context can survive its original parent Document if its iframe is moved to another Document.

On the other hand, if an iframe is removed from a Document and is then subsequently garbage collected, this will likely mean (in the absence of other references) that the child browsing context's WindowProxy object will become eligble for garbage collection, which will then lead to that browsing context being discarded, which will then lead to its Document being discarded also. This happens without notice to any scripts running in that Document; for example, no unload events are fired (the "unload a document" steps are not run).

Here a blog uses the srcdoc attribute in conjunction with the sandbox and seamless attributes described below to provide users of user agents that support this feature with an extra layer of protection from script injection in the blog post comments:

<article>
 <h1>I got my own magazine!</h1>
 <p>After much effort, I've finally found a publisher, and so now I
 have my own magazine! Isn't that awesome?! The first issue will come
 out in September, and we have articles about getting food, and about
 getting in boxes, it's going to be great!</p>
 <footer>
  <p>Written by <a href="/users/cap">cap</a>.
  <time pubdate>2009-08-21T23:32Z</time></p>
 </footer>
 <article>
  <footer> At <time pubdate>2009-08-21T23:35Z</time>, <a href="/users/ch">ch</a> writes: </footer>
  <iframe seamless sandbox srcdoc="<p>did you get a cover picture yet?"></iframe>
 </article>
 <article>
  <footer> At <time pubdate>2009-08-21T23:44Z</time>, <a href="/users/cap">cap</a> writes: </footer>
  <iframe seamless sandbox srcdoc="<p>Yeah, you can see it <a href=&quot;/gallery?mode=cover&amp;amp;page=1&quot;>in my gallery</a>."></iframe>
 </article>
 <article>
  <footer> At <time pubdate>2009-08-21T23:58Z</time>, <a href="/users/ch">ch</a> writes: </footer>
  <iframe seamless sandbox srcdoc="<p>hey that's earl's table.
<p>you should get earl&amp;amp;me on the next cover."></iframe>
 </article>

Notice the way that quotes have to be escaped (otherwise the sandbox attribute would end prematurely), and the way raw ampersands (e.g. in URLs or in prose) mentioned in the sandboxed content have to be doubly escaped — once so that the ampersand is preserved when originally parsing the sandbox attribute, and once more to prevent the ampersand from being misinterpreted when parsing the sandboxed content.

In the HTML syntax, authors need only remember to use U+0022 QUOTATION MARK characters (") to wrap the attribute contents and then to escape all U+0022 QUOTATION MARK (") and U+0026 AMPERSAND (&) characters, and to specify the sandbox attribute, to ensure safe embedding of content.

Due to restrictions of the XML syntax, in XML the U+003C LESS-THAN SIGN character (<) needs to be escaped as well. In order to prevent attribute-value normalization, some of XML's whitespace characters — specifically U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), and U+000D CARRIAGE RETURN (CR) — also need to be escaped. [XML]


The name attribute, if present, must be a valid browsing context name. The given value is used to name the nested browsing context. When the browsing context is created, if the attribute is present, the browsing context name must be set to the value of this attribute; otherwise, the browsing context name must be set to the empty string.

Whenever the name attribute is set, the nested browsing context's name must be changed to the new value. If the attribute is removed, the browsing context name must be set to the empty string.

When content loads in an iframe, after any load events are fired within the content itself, the user agent must queue a task to fire a simple event named load at the iframe element. When content whose URL has the same origin as the iframe element's Document fails to load (e.g. due to a DNS error, network error, or if the server returned a 4xx or 5xx status code or equivalent), then the user agent must queue a task to fire a simple event named error at the element instead. (This event does not fire for parse errors, script errors, or any errors for cross-origin resources.)

The task source for these tasks is the DOM manipulation task source.

A load event is also fired at the iframe element when it is created if no other data is loaded in it.

When there is an active parser in the iframe, and when anything in the iframe is delaying the load event of the iframe's browsing context's active document, the iframe must delay the load event of its document.

If, during the handling of the load event, the browsing context in the iframe is again navigated, that will further delay the load event.


The sandbox attribute, when specified, enables a set of extra restrictions on any content hosted by the iframe. Its value must be an unordered set of unique space-separated tokens that are ASCII case-insensitive. The allowed values are allow-same-origin, allow-top-navigation, allow-forms, and allow-scripts. When the attribute is set, the content is treated as being from a unique origin, forms and scripts are disabled, links are prevented from targeting other browsing contexts, and plugins are disabled. The allow-same-origin keyword allows the content to be treated as being from the same origin instead of forcing it into a unique origin, the allow-top-navigation keyword allows the content to navigate its top-level browsing context, and the allow-forms and allow-scripts keywords re-enable forms and scripts respectively (though scripts are still prevented from creating popups).

Setting both the allow-scripts and allow-same-origin keywords together when the embedded page has the same origin as the page containing the iframe allows the embedded page to simply remove the sandbox attribute.

Sandboxing hostile content is of minimal help if an attacker can convince the user to just visit the hostile content directly, rather than in the iframe. To limit the damage that can be caused by hostile HTML content, it should be served using the text/html-sandboxed MIME type.

While the sandbox attribute is specified, the iframe element's nested browsing context must have the flags given in the following list set. In addition, any browsing contexts nested within an iframe, either directly or indirectly, must have all the flags set on them as were set on the iframe's Document's browsing context when the iframe's Document was created.

The sandboxed navigation browsing context flag

This flag prevents content from navigating browsing contexts other than the sandboxed browsing context itself (or browsing contexts further nested inside it), and the top-level browsing context (which is protected by the sandboxed top-level navigation browsing context flag defined next).

This flag also prevents content from creating new auxiliary browsing contexts, e.g. using the target attribute or the window.open() method.

The sandboxed top-level navigation browsing context flag, unless the sandbox attribute's value, when split on spaces, is found to have the allow-top-navigation keyword set

This flag prevents content from navigating their top-level browsing context.

When the allow-top-navigation is set, content can navigate its top-level browsing context, but other browsing contexts are still protected by the sandboxed navigation browsing context flag defined above.

The sandboxed plugins browsing context flag

This flag prevents content from instantiating plugins, whether using the embed element, the object element, the applet element, or through navigation of a nested browsing context.

The sandboxed seamless iframes flag

This flag prevents content from using the seamless attribute on descendant iframe elements.

This prevents a page inserted using the allow-same-origin keyword from using a CSS-selector-based method of probing the DOM of other pages on the same site (in particular, pages that contain user-sensitive information).

The sandboxed origin browsing context flag, unless the sandbox attribute's value, when split on spaces, is found to have the allow-same-origin keyword set

This flag forces content into a unique origin, thus preventing it from accessing other content from the same origin.

This flag also prevents script from reading from or writing to the document.cookie IDL attribute, and blocks access to localStorage. [WEBSTORAGE]

The allow-same-origin attribute is intended for two cases.

First, it can be used to allow content from the same site to be sandboxed to disable scripting, while still allowing access to the DOM of the sandboxed content.

Second, it can be used to embed content from a third-party site, sandboxed to prevent that site from opening popup windows, etc, without preventing the embedded page from communicating back to its originating site, using the database APIs to store data, etc.

The sandboxed forms browsing context flag, unless the sandbox attribute's value, when split on spaces, is found to have the allow-forms keyword set

This flag blocks form submission.

The sandboxed scripts browsing context flag, unless the sandbox attribute's value, when split on spaces, is found to have the allow-scripts keyword set

This flag blocks script execution.

The sandboxed automatic features browsing context flag, unless the sandbox attribute's value, when split on spaces, is found to have the allow-scripts keyword (defined above) set

This flag blocks features that trigger automatically, such as automatically playing a video or automatically focusing a form control. It is relaxed by the same flag as scripts, because when scripts are enabled these features are trivially possible anyway, and it would be unfortunate to force authors to use script to do them when sandboxed rather than allowing them to use the declarative features.

These flags must not be set unless the conditions listed above define them as being set.

These flags only take effect when the nested browsing context of the iframe is navigated. Removing them, or removing the entire sandbox attribute, has no effect on an already-loaded page.

In this example, some completely-unknown, potentially hostile, user-provided HTML content is embedded in a page. Because it is sandboxed, it is treated by the user agent as being from a unique origin, despite the content being served from the same site. Thus it is affected by all the normal cross-site restrictions. In addition, the embedded page has scripting disabled, plugins disabled, forms disabled, and it cannot navigate any frames or windows other than itself (or any frames or windows it itself embeds).

<p>We're not scared of you! Here is your content, unedited:</p>
<iframe sandbox src="getusercontent.cgi?id=12193"></iframe>

Note that cookies are still sent to the server in the getusercontent.cgi request, though they are not visible in the document.cookie IDL attribute.

It is important that the server serve the user-provided HTML using the text/html-sandboxed MIME type so that if the attacker convinces the user to visit that page directly, the page doesn't run in the context of the site's origin, which would make the user vulnerable to any attack found in the page.

In this example, a gadget from another site is embedded. The gadget has scripting and forms enabled, and the origin sandbox restrictions are lifted, allowing the gadget to communicate with its originating server. The sandbox is still useful, however, as it disables plugins and popups, thus reducing the risk of the user being exposed to malware and other annoyances.

<iframe sandbox="allow-same-origin allow-forms allow-scripts"
        src="http://maps.example.com/embedded.html"></iframe>

Suppose a file A contained the following fragment:

<iframe sandbox="allow-same-origin allow-forms" src=B></iframe>

Suppose that file B contained an iframe also:

<iframe sandbox="allow-scripts" src=C></iframe>

Further, suppose that file C contained a link:

<a href=D>Link</a>

For this example, suppose all the files were served as text/html.

Page C in this scenario has all the sandboxing flags set. Scripts are disabled, because the iframe in A has scripts disabled, and this overrides the allow-scripts keyword set on the iframe in B. Forms are also disabled, because the inner iframe (in B) does not have the allow-forms keyword set.

Suppose now that a script in A removes all the sandbox attributes in A and B. This would change nothing immediately. If the user clicked the link in C, loading page D into the iframe in B, page D would now act as if the iframe in B had the allow-same-origin and allow-forms keywords set, because that was the state of the nested browsing context in the iframe in A when page B was loaded.

Generally speaking, dynamically removing or changing the sandbox attribute is ill-advised, because it can make it quite hard to reason about what will be allowed and what will not.

Potentially hostile files can be served from the same server as the file containing the iframe element by labeling them as text/html-sandboxed instead of text/html. This ensures that scripts in the files are unable to attack the site (as if they were actually served from another server), even if the user is tricked into visiting those pages directly, without the protection of the sandbox attribute.

If the allow-scripts keyword is set along with allow-same-origin keyword, and the file is from the same origin as the iframe's Document, then a script in the "sandboxed" iframe could just reach out, remove the sandbox attribute, and then reload itself, effectively breaking out of the sandbox altogether.


The seamless attribute is a boolean attribute. When specified, it indicates that the iframe element's browsing context is to be rendered in a manner that makes it appear to be part of the containing document (seamlessly included in the parent document). Specifically, when the attribute is set on an iframe element whose owner Document's browsing context did not have the sandboxed seamless iframes flag set when that Document was created, and while either the browsing context's active document has the same origin as the iframe element's document, or the browsing context's active document's address has the same origin as the iframe element's document, or the browsing context's active document is an iframe srcdoc document, the following requirements apply:

If the attribute is not specified, or if the origin conditions listed above are not met, then the user agent should render the nested browsing context in a manner that is clearly distinguishable as a separate browsing context, and the seamless browsing context flag must be set to false for that browsing context.

It is important that user agents recheck the above conditions whenever the active document of the nested browsing context of the iframe changes, such that the seamless browsing context flag gets unset if the nested browsing context is navigated to another origin.

The attribute can be set or removed dynamically, with the rendering updating in tandem.

In this example, the site's navigation is embedded using a client-side include using an iframe. Any links in the iframe will, in new user agents, be automatically opened in the iframe's parent browsing context; for legacy user agents, the site could also include a base element with a target attribute with the value _parent. Similarly, in new user agents the styles of the parent page will be automatically applied to the contents of the frame, but to support legacy user agents authors might wish to include the styles explicitly.

<nav><iframe seamless src="nav.include.html"></iframe></nav>

The iframe element supports dimension attributes for cases where the embedded content has specific dimensions (e.g. ad units have well-defined dimensions).

An iframe element never has fallback content, as it will always create a nested browsing context, regardless of whether the specified initial contents are successfully used.

Descendants of iframe elements represent nothing. (In legacy user agents that do not support iframe elements, the contents would be parsed as markup that could act as fallback content.)

When used in HTML documents, the allowed content model of iframe elements is text, except that invoking the HTML fragment parsing algorithm with the iframe element as the context element and the text contents as the input must result in a list of nodes that are all phrasing content, with no parse errors having occurred, with no script elements being anywhere in the list or as descendants of elements in the list, and with all the elements in the list (including their descendants) being themselves conforming.

The iframe element must be empty in XML documents.

The HTML parser treats markup inside iframe elements as text.

The IDL attributes src, srcdoc, name, sandbox, and seamless must reflect the respective content attributes of the same name.

The contentDocument IDL attribute must return the Document object of the active document of the iframe element's nested browsing context.

The contentWindow IDL attribute must return the WindowProxy object of the iframe element's nested browsing context.

Here is an example of a page using an iframe to include advertising from an advertising broker:

<iframe src="http://ads.example.com/?customerid=923513721&amp;format=banner"
        width="468" height="60"></iframe>

4.8.3 The embed element

Categories
Flow content.
Phrasing content.
Embedded content.
Interactive content.
Contexts in which this element can be used:
Where embedded content is expected.
Content model:
Empty.
Content attributes:
Global attributes
src
type
width
height
Any other attribute that has no namespace (see prose).
DOM interface:
interface HTMLEmbedElement : HTMLElement {
           attribute DOMString src;
           attribute DOMString type;
           attribute DOMString width;
           attribute DOMString height;
};

Depending on the type of content instantiated by the embed element, the node may also support other interfaces.

The embed element represents an integration point for an external (typically non-HTML) application or interactive content.

The src attribute gives the address of the resource being embedded. The attribute, if present, must contain a valid non-empty URL potentially surrounded by spaces.

The type attribute, if present, gives the MIME type by which the plugin to instantiate is selected. The value must be a valid MIME type. If both the type attribute and the src attribute are present, then the type attribute must specify the same type as the explicit Content-Type metadata of the resource given by the src attribute.

When the element is created with neither a src attribute nor a type attribute, and when attributes are removed such that neither attribute is present on the element anymore, and when the element has a media element ancestor, and when the element has an ancestor object element that is not showing its fallback content, any plugins instantiated for the element must be removed, and the embed element represents nothing.

If either:

...then the user agent must render the embed element in a manner that conveys that the plugin was disabled. The user agent may offer the user the option to override the sandbox and instantiate the plugin anyway; if the user invokes such an option, the user agent must act as if the conditions above did not apply for the purposes of this element.

Plugins are disabled in sandboxed browsing contexts because they might not honor the restrictions imposed by the sandbox (e.g. they might allow scripting even when scripting in the sandbox is disabled). User agents should convey the danger of overriding the sandbox to the user if an option to do so is provided.

An embed element is said to be potentially active when the following conditions are all met simultaneously:

Whenever an embed element that was not potentially active becomes potentially active, and whenever a potentially active embed element's src attribute is set, changed, or removed, and whenever a potentially active embed element's type attribute is set, changed, or removed, the appropriate set of steps from the following is then applied:

If the element has a src attribute set

The user agent must resolve the value of the element's src attribute, relative to the element. If that is successful, the user agent should fetch the resulting absolute URL, from the element's browsing context scope origin if it has one. The task that is queued by the networking task source once the resource has been fetched must find and instantiate an appropriate plugin based on the content's type, and hand that plugin the content of the resource, replacing any previously instantiated plugin for the element.

Fetching the resource must delay the load event of the element's document.

If the element has no src attribute set

The user agent should find and instantiate an appropriate plugin based on the value of the type attribute.

Whenever an embed element that was potentially active stops being potentially active, any plugin that had been instantiated for that element must be unloaded.

The embed element is unaffected by the CSS 'display' property. The selected plugin is instantiated even if the element is hidden with a 'display:none' CSS style.

The type of the content being embedded is defined as follows:

  1. If the element has a type attribute, and that attribute's value is a type that a plugin supports, then the value of the type attribute is the content's type.

  2. Otherwise, if the <path> component of the URL of the specified resource (after any redirects) matches a pattern that a plugin supports, then the content's type is the type that that plugin can handle.

    For example, a plugin might say that it can handle resources with <path> components that end with the four character string ".swf".

  3. Otherwise, if the specified resource has explicit Content-Type metadata, then that is the content's type.

  4. Otherwise, the content has no type and there can be no appropriate plugin for it.

The embed element has no fallback content. If the user agent can't find a suitable plugin, then the user agent must use a default plugin. (This default could be as simple as saying "Unsupported Format".)

Whether the resource is fetched successfully or not (e.g. whether the response code was a 2xx code or equivalent) must be ignored when determining the resource's type and when handing the resource to the plugin.

This allows servers to return data for plugins even with error responses (e.g. HTTP 500 Internal Server Error codes can still contain plugin data).

Any namespace-less attribute other than name, align, hspace, and vspace may be specified on the embed element, so long as its name is XML-compatible and contains no characters in the range U+0041 to U+005A (LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z). These attributes are then passed as parameters to the plugin.

All attributes in HTML documents get lowercased automatically, so the restriction on uppercase letters doesn't affect such documents.

The four exceptions are to exclude legacy attributes that have side-effects beyond just sending parameters to the plugin.

The user agent should pass the names and values of all the attributes of the embed element that have no namespace to the plugin used, when it is instantiated.

If the plugin instantiated for the embed element supports a scriptable interface, the HTMLEmbedElement object representing the element should expose that interface while the element is instantiated.

The embed element supports dimension attributes.

The IDL attributes src and type each must reflect the respective content attributes of the same name.

Here's a way to embed a resource that requires a proprietary plug-in, like Flash:

<embed src="catgame.swf">

If the user does not have the plug-in (for example if the plug-in vendor doesn't support the user's platform), then the user will be unable to use the resource.

To pass the plugin a parameter "quality" with the value "high", an attribute can be specified:

<embed src="catgame.swf" quality="high">

This would be equivalent to the following, when using an object element instead:

<object data="catgame.swf">
 <param name="quality" value="high">
</object>

4.8.4 The object element

Categories
Flow content.
Phrasing content.
Embedded content.
If the element has a usemap attribute: Interactive content.
Listed, submittable, form-associated element.
Contexts in which this element can be used:
Where embedded content is expected.
Content model:
Zero or more param elements, then, transparent.
Content attributes:
Global attributes
data
type
name
usemap
form
width
height
DOM interface:
interface HTMLObjectElement : HTMLElement {
           attribute DOMString data;
           attribute DOMString type;
           attribute DOMString name;
           attribute DOMString useMap;
  readonly attribute HTMLFormElement form;
           attribute DOMString width;
           attribute DOMString height;
  readonly attribute Document contentDocument;
  readonly attribute WindowProxy contentWindow;

  readonly attribute boolean willValidate;
  readonly attribute ValidityState validity;
  readonly attribute DOMString validationMessage;
  boolean checkValidity();
  void setCustomValidity(in DOMString error);
};

Depending on the type of content instantiated by the object element, the node also supports other interfaces.

The object element can represent an external resource, which, depending on the type of the resource, will either be treated as an image, as a nested browsing context, or as an external resource to be processed by a plugin.

The data attribute, if present, specifies the address of the resource. If present, the attribute must be a valid non-empty URL potentially surrounded by spaces.

The type attribute, if present, specifies the type of the resource. If present, the attribute must be a valid MIME type.

At least one of either the data attribute or the type attribute must be present.

The name attribute, if present, must be a valid browsing context name. The given value is used to name the nested browsing context, if applicable.

When the element is created, when it is popped off the stack of open elements of an HTML parser or XML parser, and subsequently whenever the element is inserted into a document or removed from a document; and whenever the element's Document changes whether it is fully active; and whenever an ancestor object element changes to or from showing its fallback content; and whenever the element's classid attribute is set, changed, or removed; and, when its classid attribute is not present, whenever its data attribute is set, changed, or removed; and, when neither its classid attribute nor its data attribute are present, whenever its type attribute is set, changed, or removed: the user agent must queue a task to run the following steps to (re)determine what the object element represents. The task source for this task is the DOM manipulation task source.

  1. If the user has indicated a preference that this object element's fallback content be shown instead of the element's usual behavior, then jump to the last step in the overall set of steps (fallback).

    For example, a user could ask for the element's fallback content to be shown because that content uses a format that the user finds more accessible.

  2. If the element has an ancestor media element, or has an ancestor object element that is not showing its fallback content, or if the element is not in a Document with a browsing context, or if the element's Document is not fully active, or if the element is still in the stack of open elements of an HTML parser or XML parser, then jump to the last step in the overall set of steps (fallback).

  3. If the classid attribute is present, and has a value that isn't the empty string, then: if the user agent can find a plugin suitable according to the value of the classid attribute, and plugins aren't being sandboxed, then that plugin should be used, and the value of the data attribute, if any, should be passed to the plugin. If no suitable plugin can be found, or if the plugin reports an error, jump to the last step in the overall set of steps (fallback).

  4. If the data attribute is present and its value is not the empty string, then:

    1. If the type attribute is present and its value is not a type that the user agent supports, and is not a type that the user agent can find a plugin for, then the user agent may jump to the last step in the overall set of steps (fallback) without fetching the content to examine its real type.

    2. Resolve the URL specified by the data attribute, relative to the element.

    3. If that failed, fire a simple event named error at the element, then jump to the last step in the overall set of steps (fallback).

    4. Fetch the resulting absolute URL, from the element's browsing context scope origin if it has one.

      Fetching the resource must delay the load event of the element's document until the task that is queued by the networking task source once the resource has been fetched (defined next) has been run.

      For the purposes of the application cache networking model, this fetch operation is not for a child browsing context (though it might end up being used for one after all, as defined below).

    5. If the resource is not yet available (e.g. because the resource was not available in the cache, so that loading the resource required making a request over the network), then jump to the last step in the overall set of steps (fallback). The task that is queued by the networking task source once the resource is available must restart this algorithm from this step. Resources can load incrementally; user agents may opt to consider a resource "available" whenever enough data has been obtained to begin processing the resource.

    6. If the load failed (e.g. there was an HTTP 404 error, there was a DNS error), fire a simple event named error at the element, then jump to the last step in the overall set of steps (fallback).

    7. Determine the resource type, as follows:

      1. Let the resource type be unknown.

      2. If the user agent is configured to strictly obey Content-Type headers for this resource, and the resource has associated Content-Type metadata, then let the resource type be the type specified in the resource's Content-Type metadata, and jump to the step below labeled handler.

      3. If there is a type attribute present on the object element, and that attribute's value is not a type that the user agent supports, but it is a type that a plugin supports, then let the resource type be the type specified in that type attribute, and jump to the step below labeled handler.

      4. Run the approprate set of steps from the following list:

        The resource has associated Content-Type metadata
        1. Let binary be false.

        2. If the type specified in the resource's Content-Type metadata is "text/plain", and the result of applying the rules for distinguishing if a resource is text or binary to the resource is that the resource is not text/plain, then set binary to true.

        3. If the type specified in the resource's Content-Type metadata is "application/octet-stream", then set binary to true.

        4. If binary is false, then let the resource type be the type specified in the resource's Content-Type metadata, and jump to the step below labeled handler.

        5. If there is a type attribute present on the object element, and its value is not application/octet-stream, then run the following steps:

          1. If the attribute's value is a type that a plugin supports, or the attribute's value is a type that starts with "image/" that is not also an XML MIME type, then let the resource type be the type specified in that type attribute.

          2. Jump to the step below labeled handler.

        The resource does not have associated Content-Type metadata
        1. If there is a type attribute present on the object element, then let the tentative type be the type specified in that type attribute.

          Otherwise, let tentative type be the sniffed type of the resource.

        2. If tentative type is not application/octet-stream, then let resource type be tentative type and jump to the step below labeled handler.

      5. If the <path> component of the URL of the specified resource (after any redirects) matches a pattern that a plugin supports, then let resource type be the type that that plugin can handle.

        For example, a plugin might say that it can handle resources with <path> components that end with the four character string ".swf".

      It is possible for this step to finish with resource type still being unknown, or for one of the substeps above to jump straight to the next step. In both cases, the next step will trigger fallback.

    8. Handler: Handle the content as given by the first of the following cases that matches:

      If the resource type is not a type that the user agent supports, but it is a type that a plugin supports

      If plugins are being sandboxed, jump to the last step in the overall set of steps (fallback).

      Otherwise, the user agent should use the plugin that supports resource type and pass the content of the resource to that plugin. If the plugin reports an error, then jump to the last step in the overall set of steps (fallback).

      If the resource type is an XML MIME type, or if the resource type does not start with "image/"

      The object element must be associated with a newly created nested browsing context, if it does not already have one.

      If the URL of the given resource is not about:blank, the element's nested browsing context must then be navigated to that resource, with replacement enabled, and with the object element's document's browsing context as the source browsing context. (The data attribute of the object element doesn't get updated if the browsing context gets further navigated to other locations.)

      If the URL of the given resource is about:blank, then, instead, the user agent must queue a task to fire a simple event named load at the object element.

      The object element represents the nested browsing context.

      If the name attribute is present, the browsing context name must be set to the value of this attribute; otherwise, the browsing context name must be set to the empty string.

      In certain situations, e.g. if the resource was fetched from an application cache but it is an HTML file with a manifest attribute that points to a different application cache manifest, the navigation of the browsing context will be restarted so as to load the resource afresh from the network or a different application cache. Even if the resource is then found to have a different type, it is still used as part of a nested browsing context: only the navigate algorithm is restarted, not this object algorithm.

      If the resource type starts with "image/", and support for images has not been disabled

      Apply the image sniffing rules to determine the type of the image.

      The object element represents the specified image. The image is not a nested browsing context.

      If the image cannot be rendered, e.g. because it is malformed or in an unsupported format, jump to the last step in the overall set of steps (fallback).

      Otherwise

      The given resource type is not supported. Jump to the last step in the overall set of steps (fallback).

      If the previous step ended with the resource type being unknown, this is the case that is triggered.

    9. The element's contents are not part of what the object element represents.

    10. Once the resource is completely loaded, queue a task to fire a simple event named load at the element.

      The task source for this task is the DOM manipulation task source.

  5. If the data attribute is absent but the type attribute is present, plugins aren't being sandboxed, and the user agent can find a plugin suitable according to the value of the type attribute, then that plugin should be used. If no suitable plugin can be found, or if the plugin reports an error, jump to the next step (fallback).

  6. (Fallback.) The object element represents the element's children, ignoring any leading param element children. This is the element's fallback content. If the element has an instantiated plugin, then unload it.

When the algorithm above instantiates a plugin, the user agent should pass to the plugin used the names and values of all the attributes on the element, in the order they were added to the element, with the attributes added by the parser being ordered in source order, followed by a parameter named "PARAM" whose value is null, followed by all the names and values of parameters given by param elements that are children of the object element, in tree order. If the plugin supports a scriptable interface, the HTMLObjectElement object representing the element should expose that interface. The object element represents the plugin. The plugin is not a nested browsing context.

If either:

...then the steps above must always act as if they had failed to find a plugin, even if one would otherwise have been used.

The above algorithm is independent of CSS properties (including 'display', 'overflow', and 'visibility'). For example, it runs even if the element is hidden with a 'display:none' CSS style, and does not run again if the element's visibility changes.

Due to the algorithm above, the contents of object elements act as fallback content, used only when referenced resources can't be shown (e.g. because it returned a 404 error). This allows multiple object elements to be nested inside each other, targeting multiple user agents with different capabilities, with the user agent picking the first one it supports.

Whenever the name attribute is set, if the object element has a nested browsing context, its name must be changed to the new value. If the attribute is removed, if the object element has a browsing context, the browsing context name must be set to the empty string.

The usemap attribute, if present while the object element represents an image, can indicate that the object has an associated image map. The attribute must be ignored if the object element doesn't represent an image.

The form attribute is used to explicitly associate the object element with its form owner.

Constraint validation: object elements are always barred from constraint validation.

The object element supports dimension attributes.

The IDL attributes data, type, name, and useMap each must reflect the respective content attributes of the same name.

The contentDocument IDL attribute must return the Document object of the active document of the object element's nested browsing context, if it has one; otherwise, it must return null.

The contentWindow IDL attribute must return the WindowProxy object of the object element's nested browsing context, if it has one; otherwise, it must return null.

The willValidate, validity, and validationMessage attributes, and the checkValidity() and setCustomValidity() methods, are part of the constraint validation API. The form IDL attribute is part of the element's forms API.

In the following example, a Java applet is embedded in a page using the object element. (Generally speaking, it is better to avoid using applets like these and instead use native JavaScript and HTML to provide the functionality, since that way the application will work on all Web browsers without requiring a third-party plugin. Many devices, especially embedded devices, do not support third-party technologies like Java.)

<figure>
 <object type="application/x-java-applet">
  <param name="code" value="MyJavaClass">
  <p>You do not have Java available, or it is disabled.</p>
 </object>
 <figcaption>My Java Clock</figcaption>
</figure>

In this example, an HTML page is embedded in another using the object element.

<figure>
 <object data="clock.html"></object>
 <figcaption>My HTML Clock</figcaption>
</figure>

The following example shows how a plugin can be used in HTML (in this case the Flash plugin, to show a video file). Fallback is provided for users who do not have Flash enabled, in this case using the video element to show the video for those using user agents that support video, and finally providing a link to the video for those who have neither Flash nor a video-capable browser.

<p>Look at my video:
 <object type="application/x-shockwave-flash">
  <param name=movie value="http://video.example.com/library/watch.swf">
  <param name=allowfullscreen value=true>
  <param name=flashvars value="http://video.example.com/vids/315981">
  <video controls src="http://video.example.com/vids/315981">
   <a href="http://video.example.com/vids/315981">View video</a>.
  </video>
 </object>
</p>

4.8.5 The param element

Categories
None.
Contexts in which this element can be used:
As a child of an object element, before any flow content.
Content model:
Empty.
Content attributes:
Global attributes
name
value
DOM interface:
interface HTMLParamElement : HTMLElement {
           attribute DOMString name;
           attribute DOMString value;
};

The param element defines parameters for plugins invoked by object elements. It does not represent anything on its own.

The name attribute gives the name of the parameter.

The value attribute gives the value of the parameter.

Both attributes must be present. They may have any value.

If both attributes are present, and if the parent element of the param is an object element, then the element defines a parameter with the given name/value pair.

If either the name or value of a parameter defined by a param element that is the child of an object element that represents an instantiated plugin changes, and if that plugin is communicating with the user agent using an API that features the ability to update the plugin when the name or value of a parameter so changes, then the user agent must appropriately exercise that ability to notify the plugin of the change.

The IDL attributes name and value must both reflect the respective content attributes of the same name.

The following example shows how the param element can be used to pass a parameter to a plugin, in this case the O3D plugin.

<!DOCTYPE HTML>
<html lang="en">
  <head>
   <title>O3D Utah Teapot</title>
  </head>
  <body>
   <p>
    <object type="application/vnd.o3d.auto">
     <param name="o3d_features" value="FloatingPointTextures">
     <img src="o3d-teapot.png"
          title="3D Utah Teapot illustration rendered using O3D."
          alt="When O3D renders the Utah Teapot, it appears as a squat
          teapot with a shiny metallic finish on which the
          surroundings are reflected, with a faint shadow caused by
          the lighting.">
     <p>To see the teapot actually rendered by O3D on your
     computer, please download and install the <a
     href="http://code.google.com/apis/o3d/docs/gettingstarted.html#install">O3D plugin</a>.</p>
    </object>
    <script src="o3d-teapot.js"></script>
   </p>
  </body>
</html>

4.8.6 The video element

Categories
Flow content.
Phrasing content.
Embedded content.
If the element has a controls attribute: Interactive content.
Contexts in which this element can be used:
Where embedded content is expected.
Content model:
If the element has a src attribute: zero or more track elements, then transparent, but with no media element descendants.
If the element does not have a src attribute: zero or more source elements, then zero or more track elements, then transparent, but with no media element descendants.
Content attributes:
Global attributes
src
poster
preload
autoplay
mediagroup
loop
muted
controls
width
height
DOM interface:
interface HTMLVideoElement : HTMLMediaElement {
           attribute unsigned long width;
           attribute unsigned long height;
  readonly attribute unsigned long videoWidth;
  readonly attribute unsigned long videoHeight;
           attribute DOMString poster;
};

A video element is used for playing videos or movies, and audio files with captions.

Content may be provided inside the video element. User agents should not show this content to the user; it is intended for older Web browsers which do not support video, so that legacy video plugins can be tried, or to show text to the users of these older browsers informing them of how to access the video contents.

In particular, this content is not intended to address accessibility concerns. To make video content accessible to the blind, deaf, and those with other physical or cognitive disabilities, authors are expected to provide alternative media streams and/or to embed accessibility aids (such as caption or subtitle tracks, audio description tracks, or sign-language overlays) into their media streams.

The video element is a media element whose media data is ostensibly video data, possibly with associated audio data.

The src, preload, autoplay, mediagroup, loop, muted, and controls attributes are the attributes common to all media elements.

The poster attribute gives the address of an image file that the user agent can show while no video data is available. The attribute, if present, must contain a valid non-empty URL potentially surrounded by spaces.

If the specified resource is to be used, then, when the element is created or when the poster attribute is set, changed, or removed, the user agent must run the following steps to determine the element's poster frame:

  1. If there is an existing instance of this algorithm running for this video element, abort that instance of this algorithm without changing the poster frame.

  2. If the poster attribute's value is the empty string or if the attribute is absent, then there is no poster frame; abort these steps.

  3. Resolve the poster attribute's value relative to the element. If this fails, then there is no poster frame; abort these steps.

  4. Fetch the resulting absolute URL, from the element's Document's origin. This must delay the load event of the element's document.

  5. If an image is thus obtained, the poster frame is that image. Otherwise, there is no poster frame.

The image given by the poster attribute, the poster frame, is intended to be a representative frame of the video (typically one of the first non-blank frames) that gives the user an idea of what the video is like.


When no video data is available (the element's readyState attribute is either HAVE_NOTHING, or HAVE_METADATA but no video data has yet been obtained at all, or the element's readyState attribute is any subsequent value but the media resource does not have a video channel), the video element represents either the poster frame, or nothing.

When a video element is paused and the current playback position is the first frame of video, the element represents either the frame of video corresponding to the current playback position or the poster frame, at the discretion of the user agent.

Notwithstanding the above, the poster frame should be preferred over nothing, but the poster frame should not be shown again after a frame of video has been shown.

When a video element is paused at any other position, and the media resource has a video channel, the element represents the frame of video corresponding to the current playback position, or, if that is not yet available (e.g. because the video is seeking or buffering), the last frame of the video to have been rendered.

When a video element whose media resource has a video channel is potentially playing, it represents the frame of video at the continuously increasing "current" position. When the current playback position changes such that the last frame rendered is no longer the frame corresponding to the current playback position in the video, the new frame must be rendered. Similarly, any audio associated with the media resource must, if played, be played synchronized with the current playback position, at the element's effective media volume.

When a video element whose media resource has a video channel is neither potentially playing nor paused (e.g. when seeking or stalled), the element represents the last frame of the video to have been rendered.

Which frame in a video stream corresponds to a particular playback position is defined by the video stream's format.

The video element also represents any text track cues whose text track cue active flag is set and whose text track is in the showing or showing by default modes.

In addition to the above, the user agent may provide messages to the user (such as "buffering", "no video loaded", "error", or more detailed information) by overlaying text or icons on the video or other areas of the element's playback area, or in another appropriate manner.

User agents that cannot render the video may instead make the element represent a link to an external video playback utility or to the video data itself.


video . videoWidth
video . videoHeight

These attributes return the intrinsic dimensions of the video, or zero if the dimensions are not known.

The intrinsic width and intrinsic height of the media resource are the dimensions of the resource in CSS pixels after taking into account the resource's dimensions, aspect ratio, clean aperture, resolution, and so forth, as defined for the format used by the resource. If an anamorphic format does not define how to apply the aspect ratio to the video data's dimensions to obtain the "correct" dimensions, then the user agent must apply the ratio by increasing one dimension and leaving the other unchanged.

The videoWidth IDL attribute must return the intrinsic width of the video in CSS pixels. The videoHeight IDL attribute must return the intrinsic height of the video in CSS pixels. If the element's readyState attribute is HAVE_NOTHING, then the attributes must return 0.

The video element supports dimension attributes.

In the absence of style rules to the contrary, video content should be rendered inside the element's playback area such that the video content is shown centered in the playback area at the largest possible size that fits completely within it, with the video content's aspect ratio being preserved. Thus, if the aspect ratio of the playback area does not match the aspect ratio of the video, the video will be shown letterboxed or pillarboxed. Areas of the element's playback area that do not contain the video represent nothing.

In user agents that implement CSS, the above requirement can be implemented by using the style rule suggested in the rendering section.

The intrinsic width of a video element's playback area is the intrinsic width of the video resource, if that is available; otherwise it is the intrinsic width of the poster frame, if that is available; otherwise it is 300 CSS pixels.

The intrinsic height of a video element's playback area is the intrinsic height of the video resource, if that is available; otherwise it is the intrinsic height of the poster frame, if that is available; otherwise it is 150 CSS pixels.


User agents should provide controls to enable or disable the display of closed captions, audio description tracks, and other additional data associated with the video stream, though such features should, again, not interfere with the page's normal rendering.

User agents may allow users to view the video content in manners more suitable to the user (e.g. full-screen or in an independent resizable window). As for the other user interface features, controls to enable this should not interfere with the page's normal rendering unless the user agent is exposing a user interface. In such an independent context, however, user agents may make full user interfaces visible, with, e.g., play, pause, seeking, and volume controls, even if the controls attribute is absent.

User agents may allow video playback to affect system features that could interfere with the user's experience; for example, user agents could disable screensavers while video playback is in progress.


The poster IDL attribute must reflect the poster content attribute.

This example shows how to detect when a video has failed to play correctly:

<script>
 function failed(e) {
   // video playback failed - show a message saying why
   switch (e.target.error.code) {
     case e.target.error.MEDIA_ERR_ABORTED:
       alert('You aborted the video playback.');
       break;
     case e.target.error.MEDIA_ERR_NETWORK:
       alert('A network error caused the video download to fail part-way.');
       break;
     case e.target.error.MEDIA_ERR_DECODE:
       alert('The video playback was aborted due to a corruption problem or because the video used features your browser did not support.');
       break;
     case e.target.error.MEDIA_ERR_SRC_NOT_SUPPORTED:
       alert('The video could not be loaded, either because the server or network failed or because the format is not supported.');
       break;
     default:
       alert('An unknown error occurred.');
       break;
   }
 }
</script>
<p><video src="tgif.vid" autoplay controls onerror="failed(event)"></video></p>
<p><a href="tgif.vid">Download the video file</a>.</p>

4.8.7 The audio element

Categories
Flow content.
Phrasing content.
Embedded content.
If the element has a controls attribute: Interactive content.
Contexts in which this element can be used:
Where embedded content is expected.
Content model:
If the element has a src attribute: zero or more track elements, then transparent, but with no media element descendants.
If the element does not have a src attribute: one or more source elements, then zero or more track elements, then transparent, but with no media element descendants.
Content attributes:
Global attributes
src
preload
autoplay
mediagroup
loop
muted
controls
DOM interface:
[NamedConstructor=Audio(),
 NamedConstructor=Audio(in DOMString src)]
interface HTMLAudioElement : HTMLMediaElement {};

An audio element represents a sound or audio stream.

Content may be provided inside the audio element. User agents should not show this content to the user; it is intended for older Web browsers which do not support audio, so that legacy audio plugins can be tried, or to show text to the users of these older browsers informing them of how to access the audio contents.

In particular, this content is not intended to address accessibility concerns. To make audio content accessible to the deaf or to those with other physical or cognitive disabilities, authors are expected to provide alternative media streams and/or to embed accessibility aids (such as transcriptions) into their media streams.

The audio element is a media element whose media data is ostensibly audio data.

The src, preload, autoplay, mediagroup, loop, muted, and controls attributes are the attributes common to all media elements.

When an audio element is potentially playing, it must have its audio data played synchronized with the current playback position, at the element's effective media volume.

When an audio element is not potentially playing, audio must not play for the element.

audio = new Audio( [ url ] )

Returns a new audio element, with the src attribute set to the value passed in the argument, if applicable.

Two constructors are provided for creating HTMLAudioElement objects (in addition to the factory methods from DOM Core such as createElement()): Audio() and Audio(src). When invoked as constructors, these must return a new HTMLAudioElement object (a new audio element). The element must have its preload attribute set to the literal value "auto". If the src argument is present, the object created must have its src content attribute set to the provided value, and the user agent must invoke the object's resource selection algorithm before returning. The element's document must be the active document of the browsing context of the Window object on which the interface object of the invoked constructor is found.

4.8.8 The source element

Categories
None.
Contexts in which this element can be used:
As a child of a media element, before any flow content or track elements.
Content model:
Empty.
Content attributes:
Global attributes
src
type
media
DOM interface:
interface HTMLSourceElement : HTMLElement {
           attribute DOMString src;
           attribute DOMString type;
           attribute DOMString media;
};

The source element allows authors to specify multiple alternative media resources for media elements. It does not represent anything on its own.

The src attribute gives the address of the media resource. The value must be a valid non-empty URL potentially surrounded by spaces. This attribute must be present.

Dynamically modifying a source element and its attribute when the element is already inserted in a video or audio element will have no effect. To change what is playing, either just use the src attribute on the media element directly, or call the load() method on the media element after manipulating the source elements.

The type attribute gives the type of the media resource, to help the user agent determine if it can play this media resource before fetching it. If specified, its value must be a valid MIME type. The codecs parameter, which certain MIME types define, might be necessary to specify exactly how the resource is encoded. [RFC4281]

The following list shows some examples of how to use the codecs= MIME parameter in the type attribute.

H.264 Constrained baseline profile video (main and extended video compatible) level 3 and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
H.264 Extended profile video (baseline-compatible) level 3 and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="avc1.58A01E, mp4a.40.2"'>
H.264 Main profile video level 3 and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="avc1.4D401E, mp4a.40.2"'>
H.264 'High' profile video (incompatible with main, baseline, or extended profiles) level 3 and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="avc1.64001E, mp4a.40.2"'>
MPEG-4 Visual Simple Profile Level 0 video and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="mp4v.20.8, mp4a.40.2"'>
MPEG-4 Advanced Simple Profile Level 0 video and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="mp4v.20.240, mp4a.40.2"'>
MPEG-4 Visual Simple Profile Level 0 video and AMR audio in 3GPP container
<source src='video.3gp' type='video/3gpp; codecs="mp4v.20.8, samr"'>
Theora video and Vorbis audio in Ogg container
<source src='video.ogv' type='video/ogg; codecs="theora, vorbis"'>
Theora video and Speex audio in Ogg container
<source src='video.ogv' type='video/ogg; codecs="theora, speex"'>
Vorbis audio alone in Ogg container
<source src='audio.ogg' type='audio/ogg; codecs=vorbis'>
Speex audio alone in Ogg container
<source src='audio.spx' type='audio/ogg; codecs=speex'>
FLAC audio alone in Ogg container
<source src='audio.oga' type='audio/ogg; codecs=flac'>
Dirac video and Vorbis audio in Ogg container
<source src='video.ogv' type='video/ogg; codecs="dirac, vorbis"'>

The media attribute gives the intended media type of the media resource, to help the user agent determine if this media resource is useful to the user before fetching it. Its value must be a valid media query.

The default, if the media attribute is omitted, is "all", meaning that by default the media resource is suitable for all media.

If a source element is inserted as a child of a media element that has no src attribute and whose networkState has the value NETWORK_EMPTY, the user agent must invoke the media element's resource selection algorithm.

The IDL attributes src, type, and media must reflect the respective content attributes of the same name.

If the author isn't sure if the user agents will all be able to render the media resources provided, the author can listen to the error event on the last source element and trigger fallback behavior:

<script>
 function fallback(video) {
   // replace <video> with its contents
   while (video.hasChildNodes()) {
     if (video.firstChild instanceof HTMLSourceElement)
       video.removeChild(video.firstChild);
     else
       video.parentNode.insertBefore(video.firstChild, video);
   }
   video.parentNode.removeChild(video);
 }
</script>
<video controls autoplay>
 <source src='video.mp4' type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
 <source src='video.ogv' type='video/ogg; codecs="theora, vorbis"'
         onerror="fallback(parentNode)">
 ...
</video>

4.8.9 The track element

Categories
None.
Contexts in which this element can be used:
As a child of a media element, before any flow content.
Content model:
Empty.
Content attributes:
Global attributes
kind
src
srclang
label
default
DOM interface:
interface HTMLTrackElement : HTMLElement {
           attribute DOMString kind;
           attribute DOMString src;
           attribute DOMString srclang;
           attribute DOMString label;
           attribute boolean default;

  readonly attribute TextTrack track;
};

The track element allows authors to specify explicit external timed text tracks for media elements. It does not represent anything on its own.

The kind attribute is an enumerated attribute. The following table lists the keywords defined for this attribute. The keyword given in the first cell of each row maps to the state given in the second cell.

Keyword State Brief description
subtitles Subtitles Transcription or translation of the dialogue, suitable for when the sound is available but not understood (e.g. because the user does not understand the language of the media resource's soundtrack). Displayed over the video.
captions Captions Transcription or translation of the dialogue, sound effects, relevant musical cues, and other relevant audio information, suitable for when the soundtrack is unavailable (e.g. because it is muted or because the user is deaf). Displayed over the video; labeled as appropriate for the hard-of-hearing.
descriptions Descriptions Textual descriptions of the video component of the media resource, intended for audio synthesis when the visual component is unavailable (e.g. because the user is interacting with the application without a screen while driving, or because the user is blind). Synthesized as separate audio track.
chapters Chapters Chapter titles, intended to be used for navigating the media resource. Displayed as an interactive list in the user agent's interface.
metadata Metadata Tracks intended for use from script. Not displayed by the user agent.

The attribute may be omitted. The missing value default is the subtitles state.

The src attribute gives the address of the text track data. The value must be a valid non-empty URL potentially surrounded by spaces. This attribute must be present.

If the element has a src attribute whose value is not the empty string and whose value, when the attribute was set, could be successfully resolved relative to the element, then the element's track URL is the resulting absolute URL. Otherwise, the element's track URL is the empty string.

The srclang attribute gives the language of the text track data. The value must be a valid BCP 47 language tag. This attribute must be present if the element's kind attribute is in the subtitles state. [BCP47]

If the element has a srclang attribute whose value is not the empty string, then the element's track language is the value of the attribute. Otherwise, the element has no track language.

The label attribute gives a user-readable title for the track. This title is used by user agents when listing subtitle, caption, and audio description tracks in their user interface.

The value of the label attribute, if the attribute is present, must not be the empty string. Furthermore, there must not be two track element children of the same media element whose kind attributes are in the same state, whose srclang attributes are both missing or have values that represent the same language, and whose label attributes are again both missing or both have the same value.

If the element has a label attribute whose value is not the empty string, then the element's track label is the value of the attribute. Otherwise, the element's track label is a user-agent defined string (e.g. the string "untitled" in the user's locale, or a value automatically generated from the other attributes).

The default attribute, if specified, indicates that the track is to be enabled if the user's preferences do not indicate that another track would be more appropriate. There must not be more than one track element with the same parent node with the default attribute specified.

track . track

Returns the TextTrack object corresponding to the text track of the track element.

The track IDL attribute must, on getting, return the track element's text track's corresponding TextTrack object.

The src, srclang, label, and default IDL attributes must reflect the respective content attributes of the same name. The kind IDL attribute must reflect the content attribute of the same name, limited to only known values.

This video has subtitles in several languages:

<video src="brave.webm">
 <track kind=subtitles src=brave.en.vtt srclang=en label="English">
 <track kind=captions src=brave.en.vtt srclang=en label="English for the Hard of Hearing">
 <track kind=subtitles src=brave.fr.vtt srclang=fr label="Français">
 <track kind=subtitles src=brave.de.vtt srclang=de label="Deutsch">
</video>

4.8.10 Media elements

Media elements (audio and video, in this specification) implement the following interface:

interface HTMLMediaElement : HTMLElement {

  // error state
  readonly attribute MediaError error;

  // network state
           attribute DOMString src;
  readonly attribute DOMString currentSrc;
  const unsigned short NETWORK_EMPTY = 0;
  const unsigned short NETWORK_IDLE = 1;
  const unsigned short NETWORK_LOADING = 2;
  const unsigned short NETWORK_NO_SOURCE = 3;
  readonly attribute unsigned short networkState;
           attribute DOMString preload;
  readonly attribute TimeRanges buffered;
  void load();
  DOMString canPlayType(in DOMString type);

  // ready state
  const unsigned short HAVE_NOTHING = 0;
  const unsigned short HAVE_METADATA = 1;
  const unsigned short HAVE_CURRENT_DATA = 2;
  const unsigned short HAVE_FUTURE_DATA = 3;
  const unsigned short HAVE_ENOUGH_DATA = 4;
  readonly attribute unsigned short readyState;
  readonly attribute boolean seeking;

  // playback state
           attribute double currentTime;
  readonly attribute double initialTime;
  readonly attribute double duration;
  readonly attribute Date startOffsetTime;
  readonly attribute boolean paused;
           attribute double defaultPlaybackRate;
           attribute double playbackRate;
  readonly attribute TimeRanges played;
  readonly attribute TimeRanges seekable;
  readonly attribute boolean ended;
           attribute boolean autoplay;
           attribute boolean loop;
  void play();
  void pause();

  // media controller
           attribute DOMString mediaGroup;
           attribute MediaController controller;

  // controls
           attribute boolean controls;
           attribute double volume;
           attribute boolean muted;
           attribute boolean defaultMuted;

  // tracks
  readonly attribute MultipleTrackList audioTracks;
  readonly attribute ExclusiveTrackList videoTracks;
  readonly attribute TextTrack[] textTracks;
  MutableTextTrack addTextTrack(in DOMString kind, in optional DOMString label, in optional DOMString language);
};

The media element attributes, src, preload, autoplay, mediagroup, loop, muted, and controls, apply to all media elements. They are defined in this section.

Media elements are used to present audio data, or video and audio data, to the user. This is referred to as media data in this section, since this section applies equally to media elements for audio or for video. The term media resource is used to refer to the complete set of media data, e.g. the complete video file, or complete audio file.

A media resource can have multiple audio and video tracks. For the purposes of a media element, the video data of the media resource is only that of the currently selected track (if any) given by the element's videoTracks attribute, and the audio data of the media resource is the result of mixing all the currently enabled tracks (if any) given by the element's audioTracks attribute.

Both audio and video elements can be used for both audio and video. The main difference between the two is simply that the audio element has no playback area for visual content (such as video or captions), whereas the video element does.

Except where otherwise specified, the task source for all the tasks queued in this section and its subsections is the media element event task source.

4.8.10.1 Error codes
media . error

Returns a MediaError object representing the current error state of the element.

Returns null if there is no error.

All media elements have an associated error status, which records the last error the element encountered since its resource selection algorithm was last invoked. The error attribute, on getting, must return the MediaError object created for this last error, or null if there has not been an error.

interface MediaError {
  const unsigned short MEDIA_ERR_ABORTED = 1;
  const unsigned short MEDIA_ERR_NETWORK = 2;
  const unsigned short MEDIA_ERR_DECODE = 3;
  const unsigned short MEDIA_ERR_SRC_NOT_SUPPORTED = 4;
  readonly attribute unsigned short code;
};
media . error . code

Returns the current error's error code, from the list below.

The code attribute of a MediaError object must return the code for the error, which must be one of the following:

MEDIA_ERR_ABORTED (numeric value 1)
The fetching process for the media resource was aborted by the user agent at the user's request.
MEDIA_ERR_NETWORK (numeric value 2)
A network error of some description caused the user agent to stop fetching the media resource, after the resource was established to be usable.
MEDIA_ERR_DECODE (numeric value 3)
An error of some description occurred while decoding the media resource, after the resource was established to be usable.
MEDIA_ERR_SRC_NOT_SUPPORTED (numeric value 4)
The media resource indicated by the src attribute was not suitable.
4.8.10.2 Location of the media resource

The src content attribute on media elements gives the address of the media resource (video, audio) to show. The attribute, if present, must contain a valid non-empty URL potentially surrounded by spaces.

If a src attribute of a media element is set or changed, the user agent must invoke the media element's media element load algorithm. (Removing the src attribute does not do this, even if there are source elements present.)

The src IDL attribute on media elements must reflect the content attribute of the same name.

media . currentSrc

Returns the address of the current media resource.

Returns the empty string when there is no media resource.

The currentSrc IDL attribute is initially the empty string. Its value is changed by the resource selection algorithm defined below.

There are two ways to specify a media resource, the src attribute, or source elements. The attribute overrides the elements.

4.8.10.3 MIME types

A media resource can be described in terms of its type, specifically a MIME type, in some cases with a codecs parameter. (Whether the codecs parameter is allowed or not depends on the MIME type.) [RFC4281]

Types are usually somewhat incomplete descriptions; for example "video/mpeg" doesn't say anything except what the container type is, and even a type like "video/mp4; codecs="avc1.42E01E, mp4a.40.2"" doesn't include information like the actual bitrate (only the maximum bitrate). Thus, given a type, a user agent can often only know whether it might be able to play media of that type (with varying levels of confidence), or whether it definitely cannot play media of that type.

A type that the user agent knows it cannot render is one that describes a resource that the user agent definitely does not support, for example because it doesn't recognize the container type, or it doesn't support the listed codecs.

The MIME type "application/octet-stream" with no parameters is never a type that the user agent knows it cannot render. User agents must treat that type as equivalent to the lack of any explicit Content-Type metadata when it is used to label a potential media resource.

"application/octet-stream" is special-cased here; if any parameter appears with it, it should be treated just like any other MIME type. This is a deviation from the rule that unknown MIME type parameters should be ignored.

media . canPlayType(type)

Returns the empty string (a negative response), "maybe", or "probably" based on how confident the user agent is that it can play media resources of the given type.

The canPlayType(type) method must return the empty string if type is a type that the user agent knows it cannot render or is the type "application/octet-stream"; it must return "probably" if the user agent is confident that the type represents a media resource that it can render if used in with this audio or video element; and it must return "maybe" otherwise. Implementors are encouraged to return "maybe" unless the type can be confidently established as being supported or not. Generally, a user agent should never return "probably" for a type that allows the codecs parameter if that parameter is not present.

This script tests to see if the user agent supports a (fictional) new format to dynamically decide whether to use a video element or a plugin:

<section id="video">
 <p><a href="playing-cats.nfv">Download video</a></p>
</section>
<script>
 var videoSection = document.getElementById('video');
 var videoElement = document.createElement('video');
 var support = videoElement.canPlayType('video/x-new-fictional-format;codecs="kittens,bunnies"');
 if (support != "probably" && "New Fictional Video Plug-in" in navigator.plugins) {
   // not confident of browser support
   // but we have a plugin
   // so use plugin instead
   videoElement = document.createElement("embed");
 } else if (support == "") {
   // no support from browser and no plugin
   // do nothing
   videoElement = null;
 }
 if (videoElement) {
   while (videoSection.hasChildNodes())
     videoSection.removeChild(videoSection.firstChild);
   videoElement.setAttribute("src", "playing-cats.nfv");
   videoSection.appendChild(videoElement);
 }
</script>

The type attribute of the source element allows the user agent to avoid downloading resources that use formats it cannot render.

4.8.10.4 Network states
media . networkState

Returns the current state of network activity for the element, from the codes in the list below.

As media elements interact with the network, their current network activity is represented by the networkState attribute. On getting, it must return the current network state of the element, which must be one of the following values:

NETWORK_EMPTY (numeric value 0)
The element has not yet been initialized. All attributes are in their initial states.
NETWORK_IDLE (numeric value 1)
The element's resource selection algorithm is active and has selected a resource, but it is not actually using the network at this time.
NETWORK_LOADING (numeric value 2)
The user agent is actively trying to download data.
NETWORK_NO_SOURCE (numeric value 3)
The element's resource selection algorithm is active, but it has not yet found a resource to use.

The resource selection algorithm defined below describes exactly when the networkState attribute changes value and what events fire to indicate changes in this state.

4.8.10.5 Loading the media resource
media . load()

Causes the element to reset and start selecting and loading a new media resource from scratch.

All media elements have an autoplaying flag, which must begin in the true state, and a delaying-the-load-event flag, which must begin in the false state. While the delaying-the-load-event flag is true, the element must delay the load event of its document.

When the load() method on a media element is invoked, the user agent must run the media element load algorithm.

The media element load algorithm consists of the following steps.

  1. Abort any already-running instance of the resource selection algorithm for this element.

  2. If there are any tasks from the media element's media element event task source in one of the task queues, then remove those tasks.

    Basically, pending events and callbacks for the media element are discarded when the media element starts loading a new resource.

  3. If the media element's networkState is set to NETWORK_LOADING or NETWORK_IDLE, queue a task to fire a simple event named abort at the media element.

  4. If the media element's networkState is not set to NETWORK_EMPTY, then run these substeps:

    1. Queue a task to fire a simple event named emptied at the media element.

    2. If a fetching process is in progress for the media element, the user agent should stop it.

    3. Set the networkState attribute to NETWORK_EMPTY.

    4. Forget the media element's media-resource-specific text tracks.

    5. If readyState is not set to HAVE_NOTHING, then set it to that state.

    6. If the paused attribute is false, then set it to true.

    7. If seeking is true, set it to false.

    8. Set the current playback position to 0.

      If this changed the current playback position, then queue a task to fire a simple event named timeupdate at the media element.

    9. Set the initial playback position to 0.

    10. Set the timeline offset to Not-a-Number (NaN).

    11. Update the duration attribute to Not-a-Number (NaN).

      The user agent will not fire a durationchange event for this particular change of the duration.

  5. Set the playbackRate attribute to the value of the defaultPlaybackRate attribute.

  6. Set the error attribute to null and the autoplaying flag to true.

  7. Invoke the media element's resource selection algorithm.

  8. Playback of any previously playing media resource for this element stops.

The resource selection algorithm for a media element is as follows. This algorithm is always invoked synchronously, but one of the first steps in the algorithm is to return and continue running the remaining steps asynchronously, meaning that it runs in the background with scripts and other tasks running in parallel. In addition, this algorithm interacts closely with the event loop mechanism; in particular, it has synchronous sections (which are triggered as part of the event loop algorithm). Steps in such sections are marked with ⌛.

  1. Set the networkState to NETWORK_NO_SOURCE.

  2. Asynchronously await a stable state, allowing the task that invoked this algorithm to continue. The synchronous section consists of all the remaining steps of this algorithm until the algorithm says the synchronous section has ended. (Steps in synchronous sections are marked with ⌛.)

  3. ⌛ If the media element has a src attribute, then let mode be attribute.

    ⌛ Otherwise, if the media element does not have a src attribute but has a source element child, then let mode be children and let candidate be the first such source element child in tree order.

    ⌛ Otherwise the media element has neither a src attribute nor a source element child: set the networkState to NETWORK_EMPTY, and abort these steps; the synchronous section ends.

  4. ⌛ Set the media element's delaying-the-load-event flag to true (this delays the load event), and set its networkState to NETWORK_LOADING.

  5. Queue a task to fire a simple event named loadstart at the media element.

  6. If mode is attribute, then run these substeps:

    1. Process candidate: If the src attribute's value is the empty string, then end the synchronous section, and jump down to the failed step below.

    2. ⌛ Let absolute URL be the absolute URL that would have resulted from resolving the URL specified by the src attribute's value relative to the media element when the src attribute was last changed.

    3. ⌛ If absolute URL was obtained successfully, set the currentSrc attribute to absolute URL.

    4. End the synchronous section, continuing the remaining steps asynchronously.

    5. If absolute URL was obtained successfully, run the resource fetch algorithm with absolute URL. If that algorithm returns without aborting this one, then the load failed.

    6. Failed: Reaching this step indicates that the media resource failed to load or that the given URL could not be resolved. In one atomic operation, run the following steps:

      1. Set the error attribute to a new MediaError object whose code attribute is set to MEDIA_ERR_SRC_NOT_SUPPORTED.

      2. Forget the media element's media-resource-specific text tracks.

      3. Set the element's networkState attribute to the NETWORK_NO_SOURCE value.

    7. Queue a task to fire a simple event named error at the media element.

    8. Set the element's delaying-the-load-event flag to false. This stops delaying the load event.

    9. Abort these steps. Until the load() method is invoked or the src attribute is changed, the element won't attempt to load another resource.

    Otherwise, the source elements will be used; run these substeps:

    1. ⌛ Let pointer be a position defined by two adjacent nodes in the media element's child list, treating the start of the list (before the first child in the list, if any) and end of the list (after the last child in the list, if any) as nodes in their own right. One node is the node before pointer, and the other node is the node after pointer. Initially, let pointer be the position between the candidate node and the next node, if there are any, or the end of the list, if it is the last node.

      As nodes are inserted and removed into the media element, pointer must be updated as follows:

      If a new node is inserted between the two nodes that define pointer
      Let pointer be the point between the node before pointer and the new node. In other words, insertions at pointer go after pointer.
      If the node before pointer is removed
      Let pointer be the point between the node after pointer and the node before the node after pointer. In other words, pointer doesn't move relative to the remaining nodes.
      If the node after pointer is removed
      Let pointer be the point between the node before pointer and the node after the node before pointer. Just as with the previous case, pointer doesn't move relative to the remaining nodes.

      Other changes don't affect pointer.

    2. Process candidate: If candidate does not have a src attribute, or if its src attribute's value is the empty string, then end the synchronous section, and jump down to the failed step below.

    3. ⌛ Let absolute URL be the absolute URL that would have resulted from resolving the URL specified by candidate's src attribute's value relative to the candidate when the src attribute was last changed.

    4. ⌛ If absolute URL was not obtained successfully, then end the synchronous section, and jump down to the failed step below.

    5. ⌛ If candidate has a type attribute whose value, when parsed as a MIME type (including any codecs described by the codecs parameter, for types that define that parameter), represents a type that the user agent knows it cannot render, then end the synchronous section, and jump down to the failed step below.

    6. ⌛ If candidate has a media attribute whose value does not match the environment, then end the synchronous section, and jump down to the failed step below.

    7. ⌛ Set the currentSrc attribute to absolute URL.

    8. End the synchronous section, continuing the remaining steps asynchronously.

    9. Run the resource fetch algorithm with absolute URL. If that algorithm returns without aborting this one, then the load failed.

    10. Failed: Queue a task to fire a simple event named error at the candidate element, in the context of the fetching process that was used to try to obtain candidate's corresponding media resource in the resource fetch algorithm.

    11. Asynchronously await a stable state. The synchronous section consists of all the remaining steps of this algorithm until the algorithm says the synchronous section has ended. (Steps in synchronous sections are marked with ⌛.)

    12. Forget the media element's media-resource-specific text tracks.

    13. Find next candidate: Let candidate be null.

    14. Search loop: If the node after pointer is the end of the list, then jump to the waiting step below.

    15. ⌛ If the node after pointer is a source element, let candidate be that element.

    16. ⌛ Advance pointer so that the node before pointer is now the node that was after pointer, and the node after pointer is the node after the node that used to be after pointer, if any.

    17. ⌛ If candidate is null, jump back to the search loop step. Otherwise, jump back to the process candidate step.

    18. Waiting: Set the element's networkState attribute to the NETWORK_NO_SOURCE value.

    19. ⌛ Set the element's delaying-the-load-event flag to false. This stops delaying the load event.

    20. End the synchronous section, continuing the remaining steps asynchronously.

    21. Wait until the node after pointer is a node other than the end of the list. (This step might wait forever.)

    22. Asynchronously await a stable state. The synchronous section consists of all the remaining steps of this algorithm until the algorithm says the synchronous section has ended. (Steps in synchronous sections are marked with ⌛.)

    23. ⌛ Set the element's delaying-the-load-event flag back to true (this delays the load event again, in case it hasn't been fired yet).

    24. ⌛ Set the networkState back to NETWORK_LOADING.

    25. ⌛ Jump back to the find next candidate step above.

The resource fetch algorithm for a media element and a given absolute URL is as follows:

  1. Let the current media resource be the resource given by the absolute URL passed to this algorithm. This is now the element's media resource.

  2. Begin to fetch the current media resource, from the media element's Document's origin, with the force same-origin flag set.

    Every 350ms (±200ms) or for every byte received, whichever is least frequent, queue a task to fire a simple event named progress at the element.

    The stall timeout is a user-agent defined length of time, which should be about three seconds. When a media element that is actively attempting to obtain media data has failed to receive any data for a duration equal to the stall timeout, the user agent must queue a task to fire a simple event named stalled at the element.

    User agents may allow users to selectively block or slow media data downloads. When a media element's download has been blocked altogether, the user agent must act as if it was stalled (as opposed to acting as if the connection was closed). The rate of the download may also be throttled automatically by the user agent, e.g. to balance the download with other connections sharing the same bandwidth.

    User agents may decide to not download more content at any time, e.g. after buffering five minutes of a one hour media resource, while waiting for the user to decide whether to play the resource or not, or while waiting for user input in an interactive resource. When a media element's download has been suspended, the user agent must set the networkState to NETWORK_IDLE and queue a task to fire a simple event named suspend at the element. If and when downloading of the resource resumes, the user agent must set the networkState to NETWORK_LOADING.

    The preload attribute provides a hint regarding how much buffering the author thinks is advisable, even in the absence of the autoplay attribute.

    When a user agent decides to completely stall a download, e.g. if it is waiting until the user starts playback before downloading any further content, the element's delaying-the-load-event flag must be set to false. This stops delaying the load event.

    The user agent may use whatever means necessary to fetch the resource (within the constraints put forward by this and other specifications); for example, reconnecting to the server in the face of network errors, using HTTP range retrieval requests, or switching to a streaming protocol. The user agent must consider a resource erroneous only if it has given up trying to fetch it.

    The networking task source tasks to process the data as it is being fetched must, when appropriate, include the relevant substeps from the following list:

    If the media data cannot be fetched at all, due to network errors, causing the user agent to give up trying to fetch the resource
    If the media resource is found to have Content-Type metadata that, when parsed as a MIME type (including any codecs described by the codecs parameter, if the parameter is defined for that type), represents a type that the user agent knows it cannot render (even if the actual media data is in a supported format)
    If the media data can be fetched but is found by inspection to be in an unsupported format, or can otherwise not be rendered at all

    DNS errors, HTTP 4xx and 5xx errors (and equivalents in other protocols), and other fatal network errors that occur before the user agent has established whether the current media resource is usable, as well as the file using an unsupported container format, or using unsupported codecs for all the data, must cause the user agent to execute the following steps:

    1. The user agent should cancel the fetching process.

    2. Abort this subalgorithm, returning to the resource selection algorithm.

    Once enough of the media data has been fetched to determine the duration of the media resource, its dimensions, and other metadata, and once the text tracks are ready

    This indicates that the resource is usable. The user agent must follow these substeps:

    1. Establish the media timeline for the purposes of the current playback position, the earliest possible position, and the initial playback position, based on the media data.

    2. Update the timeline offset to the date and time that corresponds to the zero time in the media timeline established in the previous step, if any. If no explicit time and date is given by the media resource, the timeline offset must be set to Not-a-Number (NaN).

    3. Set the current playback position to the earliest possible position.

    4. Update the duration attribute with the time of the last frame of the resource, if known, on the media timeline established above. If it is not known (e.g. a stream that is in principle infinite), update the duration attribute to the value positive Infinity.

      The user agent will queue a task to fire a simple event named durationchange at the element at this point.

    5. For video elements, set the videoWidth and videoHeight attributes.

    6. Set the readyState attribute to HAVE_METADATA.

      A loadedmetadata DOM event will be fired as part of setting the readyState attribute to a new value.

    7. Let jumped be false.

    8. If either the media resource or the address of the current media resource indicate a particular start time, then set the initial playback position to that time, seek to that time, and let jumped be true. Ignore any resulting exceptions (if the position is out of range, it is effectively ignored).

      For example, with media formats that support the Media Fragments URI fragment identifier syntax, the fragment identifier can be used to indicate a start position. [MEDIAFRAG]

    9. If either the media resource or the address of the current media resource indicate a particular set of audio or video tracks to enable, then the selected audio tracks must be enabled in the element's audioTracks object, and the first selected video track must be selected in the element's videoTracks object.

    10. If the media element has a current media controller, then: if jumped is true and the initial playback position, relative to the current media controller's timeline, is greater than the current media controller's media controller position, then seek the media controller to the media element's initial playback position, relative to the current media controller's timeline; otherwise, seek the media element to the media controller position, relative to the media element's timeline, discarding any resulting exceptions.

    11. Once the readyState attribute reaches HAVE_CURRENT_DATA, after the loadeddata event has been fired, set the element's delaying-the-load-event flag to false. This stops delaying the load event.

      A user agent that is attempting to reduce network usage while still fetching the metadata for each media resource would also stop buffering at this point, causing the networkState attribute to switch to the NETWORK_IDLE value.

    The user agent is required to determine the duration of the media resource and go through this step before playing.

    Once the entire media resource has been fetched (but potentially before any of it has been decoded)

    Queue a task to fire a simple event named progress at the media element.

    If the connection is interrupted, causing the user agent to give up trying to fetch the resource

    Fatal network errors that occur after the user agent has established whether the current media resource is usable must cause the user agent to execute the following steps:

    1. The user agent should cancel the fetching process.

    2. Set the error attribute to a new MediaError object whose code attribute is set to MEDIA_ERR_NETWORK.

    3. Queue a task to fire a simple event named error at the media element.

    4. If the media element's readyState attribute has a value equal to HAVE_NOTHING, set the element's networkState attribute to the NETWORK_EMPTY value and queue a task to fire a simple event named emptied at the element. Otherwise, set the element's networkState attribute to the NETWORK_IDLE value.

    5. Set the element's delaying-the-load-event flag to false. This stops delaying the load event.

    6. Abort the overall resource selection algorithm.

    If the media data is corrupted

    Fatal errors in decoding the media data that occur after the user agent has established whether the current media resource is usable must cause the user agent to execute the following steps:

    1. The user agent should cancel the fetching process.

    2. Set the error attribute to a new MediaError object whose code attribute is set to MEDIA_ERR_DECODE.

    3. Queue a task to fire a simple event named error at the media element.

    4. If the media element's readyState attribute has a value equal to HAVE_NOTHING, set the element's networkState attribute to the NETWORK_EMPTY value and queue a task to fire a simple event named emptied at the element. Otherwise, set the element's networkState attribute to the NETWORK_IDLE value.

    5. Set the element's delaying-the-load-event flag to false. This stops delaying the load event.

    6. Abort the overall resource selection algorithm.

    If the media data fetching process is aborted by the user

    The fetching process is aborted by the user, e.g. because the user navigated the browsing context to another page, the user agent must execute the following steps. These steps are not followed if the load() method itself is invoked while these steps are running, as the steps above handle that particular kind of abort.

    1. The user agent should cancel the fetching process.

    2. Set the error attribute to a new MediaError object whose code attribute is set to MEDIA_ERR_ABORTED.

    3. Queue a task to fire a simple event named abort at the media element.

    4. If the media element's readyState attribute has a value equal to HAVE_NOTHING, set the element's networkState attribute to the NETWORK_EMPTY value and queue a task to fire a simple event named emptied at the element. Otherwise, set the element's networkState attribute to the NETWORK_IDLE value.

    5. Set the element's delaying-the-load-event flag to false. This stops delaying the load event.

    6. Abort the overall resource selection algorithm.

    If the media data can be fetched but has non-fatal errors or uses, in part, codecs that are unsupported, preventing the user agent from rendering the content completely correctly but not preventing playback altogether

    The server returning data that is partially usable but cannot be optimally rendered must cause the user agent to render just the bits it can handle, and ignore the rest.

    If the media resource is found to declare a media-resource-specific text track that the user agent supports

    If the media resource's origin is the same origin as the media element's Document's origin, queue a task to run the steps to expose a media-resource-specific text track with the relevant data.

    Cross-origin files do not expose their subtitles in the DOM, for security reasons. However, user agents may still provide the user with access to such data in their user interface.

    When the networking task source has queued the last task as part of fetching the media resource (i.e. once the download has completed), if the fetching process completes without errors, including decoding the media data, and if all of the data is available to the user agent without network access, then, the user agent must move on to the next step. This might never happen, e.g. when streaming an infinite resource such as Web radio, or if the resource is longer than the user agent's ability to cache data.

    While the user agent might still need network access to obtain parts of the media resource, the user agent must remain on this step.

    For example, if the user agent has discarded the first half of a video, the user agent will remain at this step even once the playback has ended, because there is always the chance the user will seek back to the start. In fact, in this situation, once playback has ended, the user agent will end up dispatching a stalled event, as described earlier.

  3. If the user agent ever reaches this step (which can only happen if the entire resource gets loaded and kept available): abort the overall resource selection algorithm.


The preload attribute is an enumerated attribute. The following table lists the keywords and states for the attribute — the keywords in the left column map to the states in the cell in the second column on the same row as the keyword.

Keyword State Brief description
none None Hints to the user agent that either the author does not expect the user to need the media resource, or that the server wants to minimise unnecessary traffic.
metadata Metadata Hints to the user agent that the author does not expect the user to need the media resource, but that fetching the resource metadata (dimensions, first frame, track list, duration, etc) is reasonable.
auto Automatic Hints to the user agent that the user agent can put the user's needs first without risk to the server, up to and including optimistically downloading the entire resource.

The empty string is also a valid keyword, and maps to the Automatic state. The attribute's missing value default is user-agent defined, though the Metadata state is suggested as a compromise between reducing server load and providing an optimal user experience.

The preload attribute is intended to provide a hint to the user agent about what the author thinks will lead to the best user experience. The attribute may be ignored altogether, for example based on explicit user preferences or based on the available connectivity.

The preload IDL attribute must reflect the content attribute of the same name, limited to only known values.

The autoplay attribute can override the preload attribute (since if the media plays, it naturally has to buffer first, regardless of the hint given by the preload attribute). Including both is not an error, however.


media . buffered

Returns a TimeRanges object that represents the ranges of the media resource that the user agent has buffered.

The buffered attribute must return a new static normalized TimeRanges object that represents the ranges of the media resource, if any, that the user agent has buffered, at the time the attribute is evaluated. Users agents must accurately determine the ranges available, even for media streams where this can only be determined by tedious inspection.

Typically this will be a single range anchored at the zero point, but if, e.g. the user agent uses HTTP range requests in response to seeking, then there could be multiple ranges.

User agents may discard previously buffered data.

Thus, a time position included within a range of the objects return by the buffered attribute at one time can end up being not included in the range(s) of objects returned by the same attribute at later times.

4.8.10.6 Offsets into the media resource
media . duration

Returns the length of the media resource, in seconds, assuming that the start of the media resource is at time zero.

Returns NaN if the duration isn't available.

Returns Infinity for unbounded streams.

media . currentTime [ = value ]

Returns the current playback position, in seconds.

Can be set, to seek to the given time.

Will throw an INVALID_STATE_ERR exception if there is no selected media resource or if there is a current media controller. Will throw an INDEX_SIZE_ERR exception if the given time is not within the ranges to which the user agent can seek.

media . initialTime

Returns the initial playback position, that is, time to which the media resource was automatically seeked when it was loaded. Returns zero if the initial playback position is still unknown.

A media resource has a media timeline that maps times (in seconds) to positions in the media resource. The origin of a timeline is its earliest defined position. The duration of a timeline is its last defined position.

Establishing the media timeline: If the media resource somehow specifies an explicit timeline whose origin is not negative, then the media timeline should be that timeline. (Whether the media resource can specify a timeline or not depends on the media resource's format.) If the media resource specifies an explicit start time and date, then that time and date should be considered the zero point in the media timeline; the timeline offset will be the time and date, exposed using the startOffsetTime attribute.

If the media resource has a discontinuous timeline, the user agent must extend the timeline used at the start of the resource across the entire resource, so that the media timeline of the media resource increases linearly starting from the earliest possible position (as defined below), even if the underlying media data has out-of-order or even overlapping time codes.

For example, if two clips have been concatenated into one video file, but the video format exposes the original times for the two clips, the video data might expose a timeline that goes, say, 00:15..00:29 and then 00:05..00:38. However, the user agent would not expose those times; it would instead expose the times as 00:15..00:29 and 00:29..01:02, as a single video.

In the absence of an explicit timeline, the zero time on the media timeline should correspond to the first frame of the media resource. For static audio and video files this is generally trivial. For streaming resources, if the user agent will be able to seek to an earlier point than the first frame originally provided by the server, then the zero time should correspond to the earliest seekable time of the media resource; otherwise, it should correspond to the first frame received from the server (the point in the media resource at which the user agent began receiving the stream).

Another example would be a stream that carries a video with several concatenated fragments, broadcast by a server that does not allow user agents to request specific times but instead just streams the video data in a predetermined order. If a user agent connects to this stream and receives fragments defined as covering timestamps 2010-03-20 23:15:00 UTC to 2010-03-21 00:05:00 UTC and 2010-02-12 14:25:00 UTC to 2010-02-12 14:35:00 UTC, it would expose this with a media timeline starting at 0s and extending to 3,600s (one hour). Assuming the streaming server disconnected at the end of the second clip, the duration attribute would then return 3,600. The startOffsetTime attribute would return a Date object with a time corresponding to 2010-03-20 23:15:00 UTC. However, if a different user agent connected five minutes later, it would (presumably) receive fragments covering timestamps 2010-03-20 23:20:00 UTC to 2010-03-21 00:05:00 UTC and 2010-02-12 14:25:00 UTC to 2010-02-12 14:35:00 UTC, and would expose this with a media timeline starting at 0s and extending to 3,300s (fifty five minutes). In this case, the startOffsetTime attribute would return a Date object with a time corresponding to 2010-03-20 23:20:00 UTC.

In any case, the user agent must ensure that the earliest possible position (as defined below) using the established media timeline, is greater than or equal to zero.

The media timeline also has an associated clock. Which clock is used is user-agent defined, and may be media resource-dependent, but it should approximate the user's wall clock.

All the media elements that share current media controller use the same clock for their media timeline.

Media elements have a current playback position, which must initially (i.e. in the absence of media data) be zero seconds. The current playback position is a time on the media timeline.

The currentTime attribute must, on getting, return the current playback position, expressed in seconds. On setting, if the media element has a current media controller, then it must throw an INVALID_STATE_ERR exception; otherwise, the user agent must seek to the new value (which might raise an exception).

Media elements have an initial playback position, which must initially (i.e. in the absence of media data) be zero seconds. The initial playback position is updated when a media resource is loaded. The initial playback position is a time on the media timeline.

The initialTime attribute must, on getting, return the initial playback position, expressed in seconds.

If the media resource is a streaming resource, then the user agent might be unable to obtain certain parts of the resource after it has expired from its buffer. Similarly, some media resources might have a media timeline that doesn't start at zero. The earliest possible position is the earliest position in the stream or resource that the user agent can ever obtain again. It is also a time on the media timeline.

The earliest possible position is not explicitly exposed in the API; it corresponds to the start time of the first range in the seekable attribute's TimeRanges object, if any, or the current playback position otherwise.

When the earliest possible position changes, then: if the current playback position is before the earliest possible position, the user agent must seek to the earliest possible position; otherwise, if the user agent has not fired a timeupdate event at the element in the past 15 to 250ms and is not still running event handlers for such an event, then the user agent must queue a task to fire a simple event named timeupdate at the element.

Because of the above requirement and the requirement in the resource fetch algorithm that kicks in when the metadata of the clip becomes known, the current playback position can never be less than the earliest possible position.

The duration attribute must return the time of the end of the media resource, in seconds, on the media timeline. If no media data is available, then the attributes must return the Not-a-Number (NaN) value. If the media resource is known to be unbounded (e.g. a streaming radio), then the attribute must return the positive Infinity value.

The user agent must determine the duration of the media resource before playing any part of the media data and before setting readyState to a value equal to or greater than HAVE_METADATA, even if doing so requires fetching multiple parts of the resource.

When the length of the media resource changes to a known value (e.g. from being unknown to known, or from a previously established length to a new length) the user agent must queue a task to fire a simple event named durationchange at the media element. (The event is not fired when the duration is reset as part of loading a new media resource.)

If an "infinite" stream ends for some reason, then the duration would change from positive Infinity to the time of the last frame or sample in the stream, and the durationchange event would be fired. Similarly, if the user agent initially estimated the media resource's duration instead of determining it precisely, and later revises the estimate based on new information, then the duration would change and the durationchange event would be fired.

Some video files also have an explicit date and time corresponding to the zero time in the media timeline, known as the timeline offset. Initially, the timeline offset must be set to Not-a-Number (NaN).

The startOffsetTime attribute must return a new Date object representing the current timeline offset.


The loop attribute is a boolean attribute that, if specified, indicates that the media element is to seek back to the start of the media resource upon reaching the end.

The loop attribute has no effect while the element has a current media controller.

The loop IDL attribute must reflect the content attribute of the same name.

4.8.10.7 The ready states
media . readyState

Returns a value that expresses the current state of the element with respect to rendering the current playback position, from the codes in the list below.

Media elements have a ready state, which describes to what degree they are ready to be rendered at the current playback position. The possible values are as follows; the ready state of a media element at any particular time is the greatest value describing the state of the element:

HAVE_NOTHING (numeric value 0)
No information regarding the media resource is available. No data for the current playback position is available. Media elements whose networkState attribute are set to NETWORK_EMPTY are always in the HAVE_NOTHING state.
HAVE_METADATA (numeric value 1)
Enough of the resource has been obtained that the duration of the resource is available. In the case of a video element, the dimensions of the video are also available. The API will no longer raise an exception when seeking. No media data is available for the immediate current playback position. The text tracks are ready.
HAVE_CURRENT_DATA (numeric value 2)
Data for the immediate current playback position is available, but either not enough data is available that the user agent could successfully advance the current playback position in the direction of playback at all without immediately reverting to the HAVE_METADATA state, or there is no more data to obtain in the direction of playback. For example, in video this corresponds to the user agent having data from the current frame, but not the next frame; and to when playback has ended.
HAVE_FUTURE_DATA (numeric value 3)
Data for the immediate current playback position is available, as well as enough data for the user agent to advance the current playback position in the direction of playback at least a little without immediately reverting to the HAVE_METADATA state. For example, in video this corresponds to the user agent having data for at least the current frame and the next frame. The user agent cannot be in this state if playback has ended, as the current playback position can never advance in this case.
HAVE_ENOUGH_DATA (numeric value 4)
All the conditions described for the HAVE_FUTURE_DATA state are met, and, in addition, the user agent estimates that data is being fetched at a rate where the current playback position, if it were to advance at the effective playback rate, would not overtake the available data before playback reaches the end of the media resource.

When the ready state of a media element whose networkState is not NETWORK_EMPTY changes, the user agent must follow the steps given below:

  1. Apply the first applicable set of substeps from the following list:

    If the previous ready state was HAVE_NOTHING, and the new ready state is HAVE_METADATA

    Queue a task to fire a simple event named loadedmetadata at the element.

    Before this task is run, as part of the event loop mechanism, the rendering will have been updated to resize the video element if appropriate.

    If the previous ready state was HAVE_METADATA and the new ready state is HAVE_CURRENT_DATA or greater

    If this is the first time this occurs for this media element since the load() algorithm was last invoked, the user agent must queue a task to fire a simple event named loadeddata at the element.

    If the new ready state is HAVE_FUTURE_DATA or HAVE_ENOUGH_DATA, then the relevant steps below must then be run also.

    If the previous ready state was HAVE_FUTURE_DATA or more, and the new ready state is HAVE_CURRENT_DATA or less

    If the media element was potentially playing before its readyState attribute changed to a value lower than HAVE_FUTURE_DATA, and the element has not ended playback, and playback has not stopped due to errors, and playback has not paused for user interaction, the user agent must queue a task to fire a simple event named timeupdate at the element, and queue a task to fire a simple event named waiting at the element.

    If the previous ready state was HAVE_CURRENT_DATA or less, and the new ready state is HAVE_FUTURE_DATA

    The user agent must queue a task to fire a simple event named canplay.

    If the element's paused attribute is false, the user agent must queue a task to fire a simple event named playing.

    If the new ready state is HAVE_ENOUGH_DATA

    If the previous ready state was HAVE_CURRENT_DATA or less, the user agent must queue a task to fire a simple event named canplay, and, if the element's paused attribute is false, queue a task to fire a simple event named playing.

    If the autoplaying flag is true, and the paused attribute is true, and the media element has an autoplay attribute specified, and the media element's Document's browsing context did not have the sandboxed automatic features browsing context flag set when the Document was created, then the user agent may also set the paused attribute to false, queue a task to fire a simple event named play, and queue a task to fire a simple event named playing.

    User agents do not need to support autoplay, and it is suggested that user agents honor user preferences on the matter. Authors are urged to use the autoplay attribute rather than using script to force the video to play, so as to allow the user to override the behavior if so desired.

    In any case, the user agent must finally queue a task to fire a simple event named canplaythrough.

  2. If the media element has a current media controller, then report the controller state for the media element's current media controller.

It is possible for the ready state of a media element to jump between these states discontinuously. For example, the state of a media element can jump straight from HAVE_METADATA to HAVE_ENOUGH_DATA without passing through the HAVE_CURRENT_DATA and HAVE_FUTURE_DATA states.

The readyState IDL attribute must, on getting, return the value described above that describes the current ready state of the media element.

The autoplay attribute is a boolean attribute. When present, the user agent (as described in the algorithm described herein) will automatically begin playback of the media resource as soon as it can do so without stopping.

Authors are urged to use the autoplay attribute rather than using script to trigger automatic playback, as this allows the user to override the automatic playback when it is not desired, e.g. when using a screen reader. Authors are also encouraged to consider not using the automatic playback behavior at all, and instead to let the user agent wait for the user to start playback explicitly.

The autoplay IDL attribute must reflect the content attribute of the same name.

4.8.10.8 Playing the media resource
media . paused

Returns true if playback is paused; false otherwise.

media . ended

Returns true if playback has reached the end of the media resource.

media . defaultPlaybackRate [ = value ]

Returns the default rate of playback, for when the user is not fast-forwarding or reversing through the media resource.

Can be set, to change the default rate of playback.

The default rate has no direct effect on playback, but if the user switches to a fast-forward mode, when they return to the normal playback mode, it is expected that the rate of playback will be returned to the default rate of playback.

When the element has a current media controller, the defaultPlaybackRate attribute is ignored and the current media controller's defaultPlaybackRate is used instead.

media . playbackRate [ = value ]

Returns the current rate playback, where 1.0 is normal speed.

Can be set, to change the rate of playback.

When the element has a current media controller, the playbackRate attribute is ignored and the current media controller's playbackRate is used instead.

media . played

Returns a TimeRanges object that represents the ranges of the media resource that the user agent has played.

media . play()

Sets the paused attribute to false, loading the media resource and beginning playback if necessary. If the playback had ended, will restart it from the start.

media . pause()

Sets the paused attribute to true, loading the media resource if necessary.

The paused attribute represents whether the media element is paused or not. The attribute must initially be true.

A media element is a blocked media element if its readyState attribute is in the HAVE_NOTHING state or the HAVE_METADATA state, or if the element has paused for user interaction.

A media element is said to be potentially playing when its paused attribute is false, the element has not ended playback, playback has not stopped due to errors, the element either has no current media controller or has a current media controller but is not blocked on its media controller, and the element is not a blocked media element.

A media element is said to have ended playback when:

The ended attribute must return true if the media element has ended playback and the direction of playback is forwards, and false otherwise.

A media element is said to have stopped due to errors when the element's readyState attribute is HAVE_METADATA or greater, and the user agent encounters a non-fatal error during the processing of the media data, and due to that error, is not able to play the content at the current playback position.

A media element is said to have paused for user interaction when its paused attribute is false, the readyState attribute is either HAVE_FUTURE_DATA or HAVE_ENOUGH_DATA and the user agent has reached a point in the media resource where the user has to make a selection for the resource to continue. If the media element has a current media controller when this happens, then the user agent must report the controller state for the media element's current media controller. If If the media element has a current media controller when the user makes a selection, allowing playback to resume, the user agent must similarly report the controller state for the media element's current media controller.

It is possible for a media element to have both ended playback and paused for user interaction at the same time.

When a media element that is potentially playing stops playing because it has paused for user interaction, the user agent must queue a task to fire a simple event named timeupdate at the element.

A waiting DOM event can be fired as a result of an element that is potentially playing stopping playback due to its readyState attribute changing to a value lower than HAVE_FUTURE_DATA.

When the current playback position reaches the end of the media resource when the direction of playback is forwards, then the user agent must follow these steps:

  1. If the media element has a loop attribute specified and does not have a current media controller, then seek to the earliest possible position of the media resource and abort these steps.

  2. Stop playback.

    The ended attribute becomes true.

  3. The user agent must queue a task to fire a simple event named timeupdate at the element.

  4. The user agent must queue a task to fire a simple event named ended at the element.

When the current playback position reaches the earliest possible position of the media resource when the direction of playback is backwards, then the user agent must follow these steps:

  1. Stop playback.

  2. The user agent must queue a task to fire a simple event named timeupdate at the element.


The defaultPlaybackRate attribute gives the desired speed at which the media resource is to play, as a multiple of its intrinsic speed. The attribute is mutable: on getting it must return the last value it was set to, or 1.0 if it hasn't yet been set; on setting the attribute must be set to the new value.

The defaultPlaybackRate is used by the user agent when it exposes a user interface to the user.

The playbackRate attribute gives the effective playback rate (assuming there is no current media controller overriding it), which is the speed at which the media resource plays, as a multiple of its intrinsic speed. If it is not equal to the defaultPlaybackRate, then the implication is that the user is using a feature such as fast forward or slow motion playback. The attribute is mutable: on getting it must return the last value it was set to, or 1.0 if it hasn't yet been set; on setting the attribute must be set to the new value, and the playback will change speed (if the element is potentially playing and there is no current media controller).

When the defaultPlaybackRate or playbackRate attributes change value (either by being set by script or by being changed directly by the user agent, e.g. in response to user control) the user agent must queue a task to fire a simple event named ratechange at the media element.

The defaultPlaybackRate and playbackRate attributes have no effect when the media element has a current media controller; the namesake attributes on the MediaController object are used instead in that situation.


The played attribute must return a new static normalized TimeRanges object that represents the ranges of the media resource, if any, that the user agent has so far rendered, at the time the attribute is evaluated.


When the play() method on a media element is invoked, the user agent must run the following steps.

  1. If the media element's networkState attribute has the value NETWORK_EMPTY, invoke the media element's resource selection algorithm.

  2. If the playback has ended and the direction of playback is forwards, and the media element does not have a current media controller, seek to the earliest possible position of the media resource.

    This will cause the user agent to queue a task to fire a simple event named timeupdate at the media element.

  3. If the media element has a current media controller, then bring the media element up to speed with its new media controller.

  4. If the media element's paused attribute is true, run the following substeps:

    1. Change the value of paused to false.

    2. Queue a task to fire a simple event named play at the element.

    3. If the media element's readyState attribute has the value HAVE_NOTHING, HAVE_METADATA, or HAVE_CURRENT_DATA, queue a task to fire a simple event named waiting at the element.

      Otherwise, the media element's readyState attribute has the value HAVE_FUTURE_DATA or HAVE_ENOUGH_DATA: queue a task to fire a simple event named playing at the element.

  5. Set the media element's autoplaying flag to false.

  6. If the media element has a current media controller, then report the controller state for the media element's current media controller.


When the pause() method is invoked, and when the user agent is required to pause the media element, the user agent must run the following steps:

  1. If the media element's networkState attribute has the value NETWORK_EMPTY, invoke the media element's resource selection algorithm.

  2. Set the media element's autoplaying flag to false.

  3. If the media element's paused attribute is false, run the following steps:

    1. Change the value of paused to true.

    2. Queue a task to fire a simple event named timeupdate at the element.

    3. Queue a task to fire a simple event named pause at the element.

  4. If the media element has a current media controller, then report the controller state for the media element's current media controller.


The effective playback rate is not necessarily the element's playbackRate. When a media element has a current media controller, its effective playback rate is the MediaController's media controller playback rate. Otherwise, the effective playback rate is just the element's playbackRate. Thus, the current media controller overrides the media element.

If the effective playback rate is positive or zero, then the direction of playback is forwards. Otherwise, it is backwards.

When a media element is potentially playing and its Document is a fully active Document, its current playback position must increase monotonically at effective playback rate units of media time per unit time of the media timeline's clock.

The effective playback rate can be 0.0, in which case the current playback position doesn't move, despite playback not being paused (paused doesn't become true, and the pause event doesn't fire).

This specification doesn't define how the user agent achieves the appropriate playback rate — depending on the protocol and media available, it is plausible that the user agent could negotiate with the server to have the server provide the media data at the appropriate rate, so that (except for the period between when the rate is changed and when the server updates the stream's playback rate) the client doesn't actually have to drop or interpolate any frames.

When the direction of playback is backwards, any corresponding audio must be muted. When the effective playback rate is so low or so high that the user agent cannot play audio usefully, the corresponding audio must also be muted. If the effective playback rate is not 1.0, the user agent may apply pitch adjustments to the audio as necessary to render it faithfully.

Media elements that are potentially playing while not in a Document must not play any video, but should play any audio component. Media elements must not stop playing just because all references to them have been removed; only once a media element is in a state where no further audio could ever be played by that element may the element be garbage collected.

It is possible for an element to which no explicit references exist to play audio, even if such an element is not still actively playing: for instance, it could have a current media controller that still has references and can still be unpaused, or it could be unpaused but stalled waiting for content to buffer.


When the current playback position of a media element changes (e.g. due to playback or seeking), the user agent must run the following steps. If the current playback position changes while the steps are running, then the user agent must wait for the steps to complete, and then must immediately rerun the steps. (These steps are thus run as often as possible or needed — if one iteration takes a long time, this can cause certain cues to be skipped over as the user agent rushes ahead to "catch up".)

  1. Let current cues be an ordered list of cues, initialized to contain all the cues of all the hidden, showing, or showing by default text tracks of the media element (not the disabled ones) whose start times are less than or equal to the current playback position and whose end times are greater than the current playback position, in text track cue order.

  2. Let other cues be an ordered list of cues, initialized to contain all the cues of hidden, showing, and showing by default text tracks of the media element that are not present in current cues, also in text track cue order.

  3. If the time was reached through the usual monotonic increase of the current playback position during normal playback, and if the user agent has not fired a timeupdate event at the element in the past 15 to 250ms and is not still running event handlers for such an event, then the user agent must queue a task to fire a simple event named timeupdate at the element. (In the other cases, such as explicit seeks, relevant events get fired as part of the overall process of changing the current playback position.)

    The event thus is not to be fired faster than about 66Hz or slower than 4Hz (assuming the event handlers don't take longer than 250ms to run). User agents are encouraged to vary the frequency of the event based on the system load and the average cost of processing the event each time, so that the UI updates are not any more frequent than the user agent can comfortably handle while decoding the video.

  4. If all of the cues in current cues have their text track cue active flag set, and none of the cues in other cues have their text track cue active flag set, then abort these steps.

  5. If the time was reached through the usual monotonic increase of the current playback position during normal playback, and there are cues in other cues that have both their text track cue active flag set and their text track cue pause-on-exit flag set, then immediately pause the media element. (In the other cases, such as explicit seeks, playback is not paused by going past the end time of a cue, even if that cue has its text track cue pause-on-exit flag set.)

  6. Let affected tracks be a list of text tracks, initially empty.

  7. For each text track cue in other cues that has its text track cue active flag set, in list order, queue a task to fire a simple event named exit at the TextTrackCue object, and add the cue's text track to affected tracks, if it's not already in the list.

  8. For each text track cue in current cues that does not have its text track cue active flag set, in list order, queue a task to fire a simple event named enter at the TextTrackCue object, and add the cue's text track to affected tracks, if it's not already in the list.

  9. For each text track in affected tracks, in the order they were added to the list (which will match the relative order of the text tracks in the media element's list of text tracks), queue a task to fire a simple event named cuechange at the TextTrack object, and, if the text track has a corresponding track element, to then fire a simple event named cuechange at the track element as well.

  10. Set the text track cue active flag of all the cues in the current cues, and unset the text track cue active flag of all the cues in the other cues.

  11. Run the rules for updating the text track rendering of each of the text tracks in affected tracks that are showing or showing by default.

For the purposes of the algorithm above, a text track cue is considered to be part of a text track only if it is listed in the text track list of cues, not merely if it is associated with the text track.

If the media element's Document stops being a fully active document, then the playback will stop until the document is active again.

When a media element is removed from a Document, the user agent must run the following steps:

  1. Asynchronously await a stable state, allowing the task that removed the media element from the Document to continue. The synchronous section consists of all the remaining steps of this algorithm. (Steps in the synchronous section are marked with ⌛.)

  2. ⌛ If the media element is in a Document, abort these steps.

  3. ⌛ If the media element's networkState attribute has the value NETWORK_EMPTY, abort these steps.

  4. Pause the media element.

4.8.10.9 Seeking
media . seeking

Returns true if the user agent is currently seeking.

media . seekable

Returns a TimeRanges object that represents the ranges of the media resource to which it is possible for the user agent to seek.

The seeking attribute must initially have the value false.

When the user agent is required to seek to a particular new playback position in the media resource, it means that the user agent must run the following steps. This algorithm interacts closely with the event loop mechanism; in particular, it has a synchronous section (which is triggered as part of the event loop algorithm). Steps in that section are marked with ⌛.

  1. If the media element's readyState is HAVE_NOTHING, then raise an INVALID_STATE_ERR exception (if the seek was in response to a DOM method call or setting of an IDL attribute), and abort these steps.

  2. If the element's seeking IDL attribute is true, then another instance of this algorithm is already running. Abort that other instance of the algorithm without waiting for the step that it is running to complete.

  3. Set the seeking IDL attribute to true.

  4. If the seek was in response to a DOM method call or setting of an IDL attribute, then continue the script. The remainder of these steps must be run asynchronously. With the exception of the steps marked with ⌛, they could be aborted at any time by another instance of this algorithm being invoked.

  5. If the new playback position is later than the end of the media resource, then let it be the end of the media resource instead.

  6. If the new playback position is less than the earliest possible position, let it be that position instead.

  7. If the (possibly now changed) new playback position is not in one of the ranges given in the seekable attribute, then let it be the position in one of the ranges given in the seekable attribute that is the nearest to the new playback position. If two positions both satisfy that constraint (i.e. the new playback position is exactly in the middle between two ranges in the seekable attribute) then use the position that is closest to the current playback position. If there are no ranges given in the seekable attribute then set the seeking IDL attribute to false and abort these steps.

  8. Queue a task to fire a simple event named seeking at the element.

  9. Set the current playback position to the given new playback position.

    If the media element was potentially playing immediately before it started seeking, but seeking caused its readyState attribute to change to a value lower than HAVE_FUTURE_DATA, then a waiting will be fired at the element.

  10. Wait until the user agent has established whether or not the media data for the new playback position is available, and, if it is, until it has decoded enough data to play back that position.

  11. Await a stable state. The synchronous section consists of all the remaining steps of this algorithm. (Steps in the synchronous section are marked with ⌛.)

  12. ⌛ Set the seeking IDL attribute to false.

  13. Queue a task to fire a simple event named timeupdate at the element.

  14. Queue a task to fire a simple event named seeked at the element.

The seekable attribute must return a new static normalized TimeRanges object that represents the ranges of the media resource, if any, that the user agent is able to seek to, at the time the attribute is evaluated.

If the user agent can seek to anywhere in the media resource, e.g. because it is a simple movie file and the user agent and the server support HTTP Range requests, then the attribute would return an object with one range, whose start is the time of the first frame (the earliest possible position, typically zero), and whose end is the same as the time of the first frame plus the duration attribute's value (which would equal the time of the last frame, and might be positive Infinity).

The range might be continuously changing, e.g. if the user agent is buffering a sliding window on an infinite stream. This is the behavior seen with DVRs viewing live TV, for instance.

Media resources might be internally scripted or interactive. Thus, a media element could play in a non-linear fashion. If this happens, the user agent must act as if the algorithm for seeking was used whenever the current playback position changes in a discontinuous fashion (so that the relevant events fire). If the media element has a current media controller, then the user agent must seek the media controller appropriately instead.

4.8.10.10 Media resources with multiple media tracks

A media resource can have multiple embedded audio and video tracks. For example, in addition to the primary video and audio tracks, a media resource could have foreign-language dubbed dialogues, director's commentaries, audio descriptions, alternative angles, or sign-language overlays.

media . audioTracks

Returns a MultipleTrackList object representing the audio tracks available in the media resource.

media . videoTracks

Returns an ExclusiveTrackList object representing the video tracks available in the media resource.

The audioTracks attribute of a media element must return a live MultipleTrackList object representing the audio tracks available in the media element's media resource. The same object must be returned each time.

The videoTracks attribute of a media element must return a live ExclusiveTrackList object representing the video tracks available in the media element's media resource. The same object must be returned each time.

There are only ever two TrackList objects (one MultipleTrackList object and one ExclusiveTrackList object) per media element, even if another media resource is loaded into the element: the objects are reused.

4.8.10.10.1 TrackList objects

The MultipleTrackList and ExclusiveTrackList interfaces, used by the attributes defined in the previous section, are substantially similar. Their common features are defined in the TrackList interface, from which they both inherit.

interface TrackList {
  readonly attribute unsigned long length;
  DOMString getID(in unsigned long index);
  DOMString getKind(in unsigned long index);
  DOMString getLabel(in unsigned long index);
  DOMString getLanguage(in unsigned long index);

           attribute Function onchange;
};

interface MultipleTrackList : TrackList {
  boolean isEnabled(in unsigned long index);
  void enable(in unsigned long index);
  void disable(in unsigned long index);
};

interface ExclusiveTrackList : TrackList {
  readonly attribute unsigned long selectedIndex;
  void select(in unsigned long index);
};
tracks . length

Returns the number of tracks in the list.

id = tracks . getID( index )

Returns the ID of the given track. This is the ID that can be used with a fragment identifier if the format supports the Media Fragments URI syntax. [MEDIAFRAG]

kind = tracks . getKind( index )

Returns the category the given track falls into. The possible track categories are given below.

label = tracks . getLabel( index )

Returns the label of the given track, if known, or the empty string otherwise.

language = tracks . getLanguage( index )

Returns the language of the given track, if known, or the empty string otherwise.

enabled = audioTracks . isEnabled( index )

Returns true if the given track is active, and false otherwise.

audioTracks . enable( index )

Enables the given track.

audioTracks . disable( index )

Disables the given track.

videoTracks . isEnabled

Returns the index of the currently selected track, if any, or −1 otherwise.

videoTracks . select( index )

Selects the given track.

The length attribute must return the number of tracks represented by the TrackList object at the time of getting.

Tracks in a TrackList object must be consistently ordered. If the media resource is in a format that defines an order, then that order must be used; otherwise, the order must be the relative order in which the tracks are declared in the media resource. Each track in a TrackList thus has an index; the first has the index 0, and each subsequent track is numbered one higher than the previous one.

The getID(index) method must return the identifier of the track with index index, if there is one. If there is no such track, then the method must instead throw an INDEX_SIZE_ERR exception. If the media resource is in a format that supports the Media Fragments URI fragment identifier syntax, the returned identifier must be the same identifier that would enable the track if used as the name of a track in the track dimension of such a fragment identifier. [MEDIAFRAG]

The getKind(index) method must return the category of the track with index index, if there is one. If there is no such track, then the method must instead throw an INDEX_SIZE_ERR exception.

The category of a track is the string given in the first column of the table below that is the most appropriate for the track based on the definitions in the table's second and third columns, as determined by the metadata included in the track in the media resource. For Ogg files, the Role header of the track gives the relevant metadata. The cell in the third column of a row says what the category given in the cell in the first column of that row applies to; a category is only appropriate for an audio track if it applies to audio tracks, and a category is only appropriate for video tracks if it applies to video tracks.

Return values for getKind()
Category Definition Applies to... Examples
"alternative" A possible alternative to the main track, e.g. a different take of a song (audio), or a different angle (video). Audio and video. Ogg: "audio/alterate" or "video/alternate".
"description" An audio description of a video track. Audio only. Ogg: "audio/audiodesc".
"main" The primary audio or video track. Audio and video. Ogg: "audio/main" or "video/main"; WebM: the "FlagDefault" element is set.
"sign" A sign-language interpretation of an audio track. Video only. Ogg: "video/sign".
"translation" A translated version of the main track. Audio only. Ogg: "audio/dub".
"" (empty string) No explicit kind, or the kind given by the track's metadata is not recognised by the user agent. Audio and video. Any other track type or track role.

The getLabel(index) method must return the label of the track with index index, if there is one and it has a label. If there is no such track, then the method must instead throw an INDEX_SIZE_ERR exception. If there is a track with index index, but it has no label, then the method must return the empty string.

The getLanguage(index) method must return the BCP 47 language tag of the language of the track with index index, if there is one and it has a language. If there is no such track, then the method must instead throw an INDEX_SIZE_ERR exception. If there is a track with index index, but it has no language, or the user agent is not able to express that language as a BCP 47 language tag (for example because the language information in the media resource's format is a free-form string without a defined interpretation), then the method must return the empty string.


A MultipleTrackList object represents a track list where multiple tracks can be enabled simultaneously. Each track is either enabled or disabled.

The isEnabled(index) method must return true if there is a track with index index, and it is currently enabled, false if there is a track with index index, but it is currently disabled, and must throw an INDEX_SIZE_ERR exception if there is no track with index index.

The enable(index) method must enable the track with index index, if there is one. If there is not, it must instead throw an INDEX_SIZE_ERR exception.

The disable(index) method must disable the track with index index, if there is one. If there is not, it must instead throw an INDEX_SIZE_ERR exception.

Whenever a track is enabled or disabled, the user agent must queue a task to fire a simple event named change at the MultipleTrackList object.


An ExclusiveTrackList object represents a track list where exactly one track is selected at a time.

The selectedIndex attribute must return the index of the currently selected track. If the ExclusiveTrackList object does not represent any tracks, it must instead return −1.

The select(index) must select the track with index index, if there is one, unselecting whichever track was previously selected. If there is no track with index index, it must instead throw an INDEX_SIZE_ERR exception.

Whenever the selected track is changed, the user agent must queue a task to fire a simple event named change at the MultipleTrackList object.


The following are the event handlers (and their corresponding event handler event types) that must be supported, as IDL attributes, by all objects implementing the TrackList interface:

Event handler Event handler event type
onchange change

The task source for the tasks listed in this section is the DOM manipulation task source.

4.8.10.10.2 Selecting specific audio and video tracks declaratively

The audioTracks and videoTracks attributes allow scripts to select which track should play, but it is also possible to select specific tracks declaratively, by specifying particular tracks in the fragment identifier of the URL of the media resource. The format of the fragment identifier depends on the MIME type of the media resource. [RFC2046] [RFC3986]

In this example, a video that uses a format that supports the Media Fragments URI fragment identifier syntax is embedded in such a way that the alternative angles labeled "Alternative" are enabled instead of the default video track. [MEDIAFRAG]

<video src="myvideo#track=Alternative"></video>
4.8.10.11 Synchronising multiple media elements
4.8.10.11.1 Introduction

Each media element can have a MediaController. A MediaController is an object that coordinates the playback of multiple media elements, for instance so that a sign-language interpreter track can be overlaid on a video track, with the two being kept in sync.

By default, a media element has no MediaController. An implicit MediaController can be assigned using the mediagroup content attribute. An explicit MediaController can be assigned directly using the controller IDL attribute.

Media elements with a MediaController are said to be slaved to their controller. The MediaController modifies the playback rate and the playback volume of each of the media elements slaved to it, and ensures that when any of its slaved media elements unexpectedly stall, the others are stopped at the same time.

When a media element is slaved to a MediaController, its playback rate is fixed to that of the other tracks in the same MediaController, and any looping is disabled.

4.8.10.11.2 Media controllers
[Constructor]
interface MediaController {
  readonly attribute TimeRanges buffered;
  readonly attribute TimeRanges seekable;
  readonly attribute double duration;
           attribute double currentTime;

  readonly attribute boolean paused;
  readonly attribute TimeRanges played;
  void play();
  void pause();

           attribute double defaultPlaybackRate;
           attribute double playbackRate;

           attribute double volume;
           attribute boolean muted;

           attribute Function onemptied;
           attribute Function onloadedmetadata;
           attribute Function onloadeddata;
           attribute Function oncanplay;
           attribute Function oncanplaythrough;
           attribute Function onplaying;
           attribute Function onwaiting;

           attribute Function ondurationchange;
           attribute Function ontimeupdate;
           attribute Function onplay;
           attribute Function onpause;
           attribute Function onratechange;
           attribute Function onvolumechange;
};
controller = new MediaController()

Returns a new MediaController object.

media . controller [ = controller ]

Returns the current MediaController for the media element, if any; returns null otherwise.

Can be set, to set an explicit MediaController. Doing so removes the mediagroup attribute, if any.

controller . buffered

Returns a TimeRanges object that represents the intersection of the time ranges for which the user agent has all relevant media data for all the slaved media elements.

controller . seekable

Returns a TimeRanges object that represents the intersection of the time ranges into which the user agent can seek for all the slaved media elements.

controller . duration

Returns the difference between the earliest playable moment and the latest playable moment (not considering whether the data in question is actually buffered or directly seekable, but not including time in the future for infinite streams). Will return zero if there is no media.

controller . currentTime [ = value ]

Returns the current playback position, in seconds, as a position between zero time and the current duration.

Can be set, to seek to the given time.

controller . paused

Returns true if playback is paused; false otherwise. When this attribute is true, any media element slaved to this controller will be stopped.

controller . played

Returns a TimeRanges object that represents the union of the time ranges in all the slaved media elements that have been played.

controller . play()

Sets the paused attribute to false.

controller . pause()

Sets the paused attribute to true.

controller . defaultPlaybackRate [ = value ]

Returns the default rate of playback.

Can be set, to change the default rate of playback.

This default rate has no direct effect on playback, but if the user switches to a fast-forward mode, when they return to the normal playback mode, it is expected that rate of playback (playbackRate) will be returned to this default rate.

controller . playbackRate [ = value ]

Returns the current rate of playback.

Can be set, to change the rate of playback.

controller . volume [ = value ]

Returns the current playback volume multiplier, as a number in the range 0.0 to 1.0, where 0.0 is the quietest and 1.0 the loudest.

Can be set, to change the volume multiplier.

Throws an INDEX_SIZE_ERR if the new value is not in the range 0.0 .. 1.0.

controller . muted [ = value ]

Returns true if all audio is muted (regardless of other attributes either on the controller or on any media elements slaved to this controller), and false otherwise.

Can be set, to change whether the audio is muted or not.

A media element can have a current media controller, which is a MediaController object. When a media element is created without a mediagroup attribute, it does not have a current media controller. (If it is created with such an attribute, then that attribute initializes the current media controller, as defined below.)

The slaved media elements of a MediaController are the media elements whose current media controller is that MediaController. All the slaved media elements of a MediaController must use the same clock for their definition of their media timeline's unit time.


The controller attribute on a media element, on getting, must return the element's current media controller, if any, or null otherwise. On setting, it must first remove the element's mediagroup attribute, if any, and then set the current media controller to the given value. If the given value is null, the element no longer has a current media controller; if it is not null, then the user agent must bring the media element up to speed with its new media controller.


The MediaController() constructor, when invoked, must return a newly created MediaController object.


The seekable attribute must return a new static normalized TimeRanges object that represents the intersection of the ranges of the media resources of the slaved media elements that the user agent is able to seek to, at the time the attribute is evaluated.

The buffered attribute must return a new static normalized TimeRanges object that represents the intersection of the ranges of the media resources of the slaved media elements that the user agent has buffered, at the time the attribute is evaluated. Users agents must accurately determine the ranges available, even for media streams where this can only be determined by tedious inspection.

The duration attribute must return the media controller duration.

Every 15 to 250ms, or whenever the MediaController's media controller duration changes, whichever happens least often, the user agent must queue a task to fire a simple event named durationchange at the MediaController. If the MediaController's media controller duration decreases such that the media controller position is greater than the media controller duration, the user agent must immediately seek the media controller to media controller duration.

The currentTime attribute must return the media controller position on getting, and on setting must seek the media controller to the new value.

Every 15 to 250ms, or whenever the MediaController's media controller position changes, whichever happens least often, the user agent must queue a task to fire a simple event named timeupdate at the MediaController.


When a MediaController is created it is a playing media controller. It can be changed into a paused media controller and back either via the user agent's user interface (when the element is exposing a user interface to the user) or by script using the APIs defined in this section (see below).

The paused attribute must return true if the MediaController object is a paused media controller, and false otherwise.

The played attribute must return a new static normalized TimeRanges object that represents the union of the ranges of the media resources of the slaved media elements that the user agent has so far rendered, at the time the attribute is evaluated.

When the pause() method is invoked, if the MediaController is a playing media controller then the user agent must change the MediaController into a paused media controller, queue a task to fire a simple event named pause at the MediaController, and then report the controller state of the MediaController.

When the play() method is invoked, if the MediaController is a paused media controller, the user agent must change the MediaController into a playing media controller, queue a task to fire a simple event named play at the MediaController, and then report the controller state of the MediaController.


A MediaController has a media controller default playback rate and a media controller playback rate, which must both be set to 1.0 when the MediaController object is created.

The defaultPlaybackRate attribute, on getting, must return the MediaController's media controller default playback rate, and on setting, must set the MediaController's media controller default playback rate to the new value, then queue a task to fire a simple event named ratechange at the MediaController.

The playbackRate attribute, on getting, must return the MediaController's media controller playback rate, and on setting, must set the MediaController's media controller playback rate to the new value, then queue a task to fire a simple event named ratechange at the MediaController.


A MediaController has a media controller volume multiplier, which must be set to 1.0 when the MediaController object is created, and a media controller mute override, much must initially be false.

The volume attribute, on getting, must return the MediaController's media controller volume multiplier, and on setting, if the new value is in the range 0.0 to 1.0 inclusive, must set the MediaController's media controller volume multiplier to the new value and queue a task to fire a simple event named volumechange at the MediaController. If the new value is outside the range 0.0 to 1.0 inclusive, then, on setting, an INDEX_SIZE_ERR exception must be raised instead.

The muted attribute, on getting, must return the MediaController's media controller mute override, and on setting, must set the MediaController's media controller mute override to the new value and queue a task to fire a simple event named volumechange at the MediaController.


The media resources of all the slaved media elements of a MediaController have a defined temporal relationship which provides relative offsets between the zero time of each such media resource: for media resources with a timeline offset, their relative offsets are the difference between their timeline offset; the zero times of all the media resources without a timeline offset are not offset from each other (i.e. the origins of their timelines are cotemporal); and finally, the zero time of the media resource with the earliest timeline offset (if any) is not offset from the zero times of the media resources without a timeline offset (i.e. the origins of media resources without a timeline offset are further cotemporal with the earliest defined point on the timeline of the media resource with the earliest timeline offset).

The media resource end position of a media resource in a media element is defined as follows: if the media resource has a finite and known duration, the media resource end position is the duration of the media resource's timeline (the last defined position on that timeline); otherwise, the media resource's duration is infinite or unknown, and the media resource end position is the time of the last frame of media data currently available for that media resource.

Each MediaController also has its own defined timeline. On this timeline, all the media resources of all the slaved media elements of the MediaController are temporally aligned according to their defined offsets. The media controller duration of that MediaController is the time from the earliest earliest possible position, relative to this MediaController timeline, of any of the media resources of the slaved media elements of the MediaController, to the time of the latest media resource end position of the media resources of the slaved media elements of the MediaController, again relative to this MediaController timeline.

Each MediaController has a media controller position. This is the time on the MediaController's timeline at which the user agent is trying to play the slaved media elements. When a MediaController is created, its media controller position is initially zero.

When the user agent is to bring a media element up to speed with its new media controller, it must seek that media element to the MediaController's media controller position relative to the media element's timeline, discarding any resulting exceptions.

When the user agent is to seek the media controller to a particular new playback position, it must follow these steps:

  1. If the new playback position is less than zero, then set it to zero.

  2. If the new playback position is greater than the media controller duration, then set it to the media controller duration.

  3. Set the media controller position to the new playback position.

  4. Seek each slaved media element to the new playback position relative to the media element timeline, discarding any resulting exceptions.

A MediaController is a blocked media controller if the MediaController is a paused media controller, or if any of its slaved media elements are blocked media elements, or if any of its slaved media elements whose autoplaying flag is true still have a paused attribute that is true, or if all of its slaved media elements have their paused attribute set to true.

A media element is blocked on its media controller if the MediaController is a blocked media controller, or if its media controller position is either before the media resource's earliest possible position relative to the MediaController's timeline or after the end of the media resource relative to the MediaController's timeline.

When a MediaController is not a blocked media controller and it has at least one slaved media element whose Document is a fully active Document, the MediaController's media controller position must increase monotonically at media controller playback rate units of time on the MediaController's timeline per unit time of the clock used by its slaved media elements.

When the zero point on the timeline of a MediaController moves relative to the timelines of the slaved media elements by a time difference ΔT, the MediaController's media controller position must be decremented by ΔT.

In some situations, e.g. when playing back a live stream without buffering anything, the media controller position would increase motonically as described above at the same rate as the ΔT described in the previous paragraph decreases it, with the end result that for all intents and purposes, the media controller position would appear to remain constant (probably with the value 0).


A MediaController has a most recently reported readiness state, which is a number from 0 to 4 derived from the numbers used for the media element readyState attribute, and a most recently reported playback state, which is either playing, waiting, or ended.

When a MediaController is created, its most recently reported readiness state must be set to 0, and its most recently reported playback state must be set to waiting.

When a user agent is required to report the controller state for a MediaController, the user agent must run the following steps:

  1. If the MediaController has no slaved media elements, let new readiness state be 0.

    Otherwise, let it have the lowest value of the readyState IDL attributes of all of its slaved media elements.

  2. If the MediaController's most recently reported readiness state is not equal to new readiness state then queue a task to fire a simple event at the MediaController object, whose name is the event name corresponding to the value of new readiness state given in the table below:

    Value of new readiness state Event name
    0 emptied
    1 loadedmetadata
    2 loadeddata
    3 canplay
    4 canplaythrough
  3. Let the MediaController's most recently reported readiness state be new readiness state.

  4. Initialize new playback state by setting it to the state given for the first matching condition from the following list:

    If the MediaController has no slaved media elements
    Let new playback state be waiting.
    If all of the MediaController's slaved media elements have ended playback and the media controller playback rate is positive or zero
    Let new playback state be ended.
    If the MediaController is a blocked media controller
    Let new playback state be waiting.
    Otherwise
    Let new playback state be playing.
  5. If the MediaController's most recently reported playback state is not equal to new playback state then queue a task to fire a simple event at the MediaController object, whose name is playing if new playback state is playing, ended if new playback state is ended, and waiting otherwise.

  6. Let the MediaController's most recently reported playback state be new playback state.


The following are the event handlers that must be supported, as IDL attributes, by all objects implementing the MediaController interface:

Event handler Event handler event type
onemptied emptied
onloadedmetadata loadedmetadata
onloadeddata loadeddata
oncanplay canplay
oncanplaythrough canplaythrough
onplaying playing
onwaiting waiting
ondurationchange durationchange
ontimeupdate durationchange
onplay play
onpause pause
onratechange ratechange
onvolumechange volumechange

The task source for the tasks listed in this section is the DOM manipulation task source.

4.8.10.11.3 Assigning a media controller declaratively

The mediagroup content attribute on media elements can be used to link multiple media elements together by implicitly creating a MediaController.

When a media element is created with a mediagroup attribute, and when a media element's mediagroup attribute is set, changed, or removed, the user agent must run the following steps:

  1. Let m be the media element in question.

  2. Let m have no current media controller, if it currently has one.

  3. If m's mediagroup attribute is being removed, then abort these steps.

  4. If there is another media element whose Document is the same as m's Document (even if one or both of these elements are not actually in the Document), and which also has a mediagroup attribute, and whose mediagroup attribute has the same value as the new value of m's mediagroup attribute, then let controller be that media element's current media controller.

    Otherwise, let controller be a newly created MediaController.

  5. Let m's current media controller be controller.

  6. Bring the media element up to speed with its new media controller.

The mediaGroup IDL attribute on media elements must reflect the mediagroup content attribute.

Multiple media elements referencing the same media resource will share a single network request. This can be used to efficiently play two (video) tracks from the same media resource in two different places on the screen. Used with the mediagroup attribute, these elements can also be kept synchronised.

In this example, a sign-languge interpreter track from a movie file is overlaid on the primary video track of that same video file using two video elements, some CSS, and an implicit MediaController:

<article>
 <style scoped>
  div { margin: 1em auto; position: relative; width: 400px; height: 300px; }
  video { position; absolute; bottom: 0; right: 0; }
  video:first-child { width: 100%; height: 100%; }
  video:last-child { width: 30%; }
 </style>
 <div>
  <video src="movie.vid#track=Video&amp;track=English" autoplay controls mediagroup=movie></video>
  <video src="movie.vid#track=sign" autoplay mediagroup=movie></video>
 </div>
</article>
4.8.10.12 Timed text tracks
4.8.10.12.1 Text track model

A media element can have a group of associated text tracks, known as the media element's list of text tracks. The text tracks are sorted as follows:

  1. The text tracks corresponding to track element children of the media element, in tree order.
  2. Any text tracks added using the addTextTrack() method, in the order they were added, oldest first.
  3. Any media-resource-specific text tracks (text tracks corresponding to data in the media resource), in the order defined by the media resource's format specification.

A text track consists of:

The kind of text track

This decides how the track is handled by the user agent. The kind is represented by a string. The possible strings are:

The kind of track can change dynamically, in the case of a text track corresponding to a track element.

A label

This is a human-readable string intended to identify the track for the user. In certain cases, the label might be generated automatically.

The label of a track can change dynamically, in the case of a text track corresponding to a track element or in the case of an automatically-generated label whose value depends on variable factors such as the user's preferred user interface language.

A language

This is a string (a BCP 47 language tag) representing the language of the text track's cues. [BCP47]

The language of a text track can change dynamically, in the case of a text track corresponding to a track element.

A readiness state

One of the following:

Not loaded

Indicates that the text track is known to exist (e.g. it has been declared with a track element), but its cues have not been obtained.

Loading

Indicates that the text track is loading and there have been no fatal errors encountered so far. Further cues might still be added to the track.

Loaded

Indicates that the text track has been loaded with no fatal errors. No new cues will be added to the track except if the text track corresponds to a MutableTextTrack object.

Failed to load

Indicates that the text track was enabled, but when the user agent attempted to obtain it, this failed in some way (e.g. URL could not be resolved, network error, unknown text track format). Some or all of the cues are likely missing and will not be obtained.

The readiness state of a text track changes dynamically as the track is obtained.

A mode

One of the following:

Disabled

Indicates that the text track is not active. Other than for the purposes of exposing the track in the DOM, the user agent is ignoring the text track. No cues are active, no events are fired, and the user agent will not attempt to obtain the track's cues.

Hidden

Indicates that the text track is active, but that the user agent is not actively displaying the cues. If no attempt has yet been made to obtain the track's cues, the user agent will perform such an attempt momentarily. The user agent is maintaining a list of which cues are active, and events are being fired accordingly.

Showing
Showing by default

Indicates that the text track is active. If no attempt has yet been made to obtain the track's cues, the user agent will perform such an attempt momentarily. The user agent is maintaining a list of which cues are active, and events are being fired accordingly. In addition, for text tracks whose kind is subtitles or captions, the cues are being displayed over the video as appropriate; for text tracks whose kind is descriptions, the user agent is making the cues available to the user in a non-visual fashion; and for text tracks whose kind is chapters, the user agent is making available to the user a mechanism by which the user can navigate to any point in the media resource by selecting a cue.

The showing by default state is used in conjunction with the default attribute on track elements to indicate that the text track was enabled due to that attribute. This allows the user agent to override the state if a later track is discovered that is more appropriate per the user's preferences.

A list of zero or more cues

A list of text track cues, along with rules for updating the text track rendering.

The list of cues of a text track can change dynamically, either because the text track has not yet been loaded or is still loading, or because the text track corresponds to a MutableTextTrack object, whose API allows individual cues can be added or removed dynamically.

Each text track has a corresponding TextTrack object.

The text tracks of a media element are ready if all the text tracks whose mode was not in the disabled state when the element's resource selection algorithm last started now have a text track readiness state of loaded or failed to load.


A text track cue is the unit of time-sensitive data in a text track, corresponding for instance for subtitles and captions to the text that appears at a particular time and disappears at another time.

Each text track cue consists of:

An identifier

An arbitrary string.

A start time

A time, in seconds and fractions of a second, at which the cue becomes relevant.

An end time

A time, in seconds and fractions of a second, at which the cue stops being relevant.

A pause-on-exit flag

A boolean indicating whether playback of the media resource is to pause when the cue stops being relevant.

A writing direction

A writing direction, either horizontal (a line extends horizontally and is positioned vertically, with consecutive lines displayed below each other), vertical growing left (a line extends vertically and is positioned horizontally, with consecutive lines displayed to the left of each other), or vertical growing right (a line extends vertically and is positioned horizontally, with consecutive lines displayed to the right of each other).

A size

A number giving the size of the box within which the text of each line of the cue is to be aligned, to be interpreted as a percentage of the video, as defined by the writing direction.

The text of the cue

The raw text of the cue, and rules for its interpretation, allowing the text to be rendered and converted to a DOM fragment.

A text track cue is immutable.

Each text track cue has a corresponding TextTrackCue object, and can be associated with a particular text track. Once a text track cue is associated with a particular text track, the association is permanent.

In addition, each text track cue has two pieces of dynamic information:

The active flag

This flag must be initially unset. The flag is used to ensure events are fired appropriately when the cue becomes active or inactive, and to make sure the right cues are rendered.

The user agent must synchronously unset this flag whenever the text track cue is removed from its text track's text track list of cues; whenever the text track itself is removed from its media element's list of text tracks or has its text track mode changed to disabled; and whenever the media element's readyState is changed back to HAVE_NOTHING. When the flag is unset in this way for one or more cues in text tracks that were showing or showing by default prior to the relevant incident, the user agent must, after having unset the flag for all the affected cues, apply the rules for updating the text track rendering of those text tracks.

The display state

This is used as part of the rendering model, to keep cues in a consistent position. It must initially be empty. Whenever the text track cue active flag is unset, the user agent must empty the text track cue display state.

The text track cues of a media element's text tracks are ordered relative to each other in the text track cue order, which is determined as follows: first group the cues by their text track, with the groups being sorted in the same order as their text tracks appear in the media element's list of text tracks; then, within each group, cues must be sorted by their start time, earliest first; then, any cues with the same start time must be sorted by their end time, earliest first; and finally, any cues with identical end times must be sorted in the order they were created (so e.g. for cues from a WebVTT file, that would be the order in which the cues were listed in the file).

4.8.10.12.2 Sourcing in-band text tracks

A media-resource-specific text track is a text track that corresponds to data found in the media resource.

Rules for processing and rendering such data are defined by the relevant specifications, e.g. the specification of the video format if the media resource is a video.

When a media resource contains data that the user agent recognises and supports as being equivalent to a text track, the user agent runs the steps to expose a media-resource-specific text track with the relevant data, as follows:

  1. Associate the relevant data with a new text track and its corresponding new TextTrack object. The text track is a media-resource-specific text track.

  2. Set the new text track's kind, label, and language based on the semantics of the relevant data, as defined by the relevant specification.

  3. Populate the new text track's list of cues with the cues parsed so far, folllowing the guidelines for exposing cues, and begin updating it dynamically as necessary.

  4. Set the new text track's readiness state to the value that most correctly describes the current state, and begin updating it dynamically as necessary.

    For example, if the relevant data in the media resource has been fully parsed and completely describes the cues, then the text track would be loaded. On the other hand, if the data for the cues is interleaved with the media data, and the media resource as a whole is still being downloaded, then the loading state might be more accurate.

  5. Set the new text track's mode to the mode consistent with the user's preferences and the requirements of the relevant specification for the data.

  6. Leave the text track list of cues empty, and associate with it the rules for updating the text track rendering appropriate for the format in question.

  7. Add the new text track to the media element's list of text tracks.

When a media element is to forget the media element's media-resource-specific text tracks, the user agent must remove from the media element's list of text tracks all the media-resource-specific text tracks.

4.8.10.12.3 Sourcing out-of-band text tracks

When a track element is created, it must be associated with a new text track (with its value set as defined below) and its corresponding new TextTrack object.

The text track kind is determined from the state of the element's kind attribute according to the following table; for a state given in a cell of the first column, the kind is the string given in the second column:

State String
Subtitles subtitles
Captions captions
Descriptions descriptions
Chapters chapters
Metadata metadata

The text track label is the element's track label.

The text track language is the element's track language, if any, or the empty string otherwise.

As the kind, label, and srclang attributes are set, changed, or removed, the text track must update accordingly, as per the definitions above.

Changes to the track URL are handled in the algorithm below.

The text track list of cues is initially empty. It is dynamically modified when the referenced file is parsed. Associated with the list are the rules for updating the text track rendering appropriate for the format in question; for WebVTT, this is the rules for updating the display of WebVTT text tracks.

When a track element's parent element changes and the new parent is a media element, then the user agent must add the track element's corresponding text track to the media element's list of text tracks.

When a track element's parent element changes and the old parent was a media element, then the user agent must remove the track element's corresponding text track from the media element's list of text tracks.

When a text track corresponding to a track element is added to a media element's list of text tracks, the user agent must set the text track mode appropriately, as determined by the following conditions:

If the text track kind is subtitles or captions and the user has indicated an interest in having a track with this text track kind, text track language, and text track label enabled, and there is no other text track in the media element's list of text tracks with a text track kind of either subtitles or captions whose text track mode is showing
If the text track kind is descriptions and the user has indicated an interest in having text descriptions with this text track language and text track label enabled, and there is no other text track in the media element's list of text tracks with a text track kind of descriptions whose text track mode is showing
If the text track kind is chapters and the text track language is one that the user agent has reason to believe is appropriate for the user, and there is no other text track in the media element's list of text tracks with a text track kind of chapters whose text track mode is showing

Let the text track mode be showing.

If there is a text track in the media element's list of text tracks whose text track mode is showing by default, the user agent must furthermore change that text track's text track mode to hidden.

If the track element has a default attribute specified, and there is no other text track in the media element's list of text tracks whose text track mode is showing or showing by default

Let the text track mode be showing by default.

Otherwise

Let the text track mode be disabled.

When a text track corresponding to a track element is created with text track mode set to hidden, showing, or showing by default, and when a text track corresponding to a track element is created with text track mode set to disabled and subsequently changes its text track mode to hidden, showing, or showing by default for the first time, the user agent must immediately and synchronously run the following algorithm. This algorithm interacts closely with the event loop mechanism; in particular, it has a synchronous section (which is triggered as part of the event loop algorithm). The step in that section is marked with ⌛.

  1. Set the text track readiness state to loading.

  2. Let URL be the track URL of the track element.

  3. Asynchronously run the remaining steps, while continuing with whatever task was responsible for creating the text track or changing the text track mode.

  4. Download: If URL is not the empty string, and its origin is the same as the media element's Document's origin, then fetch URL, from the media element's Document's origin, with the force same-origin flag set.

    The tasks queued by the fetching algorithm on the networking task source to process the data as it is being fetched must examine the resource's Content Type metadata, once it is available, if it ever is. If no Content Type metadata is ever available, or if the type is not recognised as a text track format, then the resource's format must be assumed to be unsupported (this causes the load to fail, as described below). If a type is obtained, and represents a supported text track format, then the resource's data must be passed to the appropriate parser as it is received, with the text track list of cues being used for that parser's output.

    If the fetching algorithm fails for any reason (network error, the server returns an error code, a cross-origin check fails, etc), or if URL is the empty string or has the wrong origin as determined by the condition at the start of this step, or if the fetched resource is not in a supported format, then queue a task to first change the text track readiness state to failed to load and then fire a simple event named error at the track element; and then, once that task is queued, move on to the step below labeled monitoring.

    If the fetching algorithm does not fail, then, when it completes, queue a task to first change the text track readiness state to loaded and then fire a simple event named load at the track element; and then, once that task is queued, move on to the step below labeled monitoring.

    If, while the fetching algorithm is active, either:

    ...then the user agent must run the following steps:

    1. Abort the fetching algorithm.

    2. Queue a task to fire a simple event named abort at the track element.

    3. Let URL be the new track URL.

    4. Jump back to the top of the step labeled download.

    Until one of the above circumstances occurs, the user agent must remain on this step.

  5. Monitoring: Wait until the track URL is no longer equal to URL, at the same time as the text track mode is set to hidden, showing, or showing by default.

  6. Wait until the text track readiness state is no longer set to loading.

  7. Await a stable state. The synchronous section consists of the following step. (The step in the synchronous section is marked with ⌛.)

  8. ⌛ Set the text track readiness state to loading.

  9. End the synchronous section, continuing the remaining steps asynchronously.

  10. Jump to the step labeled download.

4.8.10.12.4 Text track API
media . textTracks . length

Returns the number of text tracks associated with the media element (e.g. from track elements). This is the number of text tracks in the media element's list of text tracks.

media . textTracks[ n ]

Returns the TextTrack object representing the nth text track in the media element's list of text tracks.

track . track

Returns the TextTrack object representing the track element's text track.

The textTracks attribute of media elements must return an array host object for objects of type TextTrack that is fixed length and read only. The same object must be returned each time the attribute is accessed. [WEBIDL]

The array must contain the TextTrack objects of the text tracks in the media element's list of text tracks, in the same order as in the list of text tracks.


interface TextTrack {
  readonly attribute DOMString kind;
  readonly attribute DOMString label;
  readonly attribute DOMString language;

  const unsigned short NONE = 0;
  const unsigned short LOADING = 1;
  const unsigned short LOADED = 2;
  const unsigned short ERROR = 3;
  readonly attribute unsigned short readyState;
           attribute Function onload;
           attribute Function onerror;

  const unsigned short OFF = 0;
  const unsigned short HIDDEN = 1;
  const unsigned short SHOWING = 2;
           attribute unsigned short mode;

  readonly attribute TextTrackCueList cues;
  readonly attribute TextTrackCueList activeCues;

           attribute Function oncuechange;
};
TextTrack implements EventTarget;
textTrack . kind

Returns the text track kind string.

textTrack . label

Returns the text track label.

textTrack . language

Returns the text track language string.

textTrack . readyState

Returns the text track readiness state, represented by a number from the following list:

TextTrack . NONE (0)

The text track not loaded state.

TextTrack . LOADING (1)

The text track loading state.

TextTrack . LOADED (2)

The text track loaded state.

TextTrack . ERROR (3)

The text track failed to load state.

textTrack . mode

Returns the text track mode, represented by a number from the following list:

TextTrack . OFF (0)

The text track disabled mode.

TextTrack . HIDDEN (1)

The text track hidden mode.

TextTrack . SHOWING (2)

The text track showing and showing by default modes.

Can be set, to change the mode.

textTrack . cues

Returns the text track list of cues, as a TextTrackCueList object.

textTrack . activeCues

Returns the text track cues from the text track list of cues that are currently active (i.e. that start before the current playback position and end after it), as a TextTrackCueList object.

The kind attribute must return the text track kind of the text track that the TextTrack object represents.

The label attribute must return the text track label of the text track that the TextTrack object represents.

The language attribute must return the text track language of the text track that the TextTrack object represents.

The readyState attribute must return the numeric value corresponding to the text track readiness state of the text track that the TextTrack object represents, as defined by the following list:

NONE (numeric value 0)
The text track not loaded state.
LOADING (numeric value 1)
The text track loading state.
LOADED (numeric value 2)
The text track loaded state.
ERROR (numeric value 3)
The text track failed to load state.

The mode attribute, on getting, must return the numeric value corresponding to the text track mode of the text track that the TextTrack object represents, as defined by the following list:

OFF (numeric value 0)
The text track disabled mode.
HIDDEN (numeric value 1)
The text track hidden mode.
SHOWING (numeric value 2)
The text track showing and showing by default modes.

On setting, if the new value is not either 0, 1, or 2, the user agent must throw an INVALID_ACCESS_ERR exception. Otherwise, if the new value isn't equal to what the attribute would currently return, the new value must be processed as follows:

If the new value is 0

Set the text track mode of the text track that the TextTrack object represents to the text track disabled mode.

If the new value is 1

Set the text track mode of the text track that the TextTrack object represents to the text track hidden mode.

If the new value is 2

Set the text track mode of the text track that the TextTrack object represents to the text track showing mode.

If the mode had been showing by default, this will change it to showing, even though the value of mode would appear not to change.

If the text track mode of the text track that the TextTrack object represents is not the text track disabled mode, then the cues attribute must return a live TextTrackCueList object that represents the subset of the text track list of cues of the text track that the TextTrack object represents whose start times occur before the earliest possible position when the script started, in text track cue order. Otherwise, it must return null. When an object is returned, the same object must be returned each time.

The earliest possible position when the script started is whatever the earliest possible position was the last time the event loop reached step 1.

If the text track mode of the text track that the TextTrack object represents is not the text track disabled mode, then the activeCues attribute must return a live TextTrackCueList object that represents the subset of the text track list of cues of the text track that the TextTrack object represents whose active flag was set when the script started, in text track cue order. Otherwise, it must return null. When an object is returned, the same object must be returned each time.

A text track cue's active flag was set when the script started if its text track cue active flag was set the last time the event loop reached step 1.


interface MutableTextTrack : TextTrack {
 void addCue(in TextTrackCue cue);
 void removeCue(in TextTrackCue cue);
};
mutableTextTrack = media . addTextTrack( kind [, label [, language ] ] )

Creates and returns a new MutableTextTrack object, which is also added to the media element's list of text tracks.

mutableTextTrack . addCue( cue )

Adds the given cue to mutableTextTrack's text track list of cues.

Raises an exception if the argument is null, associated with another text track, or already in the list of cues.

mutableTextTrack . removeCue( cue )

Removes the given cue from mutableTextTrack's text track list of cues.

Raises an exception if the argument is null, associated with another text track, or not in the list of cues.

The addTextTrack(kind, label, language) method of media elements, when invoked, must run the following steps:

  1. If kind is not one of the following strings, then throw a SYNTAX_ERR exception and abort these steps:

  2. If the label argument was omitted, let label be the empty string.

  3. If the language argument was omitted, let language be the empty string.

  4. Create a new text track, and set its text track kind to kind, its text track label to label, its text track language to language, its text track readiness state to the text track loaded state, its text track mode to the text track hidden mode, and its text track list of cues to an empty list.

  5. Add the new text track to the media element's list of text tracks.

The addCue(cue) method of MutableTextTrack objects, when invoked, must run the following steps:

  1. If cue is null, then throw an INVALID_ACCESS_ERR exception and abort these steps.

  2. If the given cue is already associated with a text track other than the method's MutableTextTrack object's text track, then throw an INVALID_STATE_ERR exception and abort these steps.

  3. Associate cue with the method's MutableTextTrack object's text track, if it is not currently associated with a text track.

  4. If the given cue is already listed in the method's MutableTextTrack object's text track's text track list of cues, then throw an INVALID_STATE_ERR exception.

  5. Add cue to the method's MutableTextTrack object's text track's text track list of cues.

The removeCue(cue) method of MutableTextTrack objects, when invoked, must run the following steps:

  1. If cue is null, then throw an INVALID_ACCESS_ERR exception and abort these steps.

  2. If the given cue is not associated with the method's MutableTextTrack object's text track, then throw an INVALID_STATE_ERR exception.

  3. If the given cue is not currently listed in the method's MutableTextTrack object's text track's text track list of cues, then throw a NOT_FOUND_ERR exception.

  4. Remove cue from the method's MutableTextTrack object's text track's text track list of cues.

In this example, an audio element is used to play a specific sound-effect from a sound file containing many sound effects. A cue is used to pause the audio, so that it ends exactly at the end of the clip, even if the browser is busy running some script. If the page had relied on script to pause the audio, then the start of the next clip might be heard if the browser was not able to run the script at the exact time specified.

var sfx = new Audio('sfx.wav');
var sounds = a.addTextTrack('metadata');

// add sounds we care about
sounds.addCue(new TextTrackCue('dog bark', 12.783, 13.612, '', '', '', true));
sounds.addCue(new TextTrackCue('kitten mew', 13.612, 15.091, '', '', '', true));

function playSound(id) {
  sfx.currentTime = sounds.getCueById(id).startTime;
  sfx.play();
}

sfx.oncanplaythrough = function () {
  playSound('dog bark');
}
window.onbeforeunload = function () {
  playSound('kitten mew');
  return 'Are you sure you want to leave this awesome page?';
}

interface TextTrackCueList {
  readonly attribute unsigned long length;
  getter TextTrackCue (in unsigned long index);
  TextTrackCue getCueById(in DOMString id);
};
cuelist . length

Returns the number of cues in the list.

cuelist[index]

Returns the text track cue with index index in the list. The cues are sorted in text track cue order.

cuelist . getCueById( id )

Returns the first text track cue (in text track cue order) with text track cue identifier id.

Returns null if none of the cues have the given identifier or if the argument is the empty string.

A TextTrackCueList object represents a dynamically updating list of text track cues in a given order.

The length attribute must return the number of cues in the list represented by the TextTrackCueList object.

The supported property indicies of a TextTrackCueList object at any instant are the numbers from zero to the number of cues in the list represented by the TextTrackCueList object minus one, if any. If there are no cues in the list, there are no supported property indicies.

To determine the value of an indexed property for a given index index, the user agent must return the indexth text track cue in the list represented by the TextTrackCueList object.

The getCueById(id) method, when called with an argument other than the empty string, must return the first text track cue in the list represented by the TextTrackCueList object whose text track cue identifier is id, if any, or null otherwise. If the argument is the empty string, then the method must return null.


interface TextTrackCue {
  readonly attribute TextTrack track;
  readonly attribute DOMString id;

  readonly attribute double startTime;
  readonly attribute double endTime;
  readonly attribute boolean pauseOnExit;


  DOMString getCueAsSource();
  DocumentFragment getCueAsHTML();

           attribute Function onenter;
           attribute Function onexit;
};
TextTrackCue implements EventTarget;
cue . track

Returns the TextTrack object to which this text track cue belongs, if any, or null otherwise.

cue . id

Returns the text track cue identifier.

cue . startTime

Returns the text track cue start time, in seconds.

cue . endTime

Returns the text track cue end time, in seconds.

cue . pauseOnExit

Returns true if the text track cue pause-on-exit flag is set, false otherwise.

source = cue . getCueAsSource()

Returns the text track cue text in raw unparsed form.

fragment = cue . getCueAsHTML()

Returns the text track cue text as a DocumentFragment of HTML elements and other DOM nodes.

The track attribute must return the TextTrack object of the text track with which the text track cue that the TextTrackCue object represents is associated, if any; or null otherwise.

The id attribute must return the text track cue identifier of the text track cue that the TextTrackCue object represents.

The startTime attribute must return the text track cue start time of the text track cue that the TextTrackCue object represents, in seconds.

The endTime attribute must return the text track cue end time of the text track cue that the TextTrackCue object represents, in seconds.

The pauseOnExit attribute must return true if the text track cue pause-on-exit flag of the text track cue that the TextTrackCue object represents is set; or false otherwise.

The direction attribute must return the text track cue writing direction of the text track cue that the TextTrackCue object represents.

The getCueAsSource() method must return the raw text track cue text.

The getCueAsHTML() method must convert the text track cue text to a DocumentFragment for the media element's Document, using the appropriate rules for doing so.

4.8.10.12.5 Event definitions

The following are the event handlers that must be supported, as IDL attributes, by all objects implementing the TextTrack interface:

Event handler Event handler event type
onload load
onerror error
oncuechange cuechange

The following are the event handlers that must be supported, as IDL attributes, by all objects implementing the TextTrackCue interface:

Event handler Event handler event type
onenter enter
onexit exit
4.8.10.13 User interface

The controls attribute is a boolean attribute. If present, it indicates that the author has not provided a scripted controller and would like the user agent to provide its own set of controls.

If the attribute is present, or if scripting is disabled for the media element, then the user agent should expose a user interface to the user. This user interface should include features to begin playback, pause playback, seek to an arbitrary position in the content (if the content supports arbitrary seeking), change the volume, change the display of closed captions or embedded sign-language tracks, select different audio tracks or turn on audio descriptions, and show the media content in manners more suitable to the user (e.g. full-screen video or in an independent resizable window). Other controls may also be made available.

If the media element has a current media controller, then the user agent should expose audio tracks from all the slaved media elements (although avoiding duplicates if the same media resource is being used several times). If a media resource's audio track exposed in this way has no known name, and it is the only audio track for a particular media element, the user agent should use the element's title attribute, if any, as the name (or as part of the name) of that track.

Even when the attribute is absent, however, user agents may provide controls to affect playback of the media resource (e.g. play, pause, seeking, and volume controls), but such features should not interfere with the page's normal rendering. For example, such features could be exposed in the media element's context menu.

Where possible (specifically, for starting, stopping, pausing, and unpausing playback, for seeking, for changing the rate of playback, for fast-forwarding or rewinding, for listing, enabling, and disabling text tracks, and for muting or changing the volume of the audio), user interface features exposed by the user agent must be implemented in terms of the DOM API described above, so that, e.g., all the same events fire.

When a media element has a current media controller, the user agent's user interface for pausing and unpausing playback, for seeking, for changing the rate of playback, for fast-forwarding or rewinding, and for muting or changing the volume of audio of the entire group must be implemented in terms of the MediaController API exposed on that current media controller.

The "play" function in the user agent's interface must set the playbackRate attribute to the value of the defaultPlaybackRate attribute before invoking the play() method. When a media element has a current media controller, the attributes and method with those names on that MediaController object must be used. Otherwise, the attributes and method with those names on the media element itself must be used.

Features such as fast-forward or rewind must be implemented by only changing the playbackRate attribute (and not the defaultPlaybackRate attribute). Again, when a media element has a current media controller, the attributes with those names on that MediaController object must be used; otherwise, the attributes with those names on the media element itself must be used.

When a media element has a current media controller, and all the slaved media elements of that MediaController are paused, the user agent should unpause all the slaved media elements when the user invokes a user agent interface control for beginning playback.

When a media element has a current media controller, seeking must be implemented in terms of the seek() method on that MediaController object. Otherwise, the user agent must directly seek to the requested position in the media element's media timeline.

When a media element has a current media controller, user agents may additionally provide the user with controls that directly manipulate an individual media element without affecting the MediaController, but such features are considered relatively advanced and unlikely to be useful to most users.

For the purposes of listing chapters in the media resource, only text tracks in the media element's list of text tracks showing or showing by default and whose text track kind is chapters should be used. Each cue in such a text track represents a chapter starting at the cue's start time. The name of the chapter is the text track cue text, interpreted literally.

The controls IDL attribute must reflect the content attribute of the same name.


media . volume [ = value ]

Returns the current playback volume, as a number in the range 0.0 to 1.0, where 0.0 is the quietest and 1.0 the loudest.

Can be set, to change the volume.

Throws an INDEX_SIZE_ERR if the new value is not in the range 0.0 .. 1.0.

media . muted [ = value ]

Returns true if audio is muted, overriding the volume attribute, and false if the volume attribute is being honored.

Can be set, to change whether the audio is muted or not.

The volume attribute must return the playback volume of any audio portions of the media element, in the range 0.0 (silent) to 1.0 (loudest). Initially, the volume must be 1.0, but user agents may remember the last set value across sessions, on a per-site basis or otherwise, so the volume may start at other values. On setting, if the new value is in the range 0.0 to 1.0 inclusive, the attribute must be set to the new value. If the new value is outside the range 0.0 to 1.0 inclusive, then, on setting, an INDEX_SIZE_ERR exception must be raised instead.

The muted attribute must return true if the audio channels are muted and false otherwise. Initially, the audio channels should not be muted (false), but user agents may remember the last set value across sessions, on a per-site basis or otherwise, so the muted state may start as muted (true). On setting, the attribute must be set to the new value.

Whenever either the muted or volume attributes are changed, the user agent must queue a task to fire a simple event named volumechange at the media element.

An element's effective media volume is determined as follows:

  1. If the element's muted attribute is true, the element's effective media volume is zero. Abort these steps.

  2. If the element has a current media controller and that MediaController object's media controller mute override is true, the element's effective media volume is zero. Abort these steps.

  3. Let volume be the value of the element's volume attribute.

  4. If the element has a current media controller, multiply volume by that MediaController object's media controller volume multiplier.

  5. The element's effective media volume is volume, interpreted relative to the range 0.0 to 1.0, with 0.0 being silent, and 1.0 being the loudest setting, values in between increasing in loudness. The range need not be linear. The loudest setting may be lower than the system's loudest possible setting; for example the user could have set a maximum volume.

The muted attribute on the video element controls the default state of the audio channel of the media resource, potentially overriding user preferences.

When a media element is created, if it has a muted attribute specified, the user agent must set the muted IDL attribute to true, overriding any user preference.

The defaultMuted IDL attribute must reflect the muted content attribute.

This attribute has no dynamic effect (it only controls the default state of the element).

This video (an advertisment) autoplays, but to avoid annoying users, it does so without sound, and allows the user to turn the sound on.

<video src="adverts.cgi?kind=video" controls autoplay loop muted></video>
4.8.10.14 Time ranges

Objects implementing the TimeRanges interface represent a list of ranges (periods) of time.

interface TimeRanges {
  readonly attribute unsigned long length;
  double start(in unsigned long index);
  double end(in unsigned long index);
};
media . length

Returns the number of ranges in the object.

time = media . start(index)

Returns the time for the start of the range with the given index.

Throws an INDEX_SIZE_ERR if the index is out of range.

time = media . end(index)

Returns the time for the end of the range with the given index.

Throws an INDEX_SIZE_ERR if the index is out of range.

The length IDL attribute must return the number of ranges represented by the object.

The start(index) method must return the position of the start of the indexth range represented by the object, in seconds measured from the start of the timeline that the object covers.

The end(index) method must return the position of the end of the indexth range represented by the object, in seconds measured from the start of the timeline that the object covers.

These methods must raise INDEX_SIZE_ERR exceptions if called with an index argument greater than or equal to the number of ranges represented by the object.

When a TimeRanges object is said to be a normalized TimeRanges object, the ranges it represents must obey the following criteria:

In other words, the ranges in such an object are ordered, don't overlap, aren't empty, and don't touch (adjacent ranges are folded into one bigger range).

The timelines used by the objects returned by the buffered, seekable and played IDL attributes of media elements must be that element's media timeline.

4.8.10.15 Event summary

This section is non-normative.

The following events fire on media elements as part of the processing model described above:

Event name Interface Dispatched when... Preconditions
loadstart Event The user agent begins looking for media data, as part of the resource selection algorithm. networkState equals NETWORK_LOADING
progress Event The user agent is fetching media data. networkState equals NETWORK_LOADING
suspend Event The user agent is intentionally not currently fetching media data, but does not have the entire media resource downloaded. networkState equals NETWORK_IDLE
abort Event The user agent stops fetching the media data before it is completely downloaded, but not due to an error. error is an object with the code MEDIA_ERR_ABORTED. networkState equals either NETWORK_EMPTY or NETWORK_IDLE, depending on when the download was aborted.
error Event An error occurs while fetching the media data. error is an object with the code MEDIA_ERR_NETWORK or higher. networkState equals either NETWORK_EMPTY or NETWORK_IDLE, depending on when the download was aborted.
emptied Event A media element whose networkState was previously not in the NETWORK_EMPTY state has just switched to that state (either because of a fatal error during load that's about to be reported, or because the load() method was invoked while the resource selection algorithm was already running). networkState is NETWORK_EMPTY; all the IDL attributes are in their initial states.
stalled Event The user agent is trying to fetch media data, but data is unexpectedly not forthcoming. networkState is NETWORK_LOADING.
loadedmetadata Event The user agent has just determined the duration and dimensions of the media resource and the text tracks are ready. readyState is newly equal to HAVE_METADATA or greater for the first time.
loadeddata Event The user agent can render the media data at the current playback position for the first time. readyState newly increased to HAVE_CURRENT_DATA or greater for the first time.
canplay Event The user agent can resume playback of the media data, but estimates that if playback were to be started now, the media resource could not be rendered at the current playback rate up to its end without having to stop for further buffering of content. readyState newly increased to HAVE_FUTURE_DATA or greater.
canplaythrough Event The user agent estimates that if playback were to be started now, the media resource could be rendered at the current playback rate all the way to its end without having to stop for further buffering. readyState is newly equal to HAVE_ENOUGH_DATA.
playing Event Playback is ready to start after having been paused or delayed due to lack of media data. readyState is newly equal to or greater than HAVE_FUTURE_DATA and paused is false, or paused is newly false and readyState is equal to or greater than HAVE_FUTURE_DATA. Even if this event fires, the element might still not be potentially playing, e.g. if the element is blocked on its media controller (e.g. because the current media controller is paused, or another slaved media element is stalled somehow, or because the media resource has no data corresponding to the media controller position), or the element is paused for user interaction.
waiting Event Playback has stopped because the next frame is not available, but the user agent expects that frame to become available in due course. readyState is equal to or less than HAVE_CURRENT_DATA, and paused is false. Either seeking is true, or the current playback position is not contained in any of the ranges in buffered. It is possible for playback to stop for other reasons without paused being false, but those reasons do not fire this event (and when those situations resolve, a separate playing event is not fired either): e.g. the element is newly blocked on its media controller, or playback ended, or playback stopped due to errors, or the element has paused for user interaction.
seeking Event The seeking IDL attribute changed to true and the seek operation is taking long enough that the user agent has time to fire the event.
seeked Event The seeking IDL attribute changed to false.
ended Event Playback has stopped because the end of the media resource was reached. currentTime equals the end of the media resource; ended is true.
durationchange Event The duration attribute has just been updated.
timeupdate Event The current playback position changed as part of normal playback or in an especially interesting way, for example discontinuously.
play Event The element is no longer paused. Fired after the play() method has returned, or when the autoplay attribute has caused playback to begin. paused is newly false.
pause Event The element has been paused. Fired after the pause() method has returned. paused is newly true.
ratechange Event Either the defaultPlaybackRate or the playbackRate attribute has just been updated.
volumechange Event Either the volume attribute or the muted attribute has changed. Fired after the relevant attribute's setter has returned.

The following events fire on MediaController objects:

Event name Interface Dispatched when...
emptied Event All the slaved media elements newly have readyState set to HAVE_NOTHING or greater, or there are no longer any slaved media elements.
loadedmetadata Event All the slaved media elements newly have readyState set to HAVE_METADATA or greater.
loadeddata Event All the slaved media elements newly have readyState set to HAVE_CURRENT_DATA or greater.
canplay Event All the slaved media elements newly have readyState set to HAVE_FUTURE_DATA or greater.
canplaythrough Event All the slaved media elements newly have readyState set to HAVE_ENOUGH_DATA or greater.
playing Event The MediaController is no longer a blocked media controller.
waiting Event The MediaController is now a blocked media controller.
ended Event All the slaved media elements have newly ended playback.
durationchange Event The duration attribute has just been updated.
timeupdate Event The media controller position changed.
play Event The paused attribute is newly false.
pause Event The paused attribute is newly true.
ratechange Event Either the defaultPlaybackRate attribute or the playbackRate attribute has just been updated.
volumechange Event Either the volume attribute or the muted attribute has just been updated.
4.8.10.16 Security and privacy considerations

The main security and privacy implications of the video and audio elements come from the ability to embed media cross-origin. There are two directions that threats can flow: from hostile content to a victim page, and from a hostile page to victim content.


If a victim page embeds hostile content, the threat is that the content might contain scripted code that attempts to interact with the Document that embeds the content. To avoid this, user agents must ensure that there is no access from the content to the embedding page. In the case of media content that uses DOM concepts, the embedded content must be treated as if it was in its own unrelated top-level browsing context.

For instance, if an SVG animation was embedded in a video element, the user agent would not give it access to the DOM of the outer page. From the perspective of scripts in the SVG resource, the SVG file would appear to be in a lone top-level browsing context with no parent.


If a hostile page embeds victim content, the threat is that the embedding page could obtain information from the content that it would not otherwise have access to. The API does expose some information: the existence of the media, its type, its duration, its size, and the performance characteristics of its host. Such information is already potentially problematic, but in practice the same information can more or less be obtained using the img element, and so it has been deemed acceptable.

However, significantly more sensitive information could be obtained if the user agent further exposes metadata within the content such as subtitles or chapter titles. This version of the API does not expose such information. Future extensions to this API will likely reuse a mechanism such as CORS to check that the embedded content's site has opted in to exposing such information. [CORS]

An attacker could trick a user running within a corporate network into visiting a site that attempts to load a video from a previously leaked location on the corporation's intranet. If such a video included confidential plans for a new product, then being able to read the subtitles would present a confidentiality breach.

4.8.10.17 Best practices for authors using media elements

This section is non-normative.

Playing audio and video resources on small devices such as set-top boxes or mobile phones is often constrained by limited hardware resources in the device. For example, a device might only support three simultaneous videos. For this reason, it is a good practice to release resources held by media elements when they are done playing, either by being very careful about removing all references to the element and allowing it to be garbage collected, or, even better, by removing the element's src attribute and any source element descendants, and invoking the element's load() method.

Similarly, when the playback rate is not exactly 1.0, hardware, software, or format limitations can cause video frames to be dropped and audio to be choppy or muted.

4.8.10.18 Best practices for implementors of media elements

This section is non-normative.

How accurately various aspects of the media element API are implemented is considered a quality-of-implementation issue.

For example, when implementing the buffered attribute, how precise an implementation reports the ranges that have been buffered depends on how carefully the user agent inspects the data. Since the API reports ranges as times, but the data is obtained in byte streams, a user agent receiving a variable-bit-rate stream might only be able to determine precise times by actually decoding all of the data. User agents aren't required to do this, however; they can instead return estimates (e.g. based on the average bit rate seen so far) which get revised as more information becomes available.

As a general rule, user agents are urged to be conservative rather than optimistic. For example, it would be bad to report that everything had been buffered when it had not.

Another quality-of-implementation issue would be playing a video backwards when the codec is designed only for forward playback (e.g. there aren't many key frames, and they are far apart, and the intervening frames only have deltas from the previous frame). User agents could do a poor job, e.g. only showing key frames; however, better implementations would do more work and thus do a better job, e.g. actually decoding parts of the video forwards, storing the complete frames, and then playing the frames backwards.

Similarly, while implementations are allowed to drop buffered data at any time (there is no requirement that a user agent keep all the media data obtained for the lifetime of the media element), it is again a quality of implementation issue: user agents with sufficient resources to keep all the data around are encouraged to do so, as this allows for a better user experience. For example, if the user is watching a live stream, a user agent could allow the user only to view the live video; however, a better user agent would buffer everything and allow the user to seek through the earlier material, pause it, play it forwards and backwards, etc.

When multiple tracks are synchronised with a MediaController, it is possible for scripts to add and remove media elements from the MediaController's list of slaved media elements, even while these tracks are playing. How smoothly the media plays back in such situations is another quality-of-implementation issue.


When a media element that is paused is removed from a document and not reinserted before the next time the event loop spins, implementations that are resource constrained are encouraged to take that opportunity to release all hardware resources (like video planes, networking resources, and data buffers) used by the media element. (User agents still have to keep track of the playback position and so forth, though, in case playback is later restarted.)