Media Capture and Streams

The initial author of this specification was Ian Hickson, Google Inc., with the following copyright statement:
© Copyright 2004-2011 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA. You are granted a license to use, reproduce and create derivative works of this document.

MediaStream API

Introduction

The two main components in the MediaStream API are the {{MediaStreamTrack}} and {{MediaStream}} interfaces. The {{MediaStreamTrack}} object represents media of a single type that originates from one media source in the [=User Agent=], e.g. video produced by a web camera. A {{MediaStream}} is used to group several {{MediaStreamTrack}} objects into one unit that can be recorded or rendered in a media element.

Each {{MediaStream}} can contain zero or more {{MediaStreamTrack}} objects. All tracks in a {{MediaStream}} are intended to be synchronized when rendered. This is not a hard requirement, since it might not be possible to synchronize tracks from sources that have different clocks. Different {{MediaStream}} objects do not need to be synchronized.

While the intent is to synchronize tracks, it could be better in some circumstances to permit tracks to lose synchronization. In particular, when tracks are remotely sourced and real-time [[?WEBRTC]], it can be better to allow loss of synchronization than to accumulate delays or risk glitches and other artifacts. Implementations are expected to understand the implications of choices regarding synchronization of playback and the effect that these have on user perception.

A single {{MediaStreamTrack}} can represent multi-channel content, such as stereo or 5.1 audio or stereoscopic video, where the channels have a well defined relationship to each other. Information about channels might be exposed through other APIs, such as [[?WEBAUDIO]], but this specification provides no direct access to channels.

A {{MediaStream}} object has an input and an output that represent the combined input and output of all the object's tracks. The output of the {{MediaStream}} controls how the object is rendered, e.g., what is saved if the object is recorded to a file or what is displayed if the object is used in a [^video^] element. A single {{MediaStream}} object can be attached to multiple different outputs at the same time.

A new {{MediaStream}} object can be created from existing media streams or tracks using the {{MediaStream/MediaStream()}} constructor. The constructor argument can either be an existing {{MediaStream}} object, in which case all the tracks of the given stream are added to the new {{MediaStream}} object, or an array of {{MediaStreamTrack}} objects. The latter form makes it possible to compose a stream from different source streams.

Both {{MediaStream}} and {{MediaStreamTrack}} objects can be cloned. A cloned {{MediaStream}} contains clones of all member tracks from the original stream. A cloned {{MediaStreamTrack}} has a set of constraints that is independent of the instance it is cloned from, which allows media from the same source to have different constraints applied for different consumers. The {{MediaStream}} object is also used in contexts outside {{MediaDevices/getUserMedia}}, such as [[?WEBRTC]].

The MediaStream constructor composes a new stream out of existing tracks. It takes an optional argument of type {{MediaStream}} or an array of {{MediaStreamTrack}} objects. When the constructor is invoked, the User Agent must run the following steps:

Let stream be a newly constructed {{MediaStream}} object.
Initialize stream.{{MediaStream/id}} attribute to a newly generated value.
If the constructor's argument is present, run the following steps:
1. Construct a set of tracks, tracks, based on the type of the argument:
  - A {{MediaStream}} object:
    
    Let tracks be a set containing all the {{MediaStreamTrack}} objects in the {{MediaStream}} track set.
  - A sequence of {{MediaStreamTrack}} objects:
    
    Let tracks be a set containing all the {{MediaStreamTrack}} objects in the provided sequence.
2. For each {{MediaStreamTrack}}, track, in tracks, run the following steps:
  1. If track is already in stream's [=track set=], skip track.
  2. Otherwise, add track to stream's [=track set=].
Return stream.
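
The following non-normative sketch models the constructor steps above in TypeScript. The names `SketchTrack` and `SketchStream` are hypothetical, a counter stands in for the newly generated id, and a `Set` stands in for the track set; a User Agent is free to implement the steps differently.

```typescript
// Hypothetical stand-in for a MediaStreamTrack; only the fields the sketch needs.
interface SketchTrack {
  readonly id: string;
  readonly kind: "audio" | "video";
}

let sketchStreamCounter = 0;

class SketchStream {
  // Assumption: a counter-based string stands in for the newly generated id.
  readonly id: string = `sketch-stream-${++sketchStreamCounter}`;
  // A Set models the track set: a track that is already present is skipped.
  private readonly trackSet = new Set<SketchTrack>();

  constructor(source?: SketchStream | readonly SketchTrack[]) {
    if (source === undefined) {
      return; // No argument: the stream starts with an empty track set.
    }
    // Step 1: derive the tracks from either form of the argument.
    const tracks = source instanceof SketchStream ? source.getTracks() : source;
    // Step 2: add each track; Set.add ignores tracks already in the set.
    for (const track of tracks) {
      this.trackSet.add(track);
    }
  }

  // Returns a fresh array on every call.
  getTracks(): SketchTrack[] {
    const snapshot: SketchTrack[] = [];
    this.trackSet.forEach((track) => snapshot.push(track));
    return snapshot;
  }
}
```

In this model, passing an existing stream shares the same track objects with the new stream, while passing an array allows tracks from different streams to be combined.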

The tracks of a {{MediaStream}} are stored in a track set. The track set MUST contain the {{MediaStreamTrack}} objects that correspond to the tracks of the stream. The relative order of the tracks in the set is User Agent defined and the API will never put any requirements on the order. The proper way to find a specific {{MediaStreamTrack}} object in the set is to look it up by its {{MediaStreamTrack/id}}.

An object that reads data from the output of a {{MediaStream}} is referred to as a {{MediaStream}} consumer. The list of {{MediaStream}} consumers currently includes media elements (such as [^video^] and [^audio^]) [[HTML]], Web Real-Time Communications (WebRTC; {{RTCPeerConnection}}) [[?WEBRTC]], media recording (MediaRecorder) [[?mediastream-recording]], image capture (ImageCapture) [[?image-capture]], and web audio ({{MediaStreamAudioSourceNode}}) [[?WEBAUDIO]].

{{MediaStream}} consumers must be able to handle tracks being added and removed. This behavior is specified per consumer.

A {{MediaStream}} object is said to be active when it has at least one {{MediaStreamTrack}} that has not [=MediaStreamTrack/ended=]. A {{MediaStream}} that does not have any tracks or only has tracks that are [= MediaStreamTrack/ended =] is inactive.

A {{MediaStream}} object is said to be audible when it has at least one {{MediaStreamTrack}} whose {{MediaStreamTrack/[[Kind]]}} is "audio" that has not [=MediaStreamTrack/ended=]. A {{MediaStream}} that does not have any audio tracks or only has audio tracks that are [=MediaStreamTrack/ended=] is inaudible.
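
The two definitions above can be expressed as predicates over a list of tracks. This is a non-normative sketch: `TrackState`, `isActiveStream`, and `isAudibleStream` are hypothetical names, and the interface only mirrors the `kind` and `readyState` attributes.

```typescript
// Structural subset of a track; field names mirror the kind and readyState attributes.
interface TrackState {
  readonly kind: string;
  readonly readyState: string;
}

// Active: at least one track has not ended.
function isActiveStream(tracks: readonly TrackState[]): boolean {
  return tracks.some((track) => track.readyState !== "ended");
}

// Audible: at least one audio track has not ended.
function isAudibleStream(tracks: readonly TrackState[]): boolean {
  return tracks.some((track) => track.kind === "audio" && track.readyState !== "ended");
}
```

An empty list is neither active nor audible, which matches the definitions of inactive and inaudible for a stream without tracks.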

The [=User Agent=] may update a {{MediaStream}}'s [=track set=] in response to, for example, an external event. This specification does not specify any such cases, but other specifications using the MediaStream API may. One such example is the WebRTC 1.0 [[?WEBRTC]] specification where the [=track set=] of a {{MediaStream}}, received from another peer, can be updated as a result of changes to the media session.

To add a track track to a {{MediaStream}} stream, the [=User Agent=] MUST run the following steps:

If track is already in stream's [=track set=], then abort these steps.
Add track to stream's [=track set=].
[= Fire a track event=] named {{addtrack}} with track at stream.

To remove a track track from a {{MediaStream}} stream, the [=User Agent=] MUST run the following steps:

If track is not in stream's [=track set=], then abort these steps.
[=MediaStream/Remove a track|Remove=] track from stream's [=track set=].
[= Fire a track event =] named {{removetrack}} with track at stream.
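
The add and remove steps above can be modeled as follows. This is a non-normative sketch: `EventedTrackSet` and its members are hypothetical names, and plain callbacks stand in for firing a track event.

```typescript
// Hypothetical stand-in for a track; identity is what matters for membership.
interface EventedTrack {
  readonly id: string;
}

type TrackEventName = "addtrack" | "removetrack";
type TrackListener = (eventName: TrackEventName, track: EventedTrack) => void;

class EventedTrackSet {
  private readonly members: EventedTrack[] = [];
  private readonly listeners: TrackListener[] = [];

  onTrackEvent(listener: TrackListener): void {
    this.listeners.push(listener);
  }

  addByUserAgent(track: EventedTrack): void {
    if (this.members.indexOf(track) !== -1) return; // already present: abort
    this.members.push(track);
    this.fire("addtrack", track);
  }

  removeByUserAgent(track: EventedTrack): void {
    const index = this.members.indexOf(track);
    if (index === -1) return; // not present: abort
    this.members.splice(index, 1);
    this.fire("removetrack", track);
  }

  size(): number {
    return this.members.length;
  }

  // Stand-in for "fire a track event": notifies every registered listener.
  private fire(eventName: TrackEventName, track: EventedTrack): void {
    for (const listener of this.listeners) listener(eventName, track);
  }
}
```

Because both algorithms abort early, adding a track twice or removing an absent track produces no event in this model.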

[Exposed=Window]
interface MediaStream : EventTarget {
  constructor();
  constructor(MediaStream stream);
  constructor(sequence<MediaStreamTrack> tracks);
  readonly attribute DOMString id;
  sequence<MediaStreamTrack> getAudioTracks();
  sequence<MediaStreamTrack> getVideoTracks();
  sequence<MediaStreamTrack> getTracks();
  MediaStreamTrack? getTrackById(DOMString trackId);
  undefined addTrack(MediaStreamTrack track);
  undefined removeTrack(MediaStreamTrack track);
  MediaStream clone();
  readonly attribute boolean active;
  attribute EventHandler onaddtrack;
  attribute EventHandler onremovetrack;
};

Constructors

{{MediaStream}}(): See the MediaStream constructor algorithm.

No parameters.
{{MediaStream}}(stream): See the MediaStream constructor algorithm.
{{MediaStream}}(tracks): See the MediaStream constructor algorithm.

Attributes

{{id}} of type {{DOMString}}, readonly

The id attribute MUST return the value to which it was initialized when the object was created.

When a {{MediaStream}} is created, the User Agent MUST generate an identifier string, and MUST initialize the object's {{id}} attribute to that string, unless the object is created as part of a special purpose algorithm that specifies how the stream id must be initialized. A good practice is to use a UUID [[rfc4122]], which is 36 characters long in its canonical form. To avoid fingerprinting, implementations SHOULD use the forms in section 4.4 or 4.5 of RFC 4122 when generating UUIDs.
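
Section 4.4 of RFC 4122 describes version 4 UUIDs, which are built from random or pseudo-random numbers. The following non-normative sketch shows how such an identifier could be formatted. `ByteSource`, `insecureBytes`, and `makeUuidV4` are hypothetical names, and `Math.random` is used only to keep the sketch free of dependencies; an implementation would be expected to use a stronger random source.

```typescript
// Source of random bytes; injectable so the formatting can be checked deterministically.
type ByteSource = (count: number) => number[];

// Assumption: Math.random keeps the sketch dependency-free. It is not
// cryptographically strong and is not suitable for production identifiers.
const insecureBytes: ByteSource = (count) => {
  const out: number[] = [];
  for (let i = 0; i < count; i++) out.push(Math.floor(Math.random() * 256));
  return out;
};

function makeUuidV4(randomBytes: ByteSource = insecureBytes): string {
  const hex = randomBytes(16).map((byte, index) => {
    let value = byte & 0xff;
    if (index === 6) value = (value & 0x0f) | 0x40; // version 4 in the high nibble of octet 6
    if (index === 8) value = (value & 0x3f) | 0x80; // RFC 4122 variant in the top two bits of octet 8
    return (value + 0x100).toString(16).slice(1); // two lowercase hex digits
  });
  // Canonical form: 8-4-4-4-12 hex digits, 36 characters including hyphens.
  return [
    hex.slice(0, 4).join(""),
    hex.slice(4, 6).join(""),
    hex.slice(6, 8).join(""),
    hex.slice(8, 10).join(""),
    hex.slice(10, 16).join("")
  ].join("-");
}
```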

An example of an algorithm that specifies how the stream id must be initialized is the algorithm to associate an incoming network component with a {{MediaStream}} object. [[?WEBRTC]]

active of type {{boolean}}, readonly

The {{active}} attribute MUST return true if this {{MediaStream}} is [= stream/active =] and false otherwise.

onaddtrack of type {{EventHandler}}

The event type of this event handler is {{addtrack}}.

onremovetrack of type {{EventHandler}}

The event type of this event handler is {{removetrack}}.

Methods

getAudioTracks()

Returns a sequence of {{MediaStreamTrack}} objects representing the audio tracks in this stream.

The {{getAudioTracks}} method MUST return a sequence that represents a snapshot of all the {{MediaStreamTrack}} objects in this stream's [=track set=] whose {{MediaStreamTrack/[[Kind]]}} is equal to "audio". The conversion from the [=track set=] to the sequence is [=User Agent=] defined and the order does not have to be stable between calls.

getVideoTracks()

Returns a sequence of {{MediaStreamTrack}} objects representing the video tracks in this stream.

The {{getVideoTracks}} method MUST return a sequence that represents a snapshot of all the {{MediaStreamTrack}} objects in this stream's [=track set=] whose {{MediaStreamTrack/[[Kind]]}} is equal to "video". The conversion from the [=track set=] to the sequence is [=User Agent=] defined and the order does not have to be stable between calls.

getTracks()

Returns a sequence of {{MediaStreamTrack}} objects representing all the tracks in this stream.

The {{getTracks}} method MUST return a sequence that represents a snapshot of all the {{MediaStreamTrack}} objects in this stream's [=track set=], regardless of {{MediaStreamTrack/[[Kind]]}}. The conversion from the [=track set=] to the sequence is User Agent defined and the order does not have to be stable between calls.

getTrackById()

The {{getTrackById}} method MUST return either a {{MediaStreamTrack}} object from this stream's [=track set=] whose {{MediaStreamTrack/[[Id]]}} is equal to trackId, or null, if no such track exists.
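
The snapshot and lookup behavior of the methods above can be modeled as follows. This is a non-normative sketch: `LookupTrack`, `snapshotTracks`, and `lookupTrackById` are hypothetical names, and a `Set` stands in for the track set.

```typescript
// Structural subset of a track; field names mirror the id and kind attributes.
interface LookupTrack {
  readonly id: string;
  readonly kind: string;
}

// Returns a fresh array on every call, so changing the result never changes the track set.
// Omitting kind models getTracks(); "audio" or "video" models the filtered variants.
function snapshotTracks(trackSet: ReadonlySet<LookupTrack>, kind?: string): LookupTrack[] {
  const snapshot: LookupTrack[] = [];
  trackSet.forEach((track) => {
    if (kind === undefined || track.kind === kind) snapshot.push(track);
  });
  return snapshot;
}

// Models getTrackById(): the matching track, or null if there is none.
function lookupTrackById(trackSet: ReadonlySet<LookupTrack>, trackId: string): LookupTrack | null {
  for (const track of snapshotTracks(trackSet)) {
    if (track.id === trackId) return track;
  }
  return null;
}
```

Since the order of the returned sequence is not required to be stable, code that needs a particular track is better served by the id lookup than by an index into the snapshot.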

addTrack()

Adds the given {{MediaStreamTrack}} to this {{MediaStream}}.

When the {{addTrack}} method is invoked, the [=User Agent=] MUST run the following steps:

Let track be the method's argument and stream the {{MediaStream}} object on which the method was called.
If track is already in stream's [=track set=], then abort these steps.
[=MediaStream/Add a track|Add=] track to stream's [=track set=].

removeTrack()

Removes the given {{MediaStreamTrack}} object from this {{MediaStream}}.

When the {{removeTrack}} method is invoked, the [=User Agent=] MUST run the following steps:

Let track be the method's argument and stream the {{MediaStream}} object on which the method was called.
If track is not in stream's [=track set=], then abort these steps.
[=MediaStream/Remove a track|Remove=] track from stream's [=track set=].

clone()

Clones the given {{MediaStream}} and all its tracks.

When the {{clone()}} method is invoked, the User Agent MUST run the following steps:

Let streamClone be a newly constructed {{MediaStream}} object.
Initialize streamClone.{{MediaStream.id}} to a newly generated value.
Clone each track in this {{MediaStream}} object and add the result to streamClone's track set.
Return streamClone.

A {{MediaStreamTrack}} object represents a media source in the [=User Agent=]. An example source is a device connected to the [=User Agent=]. Other specifications may define sources for {{MediaStreamTrack}} that override the behavior specified here. Several {{MediaStreamTrack}} objects can represent the same media source, e.g., when the user chooses the same camera in the UI shown by two consecutive calls to {{MediaDevices/getUserMedia()}}.

A {{MediaStreamTrack}} source defines the following properties:

A source has a MediaStreamTrack source type. It is set to either {{MediaStreamTrack}} or a subtype of {{MediaStreamTrack}}. By default, it is set to {{MediaStreamTrack}}.
A source has MediaStreamTrack source-specific construction steps that are executed when creating a {{MediaStreamTrack}} from a source. The steps take a newly created {{MediaStreamTrack}} as input. By default, the steps are empty.
A source has MediaStreamTrack source-specific clone steps that are executed when cloning a {{MediaStreamTrack}} of the given source. The steps take the source and destination {{MediaStreamTrack}}s as input. By default, the steps are empty.

The data from a {{MediaStreamTrack}} object does not necessarily have a canonical binary form; for example, it could just be "the video currently coming from the user's video camera". This allows [=User Agents=] to manipulate media in whatever fashion is most suitable on the user's platform.

A script can indicate that a {{MediaStreamTrack}} object no longer needs its source with the {{MediaStreamTrack/stop()}} method. When all tracks using a source have been stopped or ended by some other means, the source is stopped. If the source is a device exposed by {{MediaDevices/getUserMedia()}}, then when the source is stopped, the [=User Agent=] MUST run the following steps:

Let mediaDevices be the {{MediaDevices}} object in question.
Let deviceId be the source device's {{MediaDeviceInfo/deviceId}}.
Set mediaDevices.{{MediaDevices/[[devicesLiveMap]]}}[deviceId] to false.
If the [=permission state=] of the permission associated with the device's kind and deviceId for mediaDevices's [=relevant settings object=], is not {{PermissionState/"granted"}}, then set mediaDevices.{{MediaDevices/[[devicesAccessibleMap]]}}[deviceId] to false.

To create a MediaStreamTrack with an underlying source, and a mediaDevicesToTieSourceTo, run the following steps:

Let track be a new object of type source's [=MediaStreamTrack source type=].

Initialize track with the following internal slots:
- [[\Source]], initialized to source.
- [[\Id]], initialized to a newly generated unique identifier string. See the {{MediaStream.id}} attribute for guidelines on how to generate such an identifier.
- [[\Kind]], initialized to "audio" if source is an audio source, or "video" if source is a video source.
- [[\Label]], initialized to source's label, if provided by the User Agent, or "" otherwise. [=User Agents=] MAY label audio and video sources (e.g., "Internal microphone" or "External USB Webcam").
- [[\ReadyState]], initialized to {{MediaStreamTrackState/"live"}}.
- [[\Enabled]], initialized to true.
- [[\Muted]], initialized to true if source is [= source/muted =], and false otherwise.
- [[\Capabilities]], [[\Constraints]], and [[\Settings]], all initialized as specified in the {{ConstrainablePattern}}.
If mediaDevicesToTieSourceTo is not null, [=tie track source to `MediaDevices`=] with source and mediaDevicesToTieSourceTo.
Run source's [=MediaStreamTrack source-specific construction steps=] with track as parameter.
Return track.

To initialize the underlying source of track to source, run the following steps:

Initialize track.{{MediaStreamTrack/[[Source]]}} to source.
Initialize track's [[\Capabilities]], [[\Constraints]], and [[\Settings]], as specified in the {{ConstrainablePattern}}.

To tie track source to `MediaDevices`, given source and mediaDevices, run the following steps:

Add source to mediaDevices.{{MediaDevices/[[mediaStreamTrackSources]]}}.

To stop all sources of a [=global object=], named globalObject, the [=User Agent=] MUST run the following steps:

For each {{MediaStreamTrack}} object track whose relevant global object is globalObject, set track's {{MediaStreamTrack/[[ReadyState]]}} to {{MediaStreamTrackState/"ended"}}.
If globalObject is a {{Window}}, then for each source in globalObject's [=associated `MediaDevices`=].{{MediaDevices/[[mediaStreamTrackSources]]}}, [= source/stopped | stop =] source.

The [=User Agent=] MUST [=stop all sources=] of a globalObject in the following conditions:

If globalObject is a {{Window}} object and the [=unloading document cleanup steps=] are executed for its [=associated document=].
If globalObject is a {{WorkerGlobalScope}} object and its closing flag is set to true.

An implementation may use a per-source reference count to keep track of source usage, but the specifics are out of scope for this specification.
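
As an illustration of the reference-counting approach mentioned above, the following non-normative sketch stops a source once its last track has ended. `CountedSource` and its methods are hypothetical names, and the specifics remain out of scope for this specification.

```typescript
class CountedSource {
  private liveTrackCount = 0;
  private stopped = false;

  // Called when a track starts using this source.
  attach(): void {
    this.liveTrackCount += 1;
  }

  // Called when a track using this source is stopped or ends by other means.
  detach(): void {
    if (this.liveTrackCount === 0) return; // nothing attached: ignore
    this.liveTrackCount -= 1;
    if (this.liveTrackCount === 0) this.stopped = true; // last user gone: stop the source
  }

  isStopped(): boolean {
    return this.stopped;
  }
}
```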

To clone a track the [=User Agent=] MUST run the following steps:

Let track be the {{MediaStreamTrack}} object to be cloned.
Let source be track's {{MediaStreamTrack/[[Source]]}}.
Let trackClone be the result of [=create a MediaStreamTrack | creating a MediaStreamTrack=] with source and null.
Set trackClone's {{MediaStreamTrack/[[ReadyState]]}} to track's {{MediaStreamTrack/[[ReadyState]]}} value.
Set trackClone's [[\Capabilities]] to a clone of track's [[\Capabilities]].
Set trackClone's [[\Constraints]] to a clone of track's [[\Constraints]].
Set trackClone's [[\Settings]] to a clone of track's [[\Settings]].
Run source's [=MediaStreamTrack source-specific clone steps=] with track and trackClone as parameters.
Return trackClone.
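
The effect of the clone steps on identity, state, and constraints can be modeled as follows. This is a non-normative sketch: `CloneModelTrack` and `cloneModelTrack` are hypothetical names, a counter stands in for the generated id, only the constraints slot is modeled, and constraints are assumed to be plain JSON-like data.

```typescript
interface CloneModelTrack {
  id: string;
  source: object; // shared with the clone, never copied
  readyState: "live" | "ended";
  constraints: { [name: string]: unknown };
}

let cloneModelCounter = 0;

function cloneModelTrack(track: CloneModelTrack): CloneModelTrack {
  return {
    // Assumption: a counter-based string stands in for a newly generated id.
    id: `clone-model-${++cloneModelCounter}`,
    // The clone refers to the same source as the original.
    source: track.source,
    // The clone starts in the same state, so cloning an ended track yields an ended track.
    readyState: track.readyState,
    // Assumption: a JSON round trip is an adequate deep copy for plain dictionaries.
    constraints: JSON.parse(JSON.stringify(track.constraints))
  };
}
```

Because the constraints are copied rather than shared, changing them on the clone leaves the original untouched, which is what allows different consumers of one source to apply different constraints.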

Media Flow and Life-cycle

Media Flow

There are two dimensions related to the media flow for a {{MediaStreamTrackState/"live"}} {{MediaStreamTrack}}: muted / not muted, and enabled / disabled.

Muted refers to the input to the {{MediaStreamTrack}}. Live samples MUST NOT be made available to a {{MediaStreamTrack}} while it is [=MediaStreamTrack/muted=].

The [=MediaStreamTrack/muted=] state is outside the control of web applications, but can be observed by the application by reading the {{MediaStreamTrack/muted}} attribute and listening to the associated events {{mute}} and {{unmute}}. The reasons for a {{MediaStreamTrack}} to be muted are defined by its source.

For camera and microphone sources, the reasons to [=source/muted|mute=] are [=implementation-defined=]. This allows user agents to implement privacy mitigations in situations such as the user pushing a physical mute button on the microphone, the user closing a laptop lid with an embedded camera, the user toggling a control in the operating system, the user clicking a mute button in the [=User Agent=] chrome, or the [=User Agent=] muting on behalf of the user.

On some operating systems, microphone access may be taken away from the [=User Agent=] when another application with higher audio priority gets access to it, for instance when a phone call comes in on a mobile OS. The [=User Agent=] SHOULD provide this information to the web application through {{MediaStreamTrack/muted}} and its associated events.

Whenever the [=User Agent=] initiates such an [= implementation-defined=] change for camera or microphone sources, it MUST queue a task, using the user interaction task source, to [=MediaStreamTrack/set a track's muted state=] to the state desired by the user.

This does not apply to [=source|sources=] defined in other specifications. Other specifications need to define their own steps to [=MediaStreamTrack/set a track's muted state=] if desired.

To set a track's muted state to newState, the [=User Agent=] MUST run the following steps:

Let track be the {{MediaStreamTrack}} in question.
If track.{{MediaStreamTrack/[[Muted]]}} is already newState, then abort these steps.
Set track.{{MediaStreamTrack/[[Muted]]}} to newState.
If newState is true let eventName be {{mute}}, otherwise {{unmute}}.
[=Fire an event=] named eventName on track.
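
The steps above fire an event only when the muted state actually changes. A non-normative sketch, in which `MuteModelTrack`, `MuteEventName`, and `setMutedState` are hypothetical names and a callback stands in for firing the event:

```typescript
interface MuteModelTrack {
  muted: boolean;
}

type MuteEventName = "mute" | "unmute";

function setMutedState(
  track: MuteModelTrack,
  newState: boolean,
  fireEvent: (eventName: MuteEventName) => void
): void {
  if (track.muted === newState) return; // already in the requested state: abort
  track.muted = newState;
  const eventName: MuteEventName = newState ? "mute" : "unmute";
  fireEvent(eventName);
}
```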

Enabled/disabled on the other hand is available to the application to control (and observe) via the {{MediaStreamTrack/enabled}} attribute.

The result for the consumer is the same in the sense that whenever a {{MediaStreamTrack}} is muted or disabled (or both), the consumer gets zero-information-content, which means silence for audio and black frames for video. In other words, media from the source only flows when a {{MediaStreamTrack}} object is both unmuted and enabled. For example, a video element sourced by a muted or disabled {{MediaStreamTrack}} (contained in a {{MediaStream}}) is playing but rendering blackness.
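
The combination of the two dimensions can be written as a single predicate. This is a non-normative sketch: `FlowTrack` and `deliversSourceMedia` are hypothetical names, and the interface only mirrors the `readyState`, `muted`, and `enabled` attributes.

```typescript
// Structural subset of a track; field names mirror the attributes of the same names.
interface FlowTrack {
  readonly readyState: string;
  readonly muted: boolean;
  readonly enabled: boolean;
}

// True only when a live track is both unmuted and enabled; in every other
// case a consumer receives zero-information-content (silence or black frames).
function deliversSourceMedia(track: FlowTrack): boolean {
  return track.readyState === "live" && !track.muted && track.enabled;
}
```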

For a newly created {{MediaStreamTrack}} object, the following applies: the track is always enabled unless stated otherwise (for example when cloned) and the muted state reflects the state of the source at the time the track is created.

Life-cycle

A {{MediaStreamTrack}} has two states in its life-cycle: live and ended. A newly created {{MediaStreamTrack}} can be in either state depending on how it was created. For example, cloning an ended track results in a new ended track. The current state is reflected by the object's {{MediaStreamTrack/readyState}} attribute.

In the live state, the track is active and media is available for use by consumers (but may be replaced by zero-information-content if the {{MediaStreamTrack}} is [= MediaStreamTrack/muted =] or [= MediaStreamTrack/enabled | disabled =], see below).

A muted or disabled {{MediaStreamTrack}} renders either silence (audio), black frames (video), or a zero-information-content equivalent. For example, a video element sourced by a muted or disabled {{MediaStreamTrack}} (contained within a {{MediaStream}}) is playing, but the rendered content is the muted output.

If the source is a device exposed by `navigator.mediaDevices.`{{MediaDevices/getUserMedia()}}, then when a track becomes either muted or disabled, and this brings all tracks connected to the device to be either muted, disabled, or stopped, then the UA MAY, using the device's {{MediaDeviceInfo/deviceId}}, deviceId, set `navigator.mediaDevices.`{{MediaDevices/[[devicesLiveMap]]}}[deviceId] to false, provided the UA sets it back to true as soon as any unstopped track connected to this device becomes un-muted or enabled again.

When a {{MediaStreamTrackState/"live"}}, [= MediaStreamTrack/muted | unmuted =], and [= MediaStreamTrack/enabled =] track sourced by a device exposed by {{MediaDevices/getUserMedia()}} becomes either [= MediaStreamTrack/muted =] or [= MediaStreamTrack/enabled | disabled =], and this brings all tracks connected to the device (across all [=navigables=] the user agent operates) to be either muted, disabled, or stopped, then the UA SHOULD relinquish the device within 3 seconds while allowing time for a reasonably-observant user to become aware of the transition. The UA SHOULD attempt to reacquire the device as soon as any live track sourced by the device becomes both [= MediaStreamTrack/muted | unmuted =] and [= MediaStreamTrack/enabled =] again, provided that track's [=relevant global object=]'s [=associated `Document`=] [=Document/is in view=] at that time. If the document is not [=Document/is in view|in view=] at that time, the UA SHOULD instead queue a task to [=MediaStreamTrack/muted|mute=] the track, and not queue a task to [=MediaStreamTrack/muted|unmute=] it until the document comes [=Document/is in view|into view=]. If reacquiring the device fails, the UA MUST [= track ended by the User agent | end the track =]. (The UA MAY end it earlier should it detect a device problem, such as the device being physically removed.)

The intent is to give users the assurance of privacy that having physical camera (and microphone) hardware lights off brings, by aligning physical and logical “privacy indicators”, at least while the current document is the sole user of a device.

While other applications and documents using the device simultaneously may interfere with this intent at times, they do not interfere with the rules laid forth.

The muted/unmuted state of a track reflects whether the source provides any media at this moment. The enabled/disabled state is under application control and determines whether the track outputs media (to its consumers). Hence, media from the source only flows when a {{MediaStreamTrack}} object is both unmuted and enabled.

A {{MediaStreamTrack}} is [= MediaStreamTrack/muted =] when the source is muted, i.e. temporarily unable to provide the track with data. A track can be muted by a user. Often this action is outside the control of the application. This could be as a result of the user hitting a hardware switch or toggling a control in the operating system / [=User Agent=] chrome. A track can also be muted by the [=User Agent=].

Applications are able to [= MediaStreamTrack/enabled | enable =] or disable a {{MediaStreamTrack}} to prevent it from rendering media from the source. A muted track will, however, render silence and blackness regardless of the enabled state. A disabled track is logically equivalent to a muted track from a consumer's point of view.

A {{MediaStreamTrack}} object is said to end when the source of the track is disconnected or exhausted.

If all {{MediaStreamTrack}}s that are using the same source are [= MediaStreamTrack/ended =], the source will be [= source/stopped =].

After the application has invoked the {{MediaStreamTrack/stop()}} method on a {{MediaStreamTrack}} object, or once the [=source=] of a {{MediaStreamTrack}} permanently ends production of live samples to its tracks, whichever is sooner, a {{MediaStreamTrack}} is said to be ended.

For camera and microphone sources, the reasons for a source to [=MediaStreamTrack/ended|end=] besides {{MediaStreamTrack/stop()}} are [=implementation-defined=] (e.g., because the user rescinds the permission for the page to use the local camera, or because the User Agent has instructed the track to end for any reason).

When a {{MediaStreamTrack}} track ends for any reason other than the {{MediaStreamTrack/stop()}} method being invoked, the [=User Agent=] MUST queue a task that runs the following steps:

If track's {{MediaStreamTrack/[[ReadyState]]}} has the value {{MediaStreamTrackState/"ended"}} already, then abort these steps.
Set track's {{MediaStreamTrack/[[ReadyState]]}} to {{MediaStreamTrackState/"ended"}}.
Notify track's {{MediaStreamTrack/[[Source]]}} that track is [= MediaStreamTrack/ended =] so that the source may be [= source/stopped =], unless other {{MediaStreamTrack}} objects depend on it.
[=Fire an event=] named ended at track.
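
The steps above can be modeled as an idempotent function. This is a non-normative sketch: `EndModelTrack`, `EndHooks`, and `endTrackFromSource` are hypothetical names, callbacks stand in for notifying the source and firing the event, and the sketch runs synchronously instead of queuing a task.

```typescript
interface EndModelTrack {
  readyState: "live" | "ended";
}

// Stand-ins for the two side effects of the algorithm.
interface EndHooks {
  notifySource(track: EndModelTrack): void;
  fireEnded(track: EndModelTrack): void;
}

function endTrackFromSource(track: EndModelTrack, hooks: EndHooks): void {
  if (track.readyState === "ended") return; // already ended: abort
  track.readyState = "ended";
  hooks.notifySource(track); // the source may stop if no other track depends on it
  hooks.fireEnded(track);
}
```

Running the function twice on the same track notifies the source and fires the event only once.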

If the end of the track was reached due to a user request, the event source for this event is the user interaction event source.

To invoke the device permission revocation algorithm with permissionName, run the following steps:

Let tracks be the set of all currently {{MediaStreamTrackState/"live"}} MediaStreamTracks whose permission associated with this kind of track ("camera" or "microphone") matches permissionName.
For each track in tracks, end the track.
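
The revocation steps amount to a filter followed by ending each selected track. This is a non-normative sketch: all names are hypothetical, and the mapping of audio tracks to "microphone" and video tracks to "camera" is an assumption made for illustration.

```typescript
interface RevocableTrack {
  readonly kind: "audio" | "video";
  readyState: "live" | "ended";
}

type DevicePermissionName = "camera" | "microphone";

// Assumption: audio tracks are governed by "microphone", video tracks by "camera".
function permissionForKind(kind: "audio" | "video"): DevicePermissionName {
  return kind === "audio" ? "microphone" : "camera";
}

// Selects every live track governed by permissionName, ends each one through
// the supplied callback, and returns the selected tracks.
function revokeDevicePermission(
  allTracks: readonly RevocableTrack[],
  permissionName: DevicePermissionName,
  endTrack: (track: RevocableTrack) => void
): RevocableTrack[] {
  const tracks = allTracks.filter(
    (track) => track.readyState === "live" && permissionForKind(track.kind) === permissionName
  );
  for (const track of tracks) endTrack(track);
  return tracks;
}
```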

Tracks and Constraints

{{MediaStreamTrack}} is a constrainable object as defined in the Constrainable Pattern section. Constraints are set on tracks and may affect sources.

Whether Constraints were provided at track initialization time or need to be established later at runtime, the APIs defined in the ConstrainablePattern Interface allow the retrieval and manipulation of the constraints currently established on a track.

Once ended, a track will continue exposing a list of inherent constrainable track properties. This list contains deviceId, facingMode and groupId.

Interface Definition

[Exposed=Window]
interface MediaStreamTrack : EventTarget {
  readonly attribute DOMString kind;
  readonly attribute DOMString id;
  readonly attribute DOMString label;
  attribute boolean enabled;
  readonly attribute boolean muted;
  attribute EventHandler onmute;
  attribute EventHandler onunmute;
  readonly attribute MediaStreamTrackState readyState;
  attribute EventHandler onended;
  MediaStreamTrack clone();
  undefined stop();
  MediaTrackCapabilities getCapabilities();
  MediaTrackConstraints getConstraints();
  MediaTrackSettings getSettings();
  Promise<undefined> applyConstraints(optional MediaTrackConstraints constraints = {});
};

Attributes

{{kind}} of type {{DOMString}}, readonly

The kind attribute MUST return [=this=].{{MediaStreamTrack/[[Kind]]}}.

{{id}} of type {{DOMString}}, readonly

The id attribute MUST return [=this=].{{MediaStreamTrack/[[Id]]}}.

{{label}} of type {{DOMString}}, readonly

The label attribute MUST return [=this=].{{MediaStreamTrack/[[Label]]}}.

{{enabled}} of type {{boolean}}

The enabled attribute controls the [= MediaStreamTrack/enabled =] state for the object.

On getting, [=this=].{{MediaStreamTrack/[[Enabled]]}} MUST be returned. On setting, [=this=].{{MediaStreamTrack/[[Enabled]]}} MUST be set to the new value.

Thus, after a {{MediaStreamTrack}} has [= MediaStreamTrack/ended =], its {{MediaStreamTrack/enabled}} attribute still changes value when set; it just doesn't do anything with that new value.

{{muted}} of type {{boolean}}, readonly

The muted attribute reflects whether the track is [= MediaStreamTrack/muted =]. It MUST return [=this=].{{MediaStreamTrack/[[Muted]]}}.

onmute of type {{EventHandler}}

The event type of this event handler is mute.

onunmute of type {{EventHandler}}

The event type of this event handler is unmute.

{{readyState}} of type {{MediaStreamTrackState}}, readonly

On getting, the readyState attribute MUST return [=this=].{{MediaStreamTrack/[[ReadyState]]}}.

onended of type {{EventHandler}}

The event type of this event handler is ended.

Methods

clone

When the {{clone()}} method is invoked, the [=User Agent=] MUST return the result of [=clone a track=] with [=this=].

stop

When a {{MediaStreamTrack}} object's {{stop()}} method is invoked, the User Agent MUST run the following steps:

Let track be the current {{MediaStreamTrack}} object.
If track's {{MediaStreamTrack/[[ReadyState]]}} is {{MediaStreamTrackState/"ended"}}, then abort these steps.
Notify track's source that track is [= MediaStreamTrack/ended =].

A source that is notified of a track ending will be [= source/stopped =], unless other {{MediaStreamTrack}} objects depend on it.
Set track's {{MediaStreamTrack/[[ReadyState]]}} to {{MediaStreamTrackState/"ended"}}.

getCapabilities

Returns the capabilities of the source that this {{MediaStreamTrack}}, the constrainable object, represents.

See ConstrainablePattern Interface for the definition of this method.

Since this method gives likely persistent, cross-origin information about the underlying device, it adds to the fingerprint surface of the device.

getConstraints

See ConstrainablePattern Interface for the definition of this method.

getSettings

When a {{MediaStreamTrack}} object's {{MediaStreamTrack.getSettings()}} method is invoked, the [=User Agent=] MUST run the following steps:

Let track be the current {{MediaStreamTrack}} object.
If track's {{MediaStreamTrack/[[ReadyState]]}} is {{MediaStreamTrackState/"ended"}}, run the following sub steps:
1. Let settings be a new {{MediaTrackSettings}} dictionary.
2. For each property of the list of inherent constrainable track properties, add a corresponding property to settings if track had such property at the time it was ended, with the value at the time track was ended.
3. Return settings.
Return the current settings of the track as defined in ConstrainablePattern Interface.
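
The difference between an ended and a live track in the steps above can be modeled as follows. This is a non-normative sketch: `SettingsSketch` and `readSettings` are hypothetical names, and `lastKnown` is assumed to hold the settings as they were when the track ended (or the current settings for a live track).

```typescript
type SettingValue = string | number | boolean;
type SettingsSketch = { [name: string]: SettingValue };

// The inherent constrainable track properties listed in this specification.
const INHERENT_TRACK_PROPERTIES = ["deviceId", "facingMode", "groupId"];

function readSettings(readyState: string, lastKnown: SettingsSketch): SettingsSketch {
  // An ended track exposes only inherent properties; a live track exposes everything.
  const names = readyState === "ended" ? INHERENT_TRACK_PROPERTIES : Object.keys(lastKnown);
  const settings: SettingsSketch = {};
  for (const name of names) {
    const value = lastKnown[name];
    if (value !== undefined) settings[name] = value; // skip properties the track never had
  }
  return settings;
}
```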

applyConstraints

When a {{MediaStreamTrack}} object's {{applyConstraints()}} method is invoked, the User Agent MUST run the following steps:

Let track be the current {{MediaStreamTrack}} object.
If track's {{MediaStreamTrack/[[ReadyState]]}} is {{MediaStreamTrackState/"ended"}}, run the following sub steps:
1. Let p be a new promise.
2. [= resolve =] p with undefined.
3. Return p.
Invoke and return the result of the applyConstraints template method where:
- In the SelectSettings algorithm,
  - object is the {{MediaStreamTrack}} on which this method was called, and
  - settings dictionary refers to a possible instance of the {{MediaTrackSettings}} dictionary. The [=User Agent=] MUST NOT include inherent unchangeable device properties as members unless they are in the list of inherent constrainable track properties, or otherwise include device properties that must not be exposed.
    Other specifications may define constrainable properties that at times must not be exposed.
- In step 3 of the ApplyConstraints algorithm, all changes listed are to be made to object, and
- In step 4 of the ApplyConstraints algorithm, the requirement on getConstraints() applies to the getConstraints() method of object.

enum MediaStreamTrackState {
  "live",
  "ended"
};

MediaStreamTrackState Enumeration description
Enum value	Description
live	The track is active (the track's underlying media source is making a best-effort attempt to provide data in real time). The output of a track in the {{MediaStreamTrackState/"live"}} state can be switched on and off with the {{MediaStreamTrack/enabled}} attribute.
ended	The track has [= MediaStreamTrack/ended =] (the track's underlying media source is no longer providing data, and will never provide more data for this track). Once a track enters this state, it never exits it. For example, a video track in a {{MediaStream}} ends when the user unplugs the USB web camera that acts as the track's media source.

MediaTrackSupportedConstraints

{{MediaTrackSupportedConstraints}} represents the list of constraints recognized by a [=User Agent=] for controlling the Capabilities of a {{MediaStreamTrack}} object. This dictionary is used as a function return value, and never as an operation argument.

Future specifications can extend the {{MediaTrackSupportedConstraints}} dictionary by defining a partial dictionary with dictionary members of type {{boolean}}.

The constraints specified in this specification apply only to instances of {{MediaStreamTrack}} generated by {{MediaDevices.getUserMedia()}}, unless stated otherwise in other specifications.

dictionary MediaTrackSupportedConstraints {
  boolean width = true;
  boolean height = true;
  boolean aspectRatio = true;
  boolean frameRate = true;
  boolean facingMode = true;
  boolean resizeMode = true;
  boolean sampleRate = true;
  boolean sampleSize = true;
  boolean echoCancellation = true;
  boolean autoGainControl = true;
  boolean noiseSuppression = true;
  boolean latency = true;
  boolean channelCount = true;
  boolean deviceId = true;
  boolean groupId = true;
};

Dictionary {{MediaTrackSupportedConstraints}} Members

width of type {{boolean}}, defaulting to true: See width for details.
height of type {{boolean}}, defaulting to true: See height for details.
aspectRatio of type {{boolean}}, defaulting to true: See aspectRatio for details.
frameRate of type {{boolean}}, defaulting to true: See frameRate for details.
facingMode of type {{boolean}}, defaulting to true: See facingMode for details.
resizeMode of type {{boolean}}, defaulting to true: See resizeMode for details.
sampleRate of type {{boolean}}, defaulting to true: See sampleRate for details.
sampleSize of type {{boolean}}, defaulting to true: See sampleSize for details.
echoCancellation of type {{boolean}}, defaulting to true: See echoCancellation for details.
autoGainControl of type {{boolean}}, defaulting to true: See autoGainControl for details.
noiseSuppression of type {{boolean}}, defaulting to true: See noiseSuppression for details.
latency of type {{boolean}}, defaulting to true: See latency for details.
channelCount of type {{boolean}}, defaulting to true: See channelCount for details.
deviceId of type {{boolean}}, defaulting to true: See deviceId for details.
groupId of type {{boolean}}, defaulting to true: See groupId for details.
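
A script might use such a dictionary to find out which constraint names the [=User Agent=] would recognize before building a request. The following is a minimal sketch, assuming a flat constraint set (no `advanced` member); `partitionConstraints` is a hypothetical helper, the `sampleSupported` object stands in for the dictionary a User Agent would return, and `zoom` is used only as an example of a name that is absent from it.

```javascript
// Split a flat constraint set into members the User Agent recognizes
// and the names of members it does not.
// `supported` is expected to look like a MediaTrackSupportedConstraints dictionary.
function partitionConstraints(constraints, supported) {
  const recognized = {};
  const unrecognized = [];
  for (const [name, value] of Object.entries(constraints)) {
    if (supported[name] === true) {
      recognized[name] = value;
    } else {
      unrecognized.push(name);
    }
  }
  return { recognized, unrecognized };
}

// Stand-in for the dictionary returned by a User Agent.
const sampleSupported = { width: true, height: true, frameRate: true };
const partitioned = partitionConstraints({ width: 1280, zoom: 2 }, sampleSupported);
console.log(partitioned.recognized, partitioned.unrecognized);
```

In a browser, the dictionary would typically be obtained from `navigator.mediaDevices.getSupportedConstraints()` instead of being written by hand.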

MediaTrackCapabilities

{{MediaTrackCapabilities}} represents the Capabilities of a {{MediaStreamTrack}} object.

Future specifications can extend the MediaTrackCapabilities dictionary by defining a partial dictionary with dictionary members of appropriate type.

dictionary MediaTrackCapabilities {
  ULongRange width;
  ULongRange height;
  DoubleRange aspectRatio;
  DoubleRange frameRate;
  sequence<DOMString> facingMode;
  sequence<DOMString> resizeMode;
  ULongRange sampleRate;
  ULongRange sampleSize;
  sequence<boolean> echoCancellation;
  sequence<boolean> autoGainControl;
  sequence<boolean> noiseSuppression;
  DoubleRange latency;
  ULongRange channelCount;
  DOMString deviceId;
  DOMString groupId;
};

For historical reasons, {{MediaTrackCapabilities/deviceId}} and {{MediaTrackCapabilities/groupId}} are {{DOMString}} instead of the `sequence<DOMString>` expected by {{Capabilities}} in the ConstrainablePattern.

Dictionary {{MediaTrackCapabilities}} Members

width of type {{ULongRange}}: See width for details.
height of type {{ULongRange}}: See height for details.
aspectRatio of type {{DoubleRange}}: See aspectRatio for details.
frameRate of type {{DoubleRange}}: See frameRate for details.
facingMode of type sequence<{{DOMString}}>: A camera can report multiple facing modes. For example, in a high-end telepresence solution with several cameras facing the user, a camera to the left of the user can report both {{VideoFacingModeEnum/"left"}} and {{VideoFacingModeEnum/"user"}}. See facingMode for additional details.
resizeMode of type sequence<{{DOMString}}>: The [=User Agent=] MAY use cropping and downscaling to offer more resolution choices than this camera naturally produces. The reported sequence MUST list all the means the UA may employ to derive resolution choices for this camera. The value {{VideoResizeModeEnum/"none"}} MUST be present, indicating the ability to constrain the UA from cropping and downscaling. See resizeMode for additional details.
sampleRate of type {{ULongRange}}: See sampleRate for details.
sampleSize of type {{ULongRange}}: See sampleSize for details.
echoCancellation of type sequence<{{boolean}}>: If the source cannot do echo cancellation a single false is reported. If echo cancellation cannot be turned off, a single true is reported. If the script can control the feature, the source reports a list with both true and false as possible values. See echoCancellation for additional details.
autoGainControl of type sequence<{{boolean}}>: If the source cannot do auto gain control a single false is reported. If auto gain control cannot be turned off, a single true is reported. If the script can control the feature, the source reports a list with both true and false as possible values. See autoGainControl for additional details.
noiseSuppression of type sequence<{{boolean}}>: If the source cannot do noise suppression a single false is reported. If noise suppression cannot be turned off, a single true is reported. If the script can control the feature, the source reports a list with both true and false as possible values. See noiseSuppression for additional details.
latency of type {{DoubleRange}}: See latency for details.
channelCount of type {{ULongRange}}: See channelCount for details.
deviceId of type {{DOMString}}: See deviceId for details.
groupId of type {{DOMString}}: See groupId for details.
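
The three reporting cases described for echoCancellation, autoGainControl and noiseSuppression can be told apart with a small helper. This is an illustrative sketch only: `describeBooleanCapability` and the strings it returns are hypothetical names, not part of this specification.

```javascript
// Classify a sequence<boolean> capability as reported by a source.
function describeBooleanCapability(values) {
  if (!Array.isArray(values) || values.length === 0) {
    return "not reported";
  }
  const canBeOn = values.includes(true);
  const canBeOff = values.includes(false);
  if (canBeOn && canBeOff) {
    return "controllable"; // the script can turn the feature on or off
  }
  return canBeOn ? "always on" : "unavailable";
}

console.log(describeBooleanCapability([false]));       // "unavailable"
console.log(describeBooleanCapability([true]));        // "always on"
console.log(describeBooleanCapability([true, false])); // "controllable"
```

A script could pass in, for example, the `echoCancellation` member of the dictionary returned by {{MediaStreamTrack/getCapabilities()}}.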

MediaTrackConstraints

dictionary MediaTrackConstraints : MediaTrackConstraintSet {
  sequence<MediaTrackConstraintSet> advanced;
};

Dictionary {{MediaTrackConstraints}} Members

advanced of type sequence<{{MediaTrackConstraintSet}}>: See Constraints and ConstraintSet for the definition of this element.

Future specifications can extend the MediaTrackConstraintSet dictionary by defining a partial dictionary with dictionary members of appropriate type.

dictionary MediaTrackConstraintSet {
  ConstrainULong width;
  ConstrainULong height;
  ConstrainDouble aspectRatio;
  ConstrainDouble frameRate;
  ConstrainDOMString facingMode;
  ConstrainDOMString resizeMode;
  ConstrainULong sampleRate;
  ConstrainULong sampleSize;
  ConstrainBoolean echoCancellation;
  ConstrainBoolean autoGainControl;
  ConstrainBoolean noiseSuppression;
  ConstrainDouble latency;
  ConstrainULong channelCount;
  ConstrainDOMString deviceId;
  ConstrainDOMString groupId;
};

Dictionary {{MediaTrackConstraintSet}} Members

width of type {{ConstrainULong}}: See width for details.
height of type {{ConstrainULong}}: See height for details.
aspectRatio of type {{ConstrainDouble}}: See aspectRatio for details.
frameRate of type {{ConstrainDouble}}: See frameRate for details.
facingMode of type {{ConstrainDOMString}}: See facingMode for details.
resizeMode of type {{ConstrainDOMString}}: See resizeMode for details.
sampleRate of type {{ConstrainULong}}: See sampleRate for details.
sampleSize of type {{ConstrainULong}}: See sampleSize for details.
echoCancellation of type {{ConstrainBoolean}}: See echoCancellation for details.
autoGainControl of type {{ConstrainBoolean}}: See autoGainControl for details.
noiseSuppression of type {{ConstrainBoolean}}: See noiseSuppression for details.
latency of type {{ConstrainDouble}}: See latency for details.
channelCount of type {{ConstrainULong}}: See channelCount for details.
deviceId of type {{ConstrainDOMString}}: See deviceId for details.
groupId of type {{ConstrainDOMString}}: See groupId for details.

MediaTrackSettings

{{MediaTrackSettings}} represents the Settings of a {{MediaStreamTrack}} object.

Future specifications can extend the MediaTrackSettings dictionary by defining a partial dictionary with dictionary members of appropriate type.

dictionary MediaTrackSettings {
  unsigned long width;
  unsigned long height;
  double aspectRatio;
  double frameRate;
  DOMString facingMode;
  DOMString resizeMode;
  unsigned long sampleRate;
  unsigned long sampleSize;
  boolean echoCancellation;
  boolean autoGainControl;
  boolean noiseSuppression;
  double latency;
  unsigned long channelCount;
  DOMString deviceId;
  DOMString groupId;
};

Dictionary {{MediaTrackSettings}} Members

width of type {{unsigned long}}: See width for details.
height of type {{unsigned long}}: See height for details.
aspectRatio of type {{double}}: See aspectRatio for details.
frameRate of type {{double}}: See frameRate for details.
facingMode of type {{DOMString}}: See facingMode for details.
resizeMode of type {{DOMString}}: See resizeMode for details.
sampleRate of type {{unsigned long}}: See sampleRate for details.
sampleSize of type {{unsigned long}}: See sampleSize for details.
echoCancellation of type {{boolean}}: See echoCancellation for details.
autoGainControl of type {{boolean}}: See autoGainControl for details.
noiseSuppression of type {{boolean}}: See noiseSuppression for details.
latency of type {{double}}: See latency for details.
channelCount of type {{unsigned long}}: See channelCount for details.
deviceId of type {{DOMString}}: See deviceId for details.
groupId of type {{DOMString}}: See groupId for details.

Constrainable Properties

The names of the initial set of constrainable properties for MediaStreamTrack are defined below.

The following constrainable properties are defined to apply to both video and audio {{MediaStreamTrack}} objects:

Property Name	Type	Notes
deviceId	{{DOMString}}	The identifier of the device generating the content of the {{MediaStreamTrack}}. It conforms with the definition of {{MediaDeviceInfo.deviceId}}. Note that the setting of this property is uniquely determined by the source that is attached to the {{MediaStreamTrack}}. In particular, {{MediaStreamTrack/getCapabilities()}} will return only a single value for deviceId. This property can therefore be used for initial media selection with {{MediaDevices/getUserMedia()}}. However, it is not useful for subsequent media control with {{MediaStreamTrack/applyConstraints()}}, since any attempt to set a different value will result in an unsatisfiable ConstraintSet. If a string of length 0 is used as a deviceId value constraint with {{MediaDevices/getUserMedia()}}, it MAY be interpreted as if the constraint is not specified.
groupId	{{DOMString}}	The [=document=]-unique group identifier for the device generating the content of the {{MediaStreamTrack}}. It conforms with the definition of {{MediaDeviceInfo.groupId}}. Note that the setting of this property is uniquely determined by the source that is attached to the {{MediaStreamTrack}}. In particular, {{MediaStreamTrack/getCapabilities()}} will return only a single value for groupId. Since this property is not stable between browsing sessions, its usefulness for initial media selection with {{MediaDevices/getUserMedia()}} is limited. It is not useful for subsequent media control with {{MediaStreamTrack/applyConstraints()}}, since any attempt to set a different value will result in an unsatisfiable ConstraintSet.

The following constrainable properties are defined to apply only to video {{MediaStreamTrack}} objects:

Property Name	Type	Notes
width	{{unsigned long}}	The width, in pixels. As a capability, its valid range should span the video source's pre-set width values with min being equal to 1 and max being the largest width. The [=User Agent=] MUST support downsampling to any value between the min width range value and the native resolution width.
height	{{unsigned long}}	The height, in pixels. As a capability, its valid range should span the video source's pre-set height values with min being equal to 1 and max being the largest height. The [=User Agent=] MUST support downsampling to any value between the min height range value and the native resolution height.
frameRate	{{double}}	The frame rate (frames per second). If the video source's pre-sets can determine frame rates, then, as a capability, its valid range should span the video source's pre-set frame rate values with min being equal to 0 and max being the largest frame rate. The [=User Agent=] MUST support frame rates obtained from integral decimation of the native resolution frame rate. If frame rate cannot be determined (e.g. the source does not natively provide a frame rate, or the frame rate cannot be determined from the source stream), then the capability values MUST refer to the [=User Agent=]'s vsync display rate. As a setting, this value represents the configured frame rate. If decimation is used, this is that value rather than the native frame rate. For example, if the setting is 25 frames per second via decimation, the native frame rate of the camera is 30 frames per second but due to lighting conditions only 20 frames per second is achieved, {{frameRate}} reports the setting: 25 frames per second.
aspectRatio	{{double}}	The exact aspect ratio (width in pixels divided by height in pixels, represented as a double rounded to the tenth decimal place) or aspect ratio range.
facingMode	{{DOMString}}	This string is one of the members of {{VideoFacingModeEnum}}. The members describe the directions that the camera can face, as seen from the user's perspective. Note that `getConstraints` may not return exactly the same string for strings not in this enum. This preserves the possibility of using a future version of WebIDL enum for this property.
resizeMode	{{DOMString}}	This string is one of the members of {{VideoResizeModeEnum}}. The members describe the means by which the resolution can be derived by the UA. In other words, whether the UA is allowed to use cropping and downscaling on the camera output. The UA MAY disguise concurrent use of the camera, by downscaling, upscaling, and/or cropping to mimic native resolutions when "none" is used, but only when the camera is in use in another application outside the [=User Agent=]. Note that `getConstraints` may not return exactly the same string for strings not in this enum. This preserves the possibility of using a future version of WebIDL enum for this property.
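
Two of the numeric rules in the table above lend themselves to a short worked example: frame rates obtained by integral decimation of a native rate, and an aspect ratio rounded to the tenth decimal place. The sketch below is illustrative only; `decimatedFrameRates`, its `maxDivisor` cut-off and `roundedAspectRatio` are hypothetical helpers, and a [=User Agent=] may well offer other frame rates in addition to these.

```javascript
// Frame rates reachable by keeping every n-th frame of the native rate,
// for n = 1 .. maxDivisor (the cut-off is an arbitrary choice for this sketch).
function decimatedFrameRates(nativeRate, maxDivisor) {
  const rates = [];
  for (let divisor = 1; divisor <= maxDivisor; divisor++) {
    rates.push(nativeRate / divisor);
  }
  return rates;
}

// Width divided by height, rounded to the tenth decimal place.
function roundedAspectRatio(width, height) {
  return Math.round((width / height) * 1e10) / 1e10;
}

console.log(decimatedFrameRates(30, 4));   // [30, 15, 10, 7.5]
console.log(roundedAspectRatio(1920, 1080)); // 1.7777777778
```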

On systems where it's desirable to sometimes automatically flip the X and Y axis of the resulting captured video in response to ongoing environmental factors, the {{width}}, {{height}} and {{aspectRatio}} constraints and capabilities MUST remain unaffected in all algorithms and be considered in the primary orientation only, except for the {{MediaStreamTrack/getSettings()}} algorithm where settings for these constrainable properties MUST be flipped if necessary to match the returned dimensions of the captured video at any point in time.

The primary orientation of a system that supports flipping the X and Y axis of resulting captured video is defined by the User Agent for the particular system.

On systems that support automatic switching between landscape and portrait mode, [=User Agents=] are encouraged to make landscape mode the primary orientation.
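
The distinction between the primary orientation and the values reported by {{MediaStreamTrack/getSettings()}} can be sketched as follows. This is a simplified model, not a User Agent algorithm: `settingsForCurrentOrientation` and the `isFlipped` flag are hypothetical, and the flag stands in for whatever signal the system uses to decide that the X and Y axes are currently swapped.

```javascript
// Settings as stored in the primary orientation are reported unchanged
// unless the captured video is currently flipped, in which case width,
// height and aspectRatio are swapped to match the delivered frames.
function settingsForCurrentOrientation(primarySettings, isFlipped) {
  if (!isFlipped) {
    return { ...primarySettings };
  }
  return {
    ...primarySettings,
    width: primarySettings.height,
    height: primarySettings.width,
    aspectRatio: primarySettings.height / primarySettings.width,
  };
}

const landscapeSettings = { width: 1280, height: 720, aspectRatio: 1280 / 720 };
console.log(settingsForCurrentOrientation(landscapeSettings, true));
```

Constraints and capabilities are deliberately left out of this helper, since they stay expressed in the primary orientation.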

enum VideoFacingModeEnum {
  "user",
  "environment",
  "left",
  "right"
};

VideoFacingModeEnum Enumeration description
Enum value	Description
`user`	The source is facing toward the user (a self-view camera).
`environment`	The source is facing away from the user (viewing the environment).
`left`	The source is facing to the left of the user.
`right`	The source is facing to the right of the user.

Below is an illustration of the video facing modes in relation to the user.
Illustration of video facing modes in relation to user

enum VideoResizeModeEnum {
  "none",
  "crop-and-scale"
};

VideoResizeModeEnum Enumeration description
Enum value	Description
none	This resolution and frame rate is offered by the camera, its driver, or the OS. Note: The UA MAY report this value to disguise concurrent use, but only when the camera is in use in another [=navigable=].
crop-and-scale	This resolution is downscaled and/or cropped from a higher camera resolution by the [=User Agent=], or its frame rate is decimated by the [=User Agent=]. The media MUST NOT be upscaled, stretched or have fake data created that did not occur in the input source, except as noted below. Note: The UA MAY upscale to disguise concurrent use, but only when the camera is in use in another application outside the [=User Agent=].

The following constrainable properties are defined to apply only to audio {{MediaStreamTrack}} objects:

Property Name	Values	Notes
sampleRate	{{unsigned long}}	The sample rate in samples per second for the audio data.
sampleSize	{{unsigned long}}	The linear sample size in bits. As a constraint, it can only be satisfied for audio devices that produce linear samples.
echoCancellation	{{boolean}}	When one or more audio streams are being played in the processes of various microphones, it is often desirable to attempt to remove all the sound being played from the input signals recorded by the microphones. This is referred to as echo cancellation. There are cases where it is not needed and it is desirable to turn it off so that no audio artifacts are introduced. This allows applications to control this behavior.
autoGainControl	{{boolean}}	Automatic gain control is often desirable on the input signal recorded by the microphone. There are cases where it is not needed and it is desirable to turn it off so that the audio is not altered. This allows applications to control this behavior.
noiseSuppression	{{boolean}}	Noise suppression is often desirable on the input signal recorded by the microphone. There are cases where it is not needed and it is desirable to turn it off so that the audio is not altered. This allows applications to control this behavior.
latency	{{double}}	The latency or latency range, in seconds. The latency is the time from the start of processing (for instance, when sound occurs in the real world) to the data being available to the next step in the process. Low latency is critical for some applications; high latency may be acceptable for other applications because it helps with power constraints. The number is expected to be the target latency of the configuration; the actual latency may show some variation from that.
channelCount	{{unsigned long}}	The number of independent channels of sound that the audio data contains, i.e. the number of audio samples per sample frame.

The {{addtrack}} and {{removetrack}} events use the {{MediaStreamTrackEvent}} interface.

The {{addtrack}} and {{removetrack}} events notify the script that the [=track set=] of a {{MediaStream}} has been updated by the [=User Agent=].

Firing a track event named e with a {{MediaStreamTrack}} track means that an event with the name e, which does not bubble (except where otherwise stated) and is not cancelable (except where otherwise stated), and which uses the {{MediaStreamTrackEvent}} interface with the {{MediaStreamTrackEvent/track}} attribute set to track, MUST be created and dispatched at the given target.

[Exposed=Window]
interface MediaStreamTrackEvent : Event {
  constructor(DOMString type, MediaStreamTrackEventInit eventInitDict);
  [SameObject] readonly attribute MediaStreamTrack track;
};

Constructors

constructor(): Constructs a new {{MediaStreamTrackEvent}}.

Attributes

{{track}} of type {{MediaStreamTrack}}, readonly: The track attribute represents the {{MediaStreamTrack}} object associated with the event.

dictionary MediaStreamTrackEventInit : EventInit {
  required MediaStreamTrack track;
};

Dictionary MediaStreamTrackEventInit Members

track of type {{MediaStreamTrack}}, required

Attribute Name	Attribute Type	Setter/Getter Behavior When Provider is a MediaStream	Additional considerations
{{HTMLMediaElement/preload}}	{{DOMString}}	On getting: `none`. On setting: ignored.	A {{MediaStream}} cannot be preloaded.
{{HTMLMediaElement/buffered}}	{{TimeRanges}}	`buffered.length` MUST return `0`.	A {{MediaStream}} cannot be preloaded. Therefore, the amount buffered is always an empty time range.
{{HTMLMediaElement/currentTime}}	{{double}}	Any non-negative integer. The initial value is `0` and the value increments linearly in real time whenever the stream is playing.	The value is the official playback position, in seconds. Any attempt to alter it MUST be ignored.
{{HTMLMediaElement/seeking}}	{{boolean}}	`false`	A {{MediaStream}} is not seekable. Therefore, this attribute MUST always return the value `false`.
{{HTMLMediaElement/defaultPlaybackRate}}	{{double}}	On getting: `1.0`. On setting: ignored.	A {{MediaStream}} is not seekable. Therefore, this attribute MUST always return the value `1.0` and any attempt to alter it MUST be ignored. Note that this also means that the `ratechange` event will not fire.
{{HTMLMediaElement/playbackRate}}	{{double}}	On getting: `1.0`. On setting: ignored.	A {{MediaStream}} is not seekable. Therefore, this attribute MUST always return the value `1.0` and any attempt to alter it MUST be ignored. Note that this also means that the `ratechange` event will not fire.
{{HTMLMediaElement/played}}	{{TimeRanges}}	`played.length` MUST return `1`. `played.start(0)` MUST return `0`. `played.end(0)` MUST return the last known {{HTMLMediaElement/currentTime}}.	A {{MediaStream}}'s timeline always consists of a single range, starting at 0 and extending up to the currentTime.
{{HTMLMediaElement/seekable}}	{{TimeRanges}}	`seekable.length` MUST return `0`.	A {{MediaStream}} is not seekable.
{{HTMLMediaElement/loop}}	{{boolean}}	`true`, `false`	Setting the {{HTMLMediaElement/loop}} attribute has no effect since a {{MediaStream}} has no defined end and therefore cannot be looped.

Event name	Interface	Fired when...
addtrack	{{MediaStreamTrackEvent}}	A new {{MediaStreamTrack}} has been added to this stream. Note that this event is not fired when the script directly modifies the tracks of a {{MediaStream}}.
removetrack	{{MediaStreamTrackEvent}}	A {{MediaStreamTrack}} has been removed from this stream. Note that this event is not fired when the script directly modifies the tracks of a {{MediaStream}}.

Event name	Interface	Fired when...
mute	{{Event}}	The {{MediaStreamTrack}} object's source is temporarily unable to provide data.
unmute	{{Event}}	The {{MediaStreamTrack}} object's source is live again after having been temporarily unable to provide data.
ended	{{Event}}	The {{MediaStreamTrack}} object's source will no longer provide any data, either because the user revoked the permissions, or because the source device has been ejected, or because the remote peer permanently stopped sending data.

Event name	Interface	Fired when...
devicechange	{{DeviceChangeEvent}}	The set of media devices, available to the [=User Agent=], has changed. The current list of devices is available in the {{DeviceChangeEvent/devices}} attribute.

Enumerating Local Media Devices

This section describes an API that the script can use to query the User Agent about connected media input and output devices (for example a web camera or a headset).

`Navigator` Interface Extensions

Each {{Window}} has an associated `MediaDevices`, which is a {{MediaDevices}} object. Upon creation of the {{Window}} object, its [=associated `MediaDevices`=] MUST be set to a newly [=create a MediaDevices | created MediaDevices=] object with the {{Window}} object's [=relevant realm=].

partial interface Navigator {
  [SameObject, SecureContext] readonly attribute MediaDevices mediaDevices;
};

Attributes

mediaDevices of type {{MediaDevices}}, readonly: Return [=this=]'s [=relevant global object=]'s [=associated `MediaDevices`=].

The MediaDevices object is the entry point to the API used to examine and get access to media devices available to the [=User Agent=].

To create a MediaDevices object, given realm, run the following steps:

Let mediaDevices be a new {{MediaDevices}} object in realm, initialized with the following internal slots:
- [[\devicesLiveMap]], initialized to an empty [=ordered map | map=].
- [[\devicesAccessibleMap]], initialized to an empty [=ordered map | map=].
- [[\kindsAccessibleMap]], initialized to an empty [=ordered map | map=].
- [[\storedDeviceList]], initialized to a [=list=] of all media input and output devices available to the [=User Agent=].
- [[\canExposeCameraInfo]], initialized to false.
- [[\canExposeMicrophoneInfo]], initialized to false.
- [[\mediaStreamTrackSources]], initialized to an empty [=set=].
Let settings be mediaDevices's [=relevant settings object=].
For each kind of device, kind, that {{MediaDevices.getUserMedia()}} exposes, run the following step:
1. Set mediaDevices.{{MediaDevices/[[kindsAccessibleMap]]}}[kind] to either true if the [=permission state=] of the permission associated with kind (e.g. "camera", "microphone") for settings is {{PermissionState/"granted"}}, or to false otherwise.
For each individual device that {{MediaDevices.getUserMedia()}} exposes, using the device's deviceId, deviceId, run the following step:
1. Set mediaDevices.{{MediaDevices/[[devicesLiveMap]]}}[deviceId] to false, and set mediaDevices.{{MediaDevices/[[devicesAccessibleMap]]}}[deviceId] to either true if the [=permission state=] of the permission associated with the device’s kind and deviceId for settings, is {{PermissionState/"granted"}}, or to false otherwise.
Return mediaDevices.

For each kind of device, kind, that {{MediaDevices/getUserMedia()}} exposes, [=permission state|whenever a transition occurs of the permission state=] of the permission associated with kind for mediaDevices's [=relevant settings object=], run the following steps:

If the transition is to {{PermissionState/"granted"}} from another value, then set mediaDevices.{{MediaDevices/[[kindsAccessibleMap]]}}[kind] to true.
If the transition is from {{PermissionState/"granted"}} to another value, then set mediaDevices.{{MediaDevices/[[kindsAccessibleMap]]}}[kind] to false.

For each device that {{MediaDevices/getUserMedia()}} exposes, whenever a transition occurs of the [=permission state=] of the permission associated with the device's kind and the device's deviceId, deviceId, for mediaDevices's [=relevant settings object=], run the following steps:

If the transition is to {{PermissionState/"granted"}} from another value, then set mediaDevices.{{MediaDevices/[[devicesAccessibleMap]]}}[deviceId] to true, if it isn’t already true.
If the transition is from {{PermissionState/"granted"}} to another value, and the device is currently [= source/stopped =], then set mediaDevices.{{MediaDevices/[[devicesAccessibleMap]]}}[deviceId] to false.
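
The per-device transition steps above are asymmetric: a grant always marks the device accessible, while a revocation only clears the flag when the device is stopped. A minimal sketch of that logic follows, assuming a plain `Map` in place of the internal slot; `onDevicePermissionTransition` and the `isStopped` argument are hypothetical names introduced for illustration.

```javascript
// Model of the per-device permission transition steps.
// `devicesAccessibleMap` is a Map standing in for [[devicesAccessibleMap]].
function onDevicePermissionTransition(devicesAccessibleMap, deviceId, from, to, isStopped) {
  if (to === "granted" && from !== "granted") {
    devicesAccessibleMap.set(deviceId, true);
  } else if (from === "granted" && to !== "granted" && isStopped) {
    // A device that is still in use keeps its accessible flag.
    devicesAccessibleMap.set(deviceId, false);
  }
}

const accessibleModel = new Map([["cam-1", false]]);
onDevicePermissionTransition(accessibleModel, "cam-1", "prompt", "granted", true);
console.log(accessibleModel.get("cam-1")); // true
```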

When new media input and/or output devices are made available to the [=User Agent=], or any available input and/or output device becomes unavailable, or the system default for input and/or output devices of a {{MediaDeviceKind}} changes, the [=User Agent=] MUST run the following device change notification steps for each {{MediaDevices}} object, mediaDevices, for which [=device enumeration can proceed=] is true, but for no other {{MediaDevices}} object:

Let lastExposedDevices be the result of [=creating a list of device info objects=] with mediaDevices and mediaDevices.{{MediaDevices/[[storedDeviceList]]}}.
Let deviceList be the list of all media input and/or output devices available to the [=User Agent=].
Let newExposedDevices be the result of [=creating a list of device info objects=] with mediaDevices and deviceList.
If the {{MediaDeviceInfo}} objects in newExposedDevices match those in lastExposedDevices and have the same order, then abort these steps.

Due to the {{MediaDevices/enumerateDevices}} algorithm, the above step limits firing the devicechange event to documents [=allowed to use=] {{MediaDevices/enumerateDevices}} to enumerate devices of a particular {{MediaDeviceKind}}.
Set mediaDevices.{{MediaDevices/[[storedDeviceList]]}} to deviceList.
Queue a task that [= fire an event | fires an event=] named {{devicechange}}, using the {{DeviceChangeEvent}} constructor with {{DeviceChangeEventInit/devices}} initialized to newExposedDevices, at mediaDevices.
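
The effect of comparing exposed lists rather than raw device lists can be illustrated with a simplified model. In this sketch, which is not a User Agent implementation, `modelDeviceChange` returns an event-like object instead of queuing a task, `state.storedDeviceList` stands in for the internal slot, and the `createInfoList` callback is a hypothetical substitute for [=creating a list of device info objects=].

```javascript
// Compare two device info objects field by field.
function sameDeviceInfo(a, b) {
  return a.deviceId === b.deviceId && a.kind === b.kind &&
         a.label === b.label && a.groupId === b.groupId;
}

// Returns null when the exposed list is unchanged, otherwise an
// event-like object carrying the newly exposed devices.
function modelDeviceChange(state, deviceList, createInfoList) {
  const lastExposed = createInfoList(state.storedDeviceList);
  const newExposed = createInfoList(deviceList);
  const unchanged =
    lastExposed.length === newExposed.length &&
    lastExposed.every((info, i) => sameDeviceInfo(info, newExposed[i]));
  if (unchanged) {
    return null; // abort: the stored list is left untouched
  }
  state.storedDeviceList = deviceList;
  return { type: "devicechange", devices: newExposed };
}
```

With a `createInfoList` that exposes only cameras, plugging in a microphone changes the raw device list but not the exposed list, so no event results in this model.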

The [=User Agent=] MAY combine firing multiple events into firing one event when several events are due or when multiple devices are added or removed at the same time, e.g. a camera with a microphone.

Additionally, if a {{MediaDevices}} object that was traversed comes to meet the [=device enumeration can proceed=] criteria later (e.g. [=Document/is in view | comes into view=]), the [=User Agent=] MUST execute the [=device change notification steps=] on the {{MediaDevices}} object at that time.

These events are potentially triggered simultaneously on documents of different origins. [=User Agents=] MAY add fuzzing on the timing of events to avoid cross-origin activity correlation.

[Exposed=Window, SecureContext]
interface MediaDevices : EventTarget {
  attribute EventHandler ondevicechange;
  Promise<sequence<MediaDeviceInfo>> enumerateDevices();
};

Attributes

ondevicechange of type {{EventHandler}}: The event type of this event handler is devicechange.

Methods

enumerateDevices

Collects information about the [=User Agent=]'s available media input and output devices.

This method returns a promise. The promise will be [=upon fulfillment|fulfilled=] with a sequence of {{MediaDeviceInfo}} objects representing the [=User Agent=]'s available media input and output devices if enumeration is successful.

Elements of this sequence that represent input devices will be of type {{InputDeviceInfo}} which extends {{MediaDeviceInfo}}.

Camera and microphone sources SHOULD be enumerable. Specifications that add additional types of source will provide recommendations about whether the source type should be enumerable.

When the {{MediaDevices/enumerateDevices()}} method is called, the [=User Agent=] MUST run the following steps:

Let p be a new promise.
Let proceed be the result of [=device enumeration can proceed=] with [=this=].
Let mediaDevices be [=this=].
Run the following steps in parallel:
1. While proceed is `false`, the [=User Agent=] MUST wait to proceed to the next step until a task queued to set proceed to the result of [=device enumeration can proceed=] with mediaDevices, would set proceed to `true`.
2. Let resultList be the result of [=creating a list of device info objects=] with mediaDevices and mediaDevices.{{MediaDevices/[[storedDeviceList]]}}.
3. [= resolve =] p with resultList.
Return p.
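
Once the promise returned by {{MediaDevices/enumerateDevices()}} is fulfilled, a script will often sort the resulting sequence by device kind. The helper below is a sketch of that post-processing step, written as a pure function so it can run outside a browser; `groupDevicesByKind` is a hypothetical name, and the sample objects only imitate the `kind` and `deviceId` attributes of {{MediaDeviceInfo}}.

```javascript
// Group device info objects by their kind, preserving the order in
// which the User Agent returned them.
function groupDevicesByKind(devices) {
  const groups = { audioinput: [], audiooutput: [], videoinput: [] };
  for (const device of devices) {
    if (Object.prototype.hasOwnProperty.call(groups, device.kind)) {
      groups[device.kind].push(device);
    }
  }
  return groups;
}

// In a browser the list would come from:
//   const devices = await navigator.mediaDevices.enumerateDevices();
const sampleDevices = [
  { kind: "audioinput", deviceId: "mic-default" },
  { kind: "videoinput", deviceId: "cam-front" },
  { kind: "audioinput", deviceId: "mic-usb" },
];
console.log(groupDevicesByKind(sampleDevices).audioinput.map((d) => d.deviceId));
```

Because the order within each kind is preserved, the first entry of each group is the one the [=User Agent=] listed first for that kind.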

To perform creating a list of device info objects, given mediaDevices and deviceList, run the following steps:

Let resultList be an empty list.
Let microphoneList, cameraList and otherDeviceList be empty lists.
Let document be mediaDevices's [=relevant global object=]'s [=associated `Document`=].
Run the following sub steps for each discovered device in deviceList, device:
1. If device is not a microphone, or document is not [=allowed to use=] the feature identified by "microphone", abort these sub steps and continue with the next device (if any).
2. Let deviceInfo be the result of [=creating a device info object=] to represent device, with mediaDevices.
3. If device is the system default microphone, prepend deviceInfo to microphoneList. Otherwise, append deviceInfo to microphoneList.
Run the following sub steps for each discovered device in deviceList, device:
1. If device is not a camera, or document is not [=allowed to use=] the feature identified by "camera", abort these sub steps and continue with the next device (if any).
2. Let deviceInfo be the result of [=creating a device info object=] to represent device, with mediaDevices.
3. If device is the system default camera, prepend deviceInfo to cameraList. Otherwise, append deviceInfo to cameraList.
If [=microphone information can be exposed=] on mediaDevices is false, truncate microphoneList to its first item.
If [=camera information can be exposed=] on mediaDevices is false, truncate cameraList to its first item.
Run the following sub steps for each discovered device in deviceList, device:
1. If device is a microphone or device is a camera, abort these sub steps and continue with the next device (if any).
2. Run the [=exposure decision algorithm for devices other than camera and microphone=], with device, microphoneList, cameraList and mediaDevices as input. If the result of this algorithm is false, abort these sub steps and continue with the next device (if any).
3. Let deviceInfo be the result of [=creating a device info object=] to represent device, with mediaDevices.
4. If device is the system default audio output, prepend deviceInfo to otherDeviceList. Otherwise, append deviceInfo to otherDeviceList.
Append to resultList all devices of microphoneList in order.
Append to resultList all devices of cameraList in order.
Append to resultList all devices of otherDeviceList in order.
Return resultList.
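
The ordering and truncation rules above can be sketched as a pure function. This is a minimal, non-normative illustration: the `type`, `label` and `isDefault` fields, the `policy` object and the function name are hypothetical stand-ins for [=User Agent=] internals, and every device that is neither a microphone nor a camera is assumed to be an audio output.

```javascript
// Sketch only: `device.type`, `device.isDefault` and `policy` are hypothetical
// stand-ins for User Agent internals; they are not part of the API.
function createDeviceInfoList(deviceList, policy) {
  const microphoneList = [];
  const cameraList = [];
  const otherDeviceList = [];

  // Labels are only filled in when information for that kind can be exposed.
  const toInfo = (device, kind, exposed) => ({
    kind,
    label: exposed ? device.label : "",
  });
  // System default devices go first; everything else keeps discovery order.
  const place = (list, device, info) =>
    device.isDefault ? list.unshift(info) : list.push(info);

  for (const device of deviceList) {
    if (device.type !== "microphone" || !policy.microphoneAllowed) continue;
    place(microphoneList, device,
      toInfo(device, "audioinput", policy.canExposeMicrophoneInfo));
  }
  for (const device of deviceList) {
    if (device.type !== "camera" || !policy.cameraAllowed) continue;
    place(cameraList, device,
      toInfo(device, "videoinput", policy.canExposeCameraInfo));
  }

  // Without exposure, at most one entry per kind survives.
  if (!policy.canExposeMicrophoneInfo) microphoneList.splice(1);
  if (!policy.canExposeCameraInfo) cameraList.splice(1);

  for (const device of deviceList) {
    if (device.type === "microphone" || device.type === "camera") continue;
    // The exposure decision for other devices returns false by default.
    if (!policy.exposeOther(device, microphoneList, cameraList)) continue;
    // Assumption: every other device is an audio output.
    place(otherDeviceList, device, toInfo(device, "audiooutput", true));
  }

  return [...microphoneList, ...cameraList, ...otherDeviceList];
}
```

With the default exposure decision (`exposeOther` returning false), the result lists microphones first and cameras second, each led by the system default device.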

Since this method returns persistent information across browsing sessions and origins via the availability of media capture devices, it adds to the fingerprinting surface exposed by the [=User Agent=].

As long as the [=relevant global object=]'s [=associated `Document`=] did not capture, this method will limit exposure to two bits of information: whether there is a camera and whether there is a microphone. A [=User Agent=] may mitigate this by pretending the system has a camera and a microphone, for instance until the [=relevant global object=]'s [=associated `Document`=] calls {{MediaDevices/getUserMedia()}} with constraints deemed reasonable.

After the [=relevant global object=]'s [=associated `Document`=] started capture, it provides additional persistent cross-origin information via the list of all media capture devices, including their grouping and human readable labels associated with the capture devices, which further adds to the fingerprinting surface.

A [=User Agent=] may limit exposure by sanitizing device labels. This could for instance mean removing user names found in labels, but keeping device manufacturer or model information. It is important that the sanitized labels allow users to identify the corresponding devices.

Access control model

The algorithm described above means that the access to media device information depends on whether or not the [=relevant global object=]'s [=associated `Document`=] did capture.

For camera and microphone devices, if the [=relevant global object=]'s [=associated `Document`=] did not capture (i.e. {{MediaDevices/getUserMedia()}} was not called or never resolved successfully), the {{MediaDeviceInfo}} object will contain a valid value for {{MediaDeviceInfo/kind}} but empty strings for {{MediaDeviceInfo/deviceId}}, {{MediaDeviceInfo/label}}, and {{MediaDeviceInfo/groupId}}. Additionally, at most one device of each {{MediaDeviceInfo/kind}} will be listed in the {{MediaDevices/enumerateDevices()}} result.

Otherwise, the {{MediaDeviceInfo}} object will contain meaningful values for {{MediaDeviceInfo/deviceId}}, {{MediaDeviceInfo/kind}}, {{MediaDeviceInfo/label}}, and {{MediaDeviceInfo/groupId}}. All available devices are listed in the {{MediaDevices/enumerateDevices()}} result.

To perform creating a device info object to represent a discovered device, device, given mediaDevices, run the following steps:

Let deviceInfo be a new {{MediaDeviceInfo}} object to represent device.
Initialize deviceInfo.{{MediaDeviceInfo/kind}} for device.
If deviceInfo.{{MediaDeviceInfo/kind}} is equal to "videoinput" and [=camera information can be exposed=] on mediaDevices is false, return deviceInfo.
If deviceInfo.{{MediaDeviceInfo/kind}} is equal to "audioinput" and [=microphone information can be exposed=] on mediaDevices is false, return deviceInfo.
Initialize deviceInfo.{{MediaDeviceInfo/label}} for device.
If a stored {{MediaDeviceInfo/deviceId}} exists for device, initialize deviceInfo.{{MediaDeviceInfo/deviceId}} to that value. Otherwise, let deviceInfo.{{MediaDeviceInfo/deviceId}} be a newly generated unique identifier as described under {{MediaDeviceInfo/deviceId}}.
If device belongs to the same physical device as a device already represented for mediaDevices's [=relevant global object=]'s [=associated `Document`=], initialize deviceInfo.{{MediaDeviceInfo/groupId}} to the {{MediaDeviceInfo/groupId}} value of the existing {{MediaDeviceInfo}} object. Otherwise, let deviceInfo.{{MediaDeviceInfo/groupId}} be a newly generated unique identifier as described under {{MediaDeviceInfo/groupId}}.
Return deviceInfo.
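
As a non-normative sketch, the steps for creating a device info object might look like the following. The `state` object, the `hardwareKey` and `physicalUnit` fields and the counter-based `newIdentifier` helper are all hypothetical; a real [=User Agent=] generates unguessable identifiers rather than sequential ones.

```javascript
// Sketch only. `state`, `device.hardwareKey` and `device.physicalUnit` are
// hypothetical; a real User Agent generates unguessable identifiers.
let idCounter = 0;
const newIdentifier = () => `generated-${++idCounter}`;

function createDeviceInfoObject(device, state) {
  const info = { kind: device.kind, label: "", deviceId: "", groupId: "" };

  // Before exposure is allowed, only `kind` carries information.
  if (info.kind === "videoinput" && !state.canExposeCameraInfo) return info;
  if (info.kind === "audioinput" && !state.canExposeMicrophoneInfo) return info;

  info.label = device.label;

  // Reuse a stored deviceId when one exists, otherwise generate and store one.
  if (!state.storedDeviceIds.has(device.hardwareKey)) {
    state.storedDeviceIds.set(device.hardwareKey, newIdentifier());
  }
  info.deviceId = state.storedDeviceIds.get(device.hardwareKey);

  // Devices that belong to the same physical unit share one groupId.
  if (!state.groupIds.has(device.physicalUnit)) {
    state.groupIds.set(device.physicalUnit, newIdentifier());
  }
  info.groupId = state.groupIds.get(device.physicalUnit);

  return info;
}
```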

Device information exposure

To perform a device enumeration can proceed check, given mediaDevices, run the following steps:

The [=User Agent=] MAY return true if [=device information can be exposed=] on mediaDevices.
Return the result of [=Document/is in view=] with mediaDevices.

To perform a device information can be exposed check, given mediaDevices, run the following steps:

If [=camera information can be exposed=] on mediaDevices, return true.
If [=microphone information can be exposed=] on mediaDevices, return true.
Return false.

To perform a camera information can be exposed check, given mediaDevices, run the following steps:

If any of the local devices of kind "videoinput" are attached to a live {{MediaStreamTrack}} in mediaDevices's [=relevant global object=]'s [=associated `Document`=], return true.
Return mediaDevices.{{MediaDevices/[[canExposeCameraInfo]]}}.

To perform a microphone information can be exposed check, given mediaDevices, run the following steps:

If any of the local devices of kind "audioinput" are attached to a live {{MediaStreamTrack}} in mediaDevices's [=relevant global object=]'s [=associated `Document`=], return true.
Return mediaDevices.{{MediaDevices/[[canExposeMicrophoneInfo]]}}.

To perform an is in view check, given mediaDevices, run the following steps:

If mediaDevices's [=relevant global object=]'s [=associated `Document`=] is [=Document/fully active=] and its [=Document/visibility state=] is `"visible"`, return `true`. Otherwise, return `false`.

To perform a has system focus check, given mediaDevices, run the following steps:

If mediaDevices's [=relevant global object=]'s [=navigable=]'s [=top-level traversable=] has system focus, return `true`. Otherwise, return `false`.

Set device information exposure

To set the device information exposure on mediaDevices, given a requestedTypes [=set=], and a boolean value, run the following steps:

If "video" is in requestedTypes, then set mediaDevices.{{MediaDevices/[[canExposeCameraInfo]]}} to value.
If "audio" is in requestedTypes, then set mediaDevices.{{MediaDevices/[[canExposeMicrophoneInfo]]}} to value.

A [=User Agent=] MAY at any point set the device information exposure back to false, for instance if the [=User Agent=] decides to revoke device access on a given {{Document}}.

Exposure decision algorithm for devices other than camera and microphone

The exposure decision algorithm for devices other than camera and microphone takes a device, microphoneList, cameraList and mediaDevices as input and returns a boolean to decide whether to expose information about device to the web page or not.

By default, it returns false.

Other specifications can define the algorithm for specific device types.

Context capturing state

To perform a context is capturing check for globalObject, run the following steps:

If globalObject is not a {{Window}}, then return false.
Let mediaDevices be globalObject's [=associated `MediaDevices`=].
For each source in mediaDevices.{{MediaDevices/[[mediaStreamTrackSources]]}}, run the following sub steps:
1. If source is [=source/stopped=] or [=source/muted=], abort these sub steps and continue with the next source (if any).
2. Let deviceId be source's device's deviceId.
3. If mediaDevices.{{MediaDevices/[[devicesLiveMap]]}}[deviceId] is true, return true.
Return false.

This algorithm covers all capture tracks, including microphone, camera and display.

Device Info

[Exposed=Window, SecureContext]
interface MediaDeviceInfo {
  readonly attribute DOMString deviceId;
  readonly attribute MediaDeviceKind kind;
  readonly attribute DOMString label;
  readonly attribute DOMString groupId;
  [Default] object toJSON();
};

Attributes

deviceId of type {{DOMString}}, readonly

The identifier of the represented device. The device MUST be uniquely identified by its identifier and its {{MediaDeviceInfo/kind}}.

To ensure stored identifiers are recognized, the identifier MUST be the same in {{Document}}s of the [=same origin=] in [=top-level traversables=]. In [=child navigables=], the decision of whether or not the identifier is the same across documents, MUST follow the [=User Agent=]'s partitioning rules for storage (such as {{WindowLocalStorage/localStorage}}), if any, to not interfere with mitigations for cross-site correlation. If the identifier can uniquely identify the user, then it MUST be un-guessable in documents from other origins to prevent the identifier from being used to correlate the same user across different origins. An identifier can be reused across origins as long as it is not tied to the user and can be guessed by other means, like the User-Agent string.

If any local devices have been attached to a live {{MediaStreamTrack}} in a page from this origin, or [=stored permission=] to access local devices has been granted to this origin, then this identifier MUST be persisted, except as detailed below. Unique and stable identifiers let the application save, identify the availability of, and directly request specific sources, across multiple visits.

However, as long as no local device has been attached to a live MediaStreamTrack in a page from this origin, and no [=stored permission=] to access local devices has been granted to this origin, then the [=User Agent=] MAY clear this identifier once the last browsing session from this origin has been closed. If the [=User Agent=] chooses not to clear the identifier in this condition, then it MUST provide for the user to visibly inspect and delete the identifier, like a cookie.

Because {{deviceId}} may persist across browsing sessions, and to reduce its potential as a fingerprinting mechanism, {{deviceId}} is to be treated like other persistent storage mechanisms such as cookies [[COOKIES]]: [=User Agents=] MUST NOT persist device identifiers for sites that are blocked from using cookies, and [=User Agents=] MUST rotate per-origin device identifiers when other persistent storage is cleared.
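
Because identifiers can be stable across visits, an application can store one and later ask for that specific source. The following is a minimal sketch of that pattern; the function name and the fallback to `{ video: true }` are illustrative choices, and `devices` is assumed to be the array obtained from {{MediaDevices/enumerateDevices()}}.

```javascript
// Sketch only: `savedDeviceId` is whatever the application stored earlier,
// and `devices` is the array obtained from enumerateDevices().
function videoConstraintsForSavedCamera(savedDeviceId, devices) {
  const stillAvailable = devices.some(
    (d) =>
      d.kind === "videoinput" &&
      d.deviceId !== "" && // empty before the document has captured
      d.deviceId === savedDeviceId
  );
  // Fall back to any camera when the saved one is gone or was rotated.
  return stillAvailable
    ? { video: { deviceId: { exact: savedDeviceId } } }
    : { video: true };
}
```

The returned dictionary can be passed to {{MediaDevices/getUserMedia()}}. Since identifiers are rotated when other persistent storage is cleared, the fallback branch is expected to run from time to time.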

kind of type {{MediaDeviceKind}}, readonly

The kind of the represented device.

label of type {{DOMString}}, readonly

A label describing this device (for example "External USB Webcam"). This label is intended to allow the end user to tell the difference between devices. Applications can’t assume that the label contains any specific information, such as the device type or model. If the device has no associated label, then this attribute MUST return the empty string.

groupId of type {{DOMString}}, readonly

The group identifier of the represented device. Two devices have the same group identifier if they belong to the same physical device. For example, the audio input and output devices representing the speaker and microphone of the same headset have the same groupId.

The group identifier MUST be uniquely generated for each document.
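
As an illustration of how an application might use group identifiers, the sketch below buckets device info entries by {{MediaDeviceInfo/groupId}}. The function name is hypothetical, and entries with an empty groupId (as returned before capture) are skipped because they carry no grouping information.

```javascript
// Sketch only: groups MediaDeviceInfo-like objects by their groupId.
function groupByPhysicalDevice(devices) {
  const groups = new Map();
  for (const device of devices) {
    if (device.groupId === "") continue; // no grouping information exposed
    if (!groups.has(device.groupId)) groups.set(device.groupId, []);
    groups.get(device.groupId).push(device);
  }
  return groups;
}
```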

Methods

toJSON: When called, run [[WEBIDL]]'s default toJSON steps.

enum MediaDeviceKind {
  "audioinput",
  "audiooutput",
  "videoinput"
};

MediaDeviceKind Enumeration description
audioinput	Represents an audio input device; for example a microphone.
audiooutput	Represents an audio output device; for example a pair of headphones.
videoinput	Represents a video input device; for example a webcam.

Input-specific Device Info

The InputDeviceInfo interface gives access to the capabilities of the input device it represents.

[Exposed=Window, SecureContext]
interface InputDeviceInfo : MediaDeviceInfo {
  MediaTrackCapabilities getCapabilities();
};

Methods

getCapabilities

Returns a {{MediaTrackCapabilities}} object describing the primary audio or video track of a device's {{MediaStream}} (according to its {{MediaStreamTrack/kind}} value), in the absence of any user-supplied constraints. These capabilities MUST be identical to those that would have been obtained by calling {{MediaStreamTrack/getCapabilities()}} on the first {{MediaStreamTrack}} of this type in a {{MediaStream}} returned by getUserMedia({deviceId: id}) where id is the value of the {{MediaDeviceInfo/deviceId}} attribute of this {{MediaDeviceInfo}}.

If no access has been granted to any local devices and this {{InputDeviceInfo}} has been filtered with respect to unique identifying information (see above description of {{MediaDevices/enumerateDevices()}} result), then this method returns an empty dictionary.

The {{devicechange}} event uses the {{DeviceChangeEvent}} interface.

[Exposed=Window]
interface DeviceChangeEvent : Event {
  constructor(DOMString type, optional DeviceChangeEventInit eventInitDict = {});
  [SameObject] readonly attribute FrozenArray<MediaDeviceInfo> devices;
};

Constructors

constructor(): Initialize [=this=].{{DeviceChangeEvent/devices}} to the result of [=creating a frozen array=] from eventInitDict.{{DeviceChangeEventInit/devices}}.

Attributes

devices of type FrozenArray<{{MediaDeviceInfo}}>, readonly: The {{devices}} attribute returns an array of {{MediaDeviceInfo}} objects representing the list of available devices at this time.

dictionary DeviceChangeEventInit : EventInit {
  sequence<MediaDeviceInfo> devices = [];
};

Dictionary DeviceChangeEventInit Members

devices of type sequence<{{MediaDeviceInfo}}>, defaulting to []: The {{devices}} member is an array of {{MediaDeviceInfo}} objects representing the available devices.
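
An application handling devicechange often wants to know which devices appeared or disappeared. The sketch below compares two device lists; it keys each entry on {{MediaDeviceInfo/kind}} plus {{MediaDeviceInfo/deviceId}}, since a device is uniquely identified by that pair. The function name and return shape are illustrative.

```javascript
// Sketch only: compares two arrays of MediaDeviceInfo-like objects.
function diffDeviceLists(previous, current) {
  // A device is uniquely identified by its identifier together with its kind.
  const key = (d) => `${d.kind}:${d.deviceId}`;
  const before = new Set(previous.map(key));
  const after = new Set(current.map(key));
  return {
    added: current.filter((d) => !before.has(key(d))),
    removed: previous.filter((d) => !after.has(key(d))),
  };
}
```

In a page, `current` could be taken from the event's devices attribute or from a fresh call to {{MediaDevices/enumerateDevices()}}, with `previous` being the list the application saw last.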

Obtaining local multimedia content

This section extends {{Navigator}} and {{MediaDevices}} with APIs to request permission to access media input devices available to the [=User Agent=].

Alternatively, a local {{MediaStream}} can be captured from certain types of DOM elements, such as the video element [[?mediacapture-fromelement]]. This can be useful for automated testing.

{{MediaDevices}} Interface Extensions

The definition of {{Navigator/getUserMedia()}} in this section reflects two major changes from the method definition that has existed under {{Navigator}} for many months.

First, the official definition for the {{MediaDevices/getUserMedia()}} method, and the one which developers are encouraged to use, is now the one defined here under {{MediaDevices}}. This decision reflected consensus on the condition that the original API remain available at Navigator.getUserMedia under the {{Navigator}} object for backwards compatibility. The working group acknowledges that early users of these APIs have been encouraged to define getUserMedia as "var getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia;" so that their code works both before and after official implementations of getUserMedia() in popular [=User Agents=]. To ensure functional equivalence, the getUserMedia() method under {{Navigator}} is defined in terms of the method here.

Second, the method defined here is Promises-based, while the one defined under {{Navigator}} is currently still callback-based. Developers expecting to find getUserMedia() defined under Navigator are strongly encouraged to read the detailed Note given there.

The {{MediaDevices/getSupportedConstraints}} method is provided to allow the application to determine which constraints the [=User Agent=] recognizes. Applications may need this information to use required constraints reliably or get predictable results from combinatory logic in advanced constraints.

partial interface MediaDevices {
  MediaTrackSupportedConstraints getSupportedConstraints();
  Promise<MediaStream> getUserMedia(optional MediaStreamConstraints constraints = {});
};

Methods

getSupportedConstraints

Returns a dictionary whose members are the constrainable properties known to the [=User Agent=]. A supported constrainable property MUST be represented and any constrainable properties not supported by the [=User Agent=] MUST NOT be present in the returned dictionary. The values returned represent what the [=User Agent=] implements and will not change during a browsing session.
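
One way an application might use this information is to drop constraints the [=User Agent=] does not recognize before making a request. The sketch below is illustrative: the function name is hypothetical, `supported` is assumed to be the dictionary returned by {{MediaDevices/getSupportedConstraints}}, and nested advanced constraint sets are not handled.

```javascript
// Sketch only: `supported` is assumed to be the dictionary returned by
// getSupportedConstraints(); `advanced` sets are not handled here.
function keepSupportedConstraints(constraints, supported) {
  const kept = {};
  const dropped = [];
  for (const [name, value] of Object.entries(constraints)) {
    // Unsupported properties are simply absent from `supported`.
    if (supported[name] === true) kept[name] = value;
    else dropped.push(name);
  }
  return { kept, dropped };
}
```

Reporting the dropped names lets the caller decide whether a missing property matters, which is most relevant for required constraints.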

getUserMedia

Prompts the user for permission to use their webcam or other video or audio input.

The constraints argument is a dictionary of type {{MediaStreamConstraints}}.

This method returns a promise. The promise will be [=upon fulfillment|fulfilled=] with a suitable {{MediaStream}} object if the user accepts valid tracks as described below.

The promise will be rejected if there is a failure in finding valid tracks or if the user denies permission, as described below.

When the getUserMedia() method is called, the [=User Agent=] MUST run the following steps:

Let constraints be the method's first argument.
Let requestedMediaTypes be the set of media types in constraints with either a dictionary value or a value of true.
If requestedMediaTypes is the empty set, return a promise rejected with a {{TypeError}}. The word "optional" occurs in the WebIDL due to WebIDL rules, but the argument MUST be supplied in order for the call to succeed.
Let document be the [=relevant global object=]'s [=associated `Document`=].
If document is NOT [=Document/fully active=], return a promise rejected with a {{DOMException}} object whose {{DOMException/name}} attribute has the value {{"InvalidStateError"}}.
If requestedMediaTypes contains "audio" and document is not [=allowed to use=] the feature identified by the "microphone" permission name, jump to the step labeled Permission Failure below.
If requestedMediaTypes contains "video" and document is not [=allowed to use=] the feature identified by the "camera" permission name, jump to the step labeled Permission Failure below.
Let mediaDevices be [=this=].
Let isInView be the result of the [= Document/is in view =] algorithm.
Let p be a new promise.
Run the following steps in parallel:
1. While isInView is `false`, the [=User Agent=] MUST wait to proceed to the next step until a task queued to set isInView to the result of the [=Document/is in view=] algorithm, would set isInView to `true`.
2. Let finalSet be an (initially) empty set.
3. For each media type kind in requestedMediaTypes, run the following steps:
  1. For each possible configuration of each possible source device of media of type kind, conceive a candidate as a placeholder for an eventual {{MediaStreamTrack}} holding a source device and configured with a settings dictionary comprised of its specific settings.
    
    Call this set of candidates the candidateSet.
    
    If candidateSet is the empty set, jump to the step labeled NotFound Failure below.
  2. If the value of the kind entry of constraints is true, set CS to the empty constraint set (no constraint). Otherwise, continue with CS set to the value of the kind entry of constraints.
  3. Remove any constrainable properties inside of CS that are not defined for {{MediaStreamTrack}} objects of type kind. This means that audio-only constraints inside of "video" and video-only constraints inside of "audio" are simply ignored rather than causing OverconstrainedError.
  4. If CS contains a member that is a required constraint and whose name is not in the list of allowed required constraints for device selection, then [= reject =] p with a {{TypeError}}, and abort these steps.
  5. Run the SelectSettings algorithm on each candidate in candidateSet with CS as the constraint set. If the algorithm returns undefined, remove the candidate from candidateSet. This eliminates devices unable to satisfy the constraints, by verifying that at least one settings dictionary exists that satisfies the constraints.
    
    If candidateSet is the empty set, let failedConstraint be any required constraint whose fitness distance was infinity for all settings dictionaries examined while executing the SelectSettings algorithm, or "" if there isn't one, and jump to the step labeled Constraint Failure below.
    
    This error gives information about what the underlying device is not capable of producing, before the user has given any authorization to any device, and can thus be used as a fingerprinting surface.
  6. Read the current [=permission state=] for all candidate devices in candidateSet that are not attached to a live {{MediaStreamTrack}} in the current {{Document}}. Remove from candidateSet any candidate whose device's permission state is {{PermissionState/"denied"}}.
    
    If candidateSet is now empty, indicating that all devices of this type are in state {{PermissionState/"denied"}}, jump to the step labeled Permission Failure below.
  7. Optionally, e.g., based on a previously-established user preference, for security reasons, or due to platform limitations, jump to the step labeled Permission Failure below.
  8. Add all candidates from candidateSet to finalSet.
4. Let stream be a new and empty {{MediaStream}} object.
5. For each media type kind in requestedMediaTypes, run the following sub steps, preferably at the same time:
  
  [=User Agents=] are encouraged to bundle concurrent requests for different kinds of media into a single user-facing permission prompt.
  1. [=Request permission to use=] a {{PermissionDescriptor}} with its {{PermissionDescriptor/name}} member set to the permission name associated with kind (e.g. "camera" for "video", "microphone" for "audio"), while considering all devices attached to a live and same-permission {{MediaStreamTrack}} in the current {{Document}} to have permission status {{PermissionState/"granted"}}, resulting in a set of provided media. Same-permission in this context means a {{MediaStreamTrack}} that required the same level of permission to obtain as what is being requested (e.g. not isolated).
    
    When asking the user’s permission, the [=User Agent=] MUST disclose whether permission will be granted only to the device chosen, or to all devices of that kind.
    
    If the user never responds, this algorithm stalls on this step.
  2. If the result of the request is {{PermissionState/"denied"}}, jump to the step labeled Permission Failure below.
  3. Let hasSystemFocus be `false`.
  4. While hasSystemFocus is `false`, the [=User Agent=] MUST wait to proceed to the next step until a task queued to set hasSystemFocus to the result of the [=has system focus=] algorithm, would set hasSystemFocus to `true`.
  5. Let finalCandidate be the provided media, which MUST be precisely one candidate of type kind from finalSet. The decision of which candidate to choose from the finalSet is completely up to the [=User Agent=] and may be determined by asking the user.
    
    The [=User Agent=] SHOULD use the value of the computed fitness distance from the SelectSettings algorithm as an input to the selection algorithm. However, it MAY also use other internally-available information about the devices, such as user preference.
    
    [=User Agents=] are encouraged to default to using the user's primary or system default device for kind (when possible). [=User Agents=] MAY allow users to use any media source, including pre-recorded media files.
  6. The result of the request is {{PermissionState/"granted"}}. If a hardware error such as an OS/program/webpage lock prevents access, remove the corresponding candidate from finalSet. If finalSet has no candidates of type kind, [= reject =] p with a new {{DOMException}} object whose {{DOMException/name}} attribute has the value {{"NotReadableError"}} and abort these steps. Otherwise, restart these sub steps with the updated finalSet.
    
    If device access fails for any reason other than those listed above, remove the corresponding candidate from finalSet. If finalSet has no candidates of type kind, [= reject =] p with a new {{DOMException}} object whose {{DOMException/name}} attribute has the value {{"AbortError"}} and abort these steps. Otherwise, restart these sub steps with the updated finalSet.
  7. Let grantedDevice be finalCandidate's source device.
  8. Using grantedDevice's deviceId, deviceId, set mediaDevices.{{MediaDevices/[[devicesLiveMap]]}}[deviceId] to true, if it isn’t already true, and set mediaDevices.{{MediaDevices/[[devicesAccessibleMap]]}}[deviceId] to true, if it isn’t already true.
  9. Let track be the result of [=create a MediaStreamTrack|creating a MediaStreamTrack=] with grantedDevice and mediaDevices. The source of the {{MediaStreamTrack}} MUST NOT change.
  10. Add track to stream's track set.
6. Run the ApplyConstraints algorithm on all tracks in stream with the appropriate constraints. If any of them returns something other than undefined, let failedConstraint be that result and jump to the step labeled Constraint Failure below.
7. For each track in stream, [=tie track source to `MediaDevices`=] with track.{{MediaStreamTrack/[[Source]]}} and mediaDevices.
8. [=Set the device information exposure=] on mediaDevices with requestedMediaTypes and true.
9. [= Resolve =] p with stream and abort these steps.
10. NotFound Failure:
  1. If [=getUserMedia specific failure is allowed=] given requestedMediaTypes returns false, jump to the step labeled Permission Failure below.
  2. [=Reject=] p with a new {{DOMException}} object whose {{DOMException/name}} attribute has the value {{"NotFoundError"}}.
11. Constraint Failure:
  1. If [=getUserMedia specific failure is allowed=] given requestedMediaTypes returns false, jump to the step labeled Permission Failure below.
  2. Let message be either undefined or an informative human-readable message, let constraint be failedConstraint if [=device information can be exposed=] is true, or "" otherwise.
  3. [=Reject=] p with a new OverconstrainedError created by calling OverconstrainedError(constraint, message).
12. Permission Failure: [= Reject =] p with a new {{DOMException}} object whose {{DOMException/name}} attribute has the value {{"NotAllowedError"}}.
Return p.
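
From the application's side, the failure branches of the algorithm above surface as differently named rejection reasons. The sketch below maps those names to short hints; the function name and the hint strings are illustrative only, and the `constraint` property is read only for OverconstrainedError.

```javascript
// Sketch only: maps rejection names used by the algorithm above to
// application-level hints. The hint strings are illustrative.
function explainGetUserMediaFailure(error) {
  switch (error.name) {
    case "TypeError":
      return "Request at least one of audio or video, and check required constraints.";
    case "InvalidStateError":
      return "The document is not fully active.";
    case "NotFoundError":
      return "No device of the requested kind was found.";
    case "OverconstrainedError":
      // `constraint` may be "" when device information cannot be exposed.
      return error.constraint
        ? "No device satisfies the constraint: " + error.constraint
        : "No device satisfies the constraints.";
    case "NotAllowedError":
      return "Permission was denied by the user, policy or platform.";
    case "NotReadableError":
      return "The device could not be accessed, for example because it is locked.";
    case "AbortError":
      return "Device access failed for another reason.";
    default:
      return "Unexpected failure: " + error.name;
  }
}
```

In a page this would typically be called from the rejection handler of the promise returned by getUserMedia().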

To check whether getUserMedia specific failure is allowed, given requestedMediaTypes, run the following steps:

If requestedMediaTypes contains "audio", read the [=permission state=] for the descriptor whose name is "microphone". If the result of the request is {{PermissionState/"denied"}}, return false.
If requestedMediaTypes contains "video", read the [=permission state=] for the descriptor whose name is "camera". If the result of the request is {{PermissionState/"denied"}}, return false.
Return true.
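
The check above can be sketched as follows. It is a non-normative illustration: `permissionStates` is a hypothetical plain object holding the current state for "microphone" and "camera", and `failureNameFor` is a hypothetical helper showing how the NotFound and Constraint failure branches fall back to NotAllowedError.

```javascript
// Sketch only: `permissionStates` is a hypothetical object such as
// { microphone: "prompt", camera: "denied" }.
function specificFailureAllowed(requestedMediaTypes, permissionStates) {
  if (requestedMediaTypes.has("audio") && permissionStates.microphone === "denied") {
    return false;
  }
  if (requestedMediaTypes.has("video") && permissionStates.camera === "denied") {
    return false;
  }
  return true;
}

// A specific error name is only reported when the check passes; otherwise
// the failure is reported as a permission failure.
function failureNameFor(specificName, requestedMediaTypes, permissionStates) {
  return specificFailureAllowed(requestedMediaTypes, permissionStates)
    ? specificName
    : "NotAllowedError";
}
```

The effect is that a document whose permission is already denied cannot learn whether a device is missing or overconstrained.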

In the algorithm above, constraints are checked twice - once at device selection, and once after access approval. Time may have passed between those checks, so it is conceivable that the selected device is no longer suitable. In this case, a NotReadableError will result.

The allowed required constraints for device selection contains the following constraint names: width, height, aspectRatio, frameRate, facingMode, resizeMode, sampleRate, sampleSize, echoCancellation, autoGainControl, noiseSuppression, latency, channelCount, deviceId, groupId.
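
A sketch of the corresponding check follows. It assumes that a member counts as a required constraint when it is a dictionary carrying `exact`, `min` or `max`; that reading, the function name and the `hypothetical...` property names in the usage note are assumptions made for illustration.

```javascript
// Names taken from the list of allowed required constraints above.
const ALLOWED_REQUIRED = new Set([
  "width", "height", "aspectRatio", "frameRate", "facingMode", "resizeMode",
  "sampleRate", "sampleSize", "echoCancellation", "autoGainControl",
  "noiseSuppression", "latency", "channelCount", "deviceId", "groupId",
]);

// Assumption: a member is "required" when it is a dictionary carrying
// `exact`, `min` or `max`; bare values and `ideal` are treated as optional.
function disallowedRequiredConstraints(cs) {
  const isRequired = (v) =>
    v !== null &&
    typeof v === "object" &&
    !Array.isArray(v) &&
    ("exact" in v || "min" in v || "max" in v);
  return Object.keys(cs).filter(
    (name) =>
      name !== "advanced" && isRequired(cs[name]) && !ALLOWED_REQUIRED.has(name)
  );
}
```

For a constraint set such as `{ width: { min: 640 }, hypotheticalZoom: { exact: 2 } }` the function returns `["hypotheticalZoom"]`, which corresponds to the case where getUserMedia() rejects with a TypeError.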

The MediaStreamConstraints dictionary is used to instruct the [=User Agent=] what sort of {{MediaStreamTrack}}s to include in the MediaStream returned by {{MediaDevices/getUserMedia()}}.

dictionary MediaStreamConstraints {
  (boolean or MediaTrackConstraints) video = false;
  (boolean or MediaTrackConstraints) audio = false;
};

Dictionary MediaStreamConstraints Members

video of type ({{boolean}} or {{MediaTrackConstraints}}), defaulting to false: If true, it requests that the returned MediaStream contain a video track. If a Constraints structure is provided, it further specifies the nature and settings of the video Track. If false, the {{MediaStream}} MUST NOT contain a video Track.
audio of type ({{boolean}} or {{MediaTrackConstraints}}), defaulting to false: If true, it requests that the returned MediaStream contain an audio track. If a Constraints structure is provided, it further specifies the nature and settings of the audio Track. If false, the MediaStream MUST NOT contain an audio Track.
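
To illustrate how these members feed the getUserMedia() algorithm, the sketch below computes the set of requested media types: a kind is requested when its member is true or a constraints dictionary. The function name is illustrative. An empty result corresponds to the case in which getUserMedia() returns a promise rejected with a {{TypeError}}.

```javascript
// Sketch only: derives the requested media types from a
// MediaStreamConstraints-like object.
function requestedMediaTypes(constraints = {}) {
  const requested = new Set();
  for (const kind of ["audio", "video"]) {
    const value = constraints[kind];
    // Requested when the member is true or a constraints dictionary.
    if (value === true || (value !== null && typeof value === "object")) {
      requested.add(kind);
    }
  }
  return requested;
}
```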

Legacy GetUserMedia interface

The definition of getUserMedia() in this section reflects the call format that was originally proposed; it is only documented here for browsers that wish to retain backwards compatibility. It differs from the recommended interface in two important ways.

First, the official definition for the getUserMedia() method, and the one which developers are encouraged to use, is now at {{MediaDevices}}. This decision reflected consensus on the condition that the original API remain available here under the Navigator object for backwards compatibility. The working group acknowledges that early users of these APIs have been encouraged to define getUserMedia as "var getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia;" so that their code works both before and after official implementations of getUserMedia() in popular browsers. To ensure functional equivalence, the getUserMedia() method here is defined in terms of the method under MediaDevices.

Second, the decision to change all other callback-based methods in the specification to be based on Promises instead required that the navigator.getUserMedia() definition reflect this in its use of navigator.mediaDevices.getUserMedia(). Because navigator.getUserMedia() is now the only callback-based method remaining in the specification, there is ongoing discussion as to a) whether it still belongs in the specification, and b) if it does, whether its syntax should remain callback-based or change in some way to use Promises. Input on these questions is encouraged, particularly from developers actively using today's implementations of this functionality.

Note that the other methods that changed from a callback-based syntax to a Promises-based syntax were not considered to have been implemented widely enough in any form to have to consider legacy usage.

Implementations do not need to implement this interface in order to be considered compliant.

Interface definition

partial interface Navigator {
  [SecureContext] undefined getUserMedia(MediaStreamConstraints constraints,
                                    NavigatorUserMediaSuccessCallback successCallback,
                                    NavigatorUserMediaErrorCallback errorCallback);
};

Methods

getUserMedia

Prompts the user for permission to use their webcam or other video or audio input.

The constraints argument is a dictionary of type {{MediaStreamConstraints}}.

The successCallback will be invoked with a suitable {{MediaStream}} object as its argument if the user accepts valid tracks as described in {{MediaDevices/getUserMedia()}} on {{MediaDevices}}.

The errorCallback will be invoked if there is a failure in finding valid tracks or if the user denies permission, as described in {{MediaDevices/getUserMedia()}} on {{MediaDevices}}.

When the {{getUserMedia()}} method is called, the User Agent MUST run the following steps:

Let constraints be the method's first argument.
Let successCallback be the callback indicated by the method's second argument.
Let errorCallback be the callback indicated by the method's third argument.
Run the steps specified by the getUserMedia() algorithm with constraints as the argument, and let p be the resulting promise.
[=Upon fulfillment=] of p with value stream, run the following step:
1. Invoke successCallback with stream as the argument.
[=Upon rejection=] of p with reason r, run the following step:
1. Invoke errorCallback with r as the argument.
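
The steps above amount to a thin adapter over the promise-based method. The following is a minimal sketch of that idea, not a normative implementation. The function name legacyGetUserMedia is hypothetical, and the mediaDevices object is passed in explicitly (in a page it would be navigator.mediaDevices) so that the sketch does not depend on a browser.

```javascript
// Hypothetical adapter: callback-style wrapper over a promise-based getUserMedia().
// `mediaDevices` is any object with a promise-returning getUserMedia() method.
function legacyGetUserMedia(mediaDevices, constraints, successCallback, errorCallback) {
  const p = mediaDevices.getUserMedia(constraints);
  // Upon fulfillment invoke successCallback; upon rejection invoke errorCallback.
  // The two-argument form of then() means an exception thrown inside
  // successCallback is not routed to errorCallback.
  p.then(
    (stream) => { successCallback(stream); },
    (reason) => { errorCallback(reason); }
  );
  // The callback-based method itself returns undefined.
}
```

New code is better served by calling the promise-based method directly; the adapter only illustrates why the two definitions are functionally equivalent.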

NavigatorUserMediaSuccessCallback

callback NavigatorUserMediaSuccessCallback = undefined (MediaStream stream);

Callback NavigatorUserMediaSuccessCallback Parameters

stream of type {{MediaStream}}: {{MediaStream}} object representing the stream to which the user granted permission as described in the {{Navigator.getUserMedia}} algorithm.

NavigatorUserMediaErrorCallback

callback NavigatorUserMediaErrorCallback = undefined (DOMException error);

Callback NavigatorUserMediaErrorCallback Parameters

error of type {{DOMException}}: Error in obtaining a {{MediaStream}} as described in the failure steps of the {{Navigator.getUserMedia}} algorithm.

Implementation Suggestions

Resource reservation

The [=User Agent=] is encouraged to reserve resources when it has determined that a given call to {{MediaDevices/getUserMedia()}} will be successful. It is preferable to reserve the resource prior to resolving the returned promise. Subsequent calls to {{MediaDevices/getUserMedia()}} (in this page or any other) should treat the resource that was previously allocated, as well as resources held by other applications, as busy. Resources marked as busy should not be provided as sources to the current web page, unless specified by the user. Optionally, the [=User Agent=] may choose to provide a stream sourced from a busy source but only to a page whose origin matches the owner of the original stream that is keeping the source busy.

This document recommends that in the permission grant dialog or device selection interface (if one is present), the user be allowed to select any available hardware as a source for the stream requested by the page (provided the resource is able to fulfill any specified required constraints). Although not specifically recommended as best practice, note that some [=User Agents=] may support the ability to substitute a video or audio source with local files and other media. A file picker may be used to provide this functionality to the user.

This document also recommends that the user be shown all resources that are currently busy as a result of prior calls to getUserMedia() (in this page or any other page that is still alive) and be allowed to terminate that stream and utilize the resource for the current page instead. If possible in the current operating environment, it is also suggested that resources currently held by other applications be presented and treated in the same manner. If the user chooses this option, the track corresponding to the resource that was provided to the page whose stream was affected must be removed.

Stored Permissions

When permission is requested for a device, the [=User Agent=] may choose to store this permission for later use by the same origin, so that the user does not need to grant permission again at a later time. It is a [=User Agent=] choice whether it offers functionality to store permission to each device separately, all devices of a given class, or all devices; the choice needs to be apparent to the user, and permission must have been granted for the entire set whose permission is being stored, e.g., to store permission to use all cameras the user must have given permission to use all cameras and not just one.

As described, this specification does not dictate whether or not granting permission results in a stored permission. When permission is not stored, permission will last only until such time as all MediaStreamTracks sourced from that device have been stopped.

Handling multiple devices

A MediaStream may contain more than one video and audio track. This makes it possible to include video from two or more webcams in a single stream object, for example. However, the current API does not allow a page to express a need for multiple video streams from independent sources.

It is recommended that multiple calls to getUserMedia() from the same page be allowed as a way for pages to request multiple discrete video and/or audio streams.

Note also that if a page makes multiple getUserMedia() calls, the order in which they request resources and the order in which they complete are not constrained by this specification.

A single call to getUserMedia() will always return a stream with either zero or one audio tracks, and either zero or one video tracks. If a script calls getUserMedia() multiple times before reaching a stable state, this document advises the UI designer that the permission dialogs should be merged, so that the user can give permission for the use of multiple cameras and/or media sources in one dialog interaction. The constraints on each getUserMedia call can be used to decide which stream gets which media sources.

Generating deviceIds

An efficient practice for generating a {{MediaDeviceInfo/deviceId}} is to generate a cryptographic hash from a private key + (origin or origin + top-level origin, based on the [=User Agent=]'s partitioning rules) + salt + device's underlying (hardware) id in the driver, and present the resulting hash as an alphanumeric string. A hash length of 32 bits or fewer is recommended, but not much fewer, to avoid the risk of collisions.

A lower-entropy alternative, at the cost of storage, is to assign the numbers 0 through 255 randomly to each new device encountered for each origin or origin + top-level origin, based on the [=User Agent=]'s partitioning rules, retiring the number that hasn't been seen the longest if numbers run out.
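
The lower-entropy alternative can be sketched as a small allocator. This is an illustration under stated assumptions, not a recommended implementation: the class name SmallDeviceIdAllocator, the partitionKey string (standing in for the origin or origin + top-level origin) and the hardwareId string are all hypothetical, and Math.random() stands in for whatever source of randomness a [=User Agent=] would actually use.

```javascript
// Hypothetical helper: hands out small random numbers per partition.
class SmallDeviceIdAllocator {
  constructor(capacity = 256) {
    this.capacity = capacity;
    // partitionKey -> Map(hardwareId -> number), ordered least recently seen first.
    this.partitions = new Map();
  }

  idFor(partitionKey, hardwareId) {
    let assigned = this.partitions.get(partitionKey);
    if (!assigned) {
      assigned = new Map();
      this.partitions.set(partitionKey, assigned);
    }
    if (assigned.has(hardwareId)) {
      // Known device: re-insert it to mark it as most recently seen.
      const known = assigned.get(hardwareId);
      assigned.delete(hardwareId);
      assigned.set(hardwareId, known);
      return known;
    }
    let id;
    if (assigned.size >= this.capacity) {
      // Numbers ran out: retire the one that has not been seen the longest.
      const [oldestDevice, oldestId] = assigned.entries().next().value;
      assigned.delete(oldestDevice);
      id = oldestId;
    } else {
      // Pick a random number that is not yet in use in this partition.
      const used = new Set(assigned.values());
      const free = [];
      for (let n = 0; n < this.capacity; n++) {
        if (!used.has(n)) free.push(n);
      }
      id = free[Math.floor(Math.random() * free.length)];
    }
    assigned.set(hardwareId, id);
    return id;
  }
}
```

The storage cost mentioned above is visible in the sketch: the allocator has to remember one table per partition, whereas the hash-based approach stores only the key and salt.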

Device muting initiated by [=User Agent=]

A track sourced by a camera or microphone may be forcibly [= MediaStreamTrack/muted =] by a [=User Agent=] at any time, in order to manage a user's privacy. However, doing so may create web compatibility issues, as well as leak information about user activity, so caution is advised.

Best practice is to mute a camera or microphone track in the following instances:

An OS-level event for which the [=User Agent=] already suspends media playback globally, but JavaScript is not suspended. The rationale is users may otherwise be surprised if capture were to continue in this situation (unless they've intentionally configured it this way). If the OS-level event already causes frames to stop coming in on the track, then no new information of user activity is revealed by this. Even when this is not the case, revealing that capture is ending seems like a reasonable privacy tradeoff compared to continuing capture in situations that may surprise users.
A web page not [=Document/is in view|in view=] [=MediaStreamTrack/enabled|re-enables=] a track when all tracks from that source are [=MediaStreamTrack/enabled|disabled=], in order to delay resumption of capture until the page [=Document/is in view=].

Best practice is to [= MediaStreamTrack/muted | unmute =] a camera or microphone track that the [=User Agent=] previously [= MediaStreamTrack/muted =], in the following instances:

An OS-level event for which the [=User Agent=] already resumes media playback globally, and the page is visible to the user (e.g. not during a lock screen). [=User Agents=] may defer such action if they determine that enough time has passed to jeopardize a user's awareness of the earlier capture session.
A web page comes [=Document/is in view|into view=] and has one or more [=MediaStreamTrack/enabled=] tracks that are also [= MediaStreamTrack/muted =].
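
Because a [=User Agent=] may mute and unmute a track at any time, an application may want to observe these transitions rather than assume that media keeps flowing. The following is a minimal sketch that relies only on the track's mute and unmute events and its muted attribute; the helper name watchMuteState is hypothetical.

```javascript
// Hypothetical helper: report mute state changes of a track to the application.
// Works with any EventTarget exposing a boolean `muted` attribute, as
// MediaStreamTrack does.
function watchMuteState(track, onChange) {
  const notify = () => onChange(track.muted);
  track.addEventListener("mute", notify);
  track.addEventListener("unmute", notify);
  notify(); // report the initial state
  // Return a function that stops watching.
  return () => {
    track.removeEventListener("mute", notify);
    track.removeEventListener("unmute", notify);
  };
}
```

An application could use the callback to show a "camera paused" indicator while the track is muted.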

Constrainable Pattern

The Constrainable pattern allows applications to inspect and adjust the properties of objects implementing it (the constrainable object). It is broken out as a separate set of definitions so that it can be referred to by other specifications. The core concept is the Capability, which consists of a constrainable property of an object and the set of its possible values, which may be specified either as a range or as an enumeration. For example, a camera might be capable of framerates (a property) between 20 and 50 frames per second (a range) and may be able to be positioned (a property) facing towards the user, away from the user, or to the left or right of the user (an enumerated set). The application can examine a constrainable property's supported Capabilities via the getCapabilities() accessor.

The application can select the (range of) values it wants for an object's Capabilities by means of basic and/or advanced ConstraintSets and the applyConstraints() method. A ConstraintSet consists of the names of one or more properties of the object plus the desired value (or a range of desired values) for each property. Each of those property/value pairs can be considered to be an individual constraint. For example, the application may set a ConstraintSet containing two constraints, the first stating that the framerate of a camera be between 30 and 40 frames per second (a range) and the second that the camera should be facing the user (a specific value). How the individual constraints interact depends on whether and how they are given in the basic Constraint structure, which is a ConstraintSet with an additional 'advanced' property, or whether they are in a ConstraintSet in the advanced list. The behavior is as follows: all 'min', 'max', and 'exact' constraints in the basic Constraint structure are together treated as the required constraints, and if it is not possible to satisfy simultaneously all of those individual constraints for the indicated property names, the [=User Agent=] MUST [= reject =] the returned promise. Otherwise, it must apply the required constraints. Next, it will consider any ConstraintSets given in the advanced list, in the order in which they are specified, and will try to satisfy/apply each complete ConstraintSet (i.e., all constraints in the ConstraintSet together), but will skip a ConstraintSet if and only if it cannot satisfy/apply it in its entirety. Next, the [=User Agent=] MUST attempt to apply, individually, any 'ideal' constraints or a constraint given as a bare value for the property (referred to as optional basic constraints). Of these properties, it MUST satisfy the largest number that it can, in any order. Finally, the [=User Agent=] MUST [= resolve =] the returned promise.

Any constraint provided via this API will only be considered if the given constrainable property is supported by the [=User Agent=]. JavaScript application code is expected to first check, via getSupportedConstraints(), that all the named properties that are used are supported by the [=User Agent=]. The reason for this is that WebIDL drops any unsupported names from the dictionary holding the constraints, so the [=User Agent=] does not see them and the unsupported names end up being silently ignored. This will cause confusing programming errors as the JavaScript code will be setting constraints but the [=User Agent=] will be ignoring them. [=User Agents=] that support (recognize) the name of a required constraint but cannot satisfy it will generate an error, while [=User Agents=] that do not support the constrainable property will not generate an error.
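
A check of this kind can be sketched as follows. The helper name findUnsupportedConstraints is hypothetical, warpFactor is a made-up property name, and fluxCapacitance is the hypothetical property used elsewhere in this document. In a page, the supported dictionary would come from getSupportedConstraints(); a literal stands in for it here so that the sketch runs anywhere.

```javascript
// Hypothetical helper: list constraint names that would be silently dropped.
// `supported` has the shape returned by getSupportedConstraints(),
// e.g. {width: true, height: true}.
function findUnsupportedConstraints(constraints, supported) {
  const missing = new Set();
  const { advanced = [], ...basic } = constraints;
  // Check the basic ConstraintSet and every advanced ConstraintSet.
  for (const constraintSet of [basic, ...advanced]) {
    for (const name of Object.keys(constraintSet)) {
      if (!supported[name]) missing.add(name);
    }
  }
  return [...missing];
}

// Stand-in for navigator.mediaDevices.getSupportedConstraints().
const supported = { width: true, height: true, frameRate: true };
const unsupported = findUnsupportedConstraints(
  {
    width: { min: 640 },
    fluxCapacitance: { min: 2 },
    advanced: [{ frameRate: 60, warpFactor: 9 }],
  },
  supported
);
console.log(unsupported); // ["fluxCapacitance", "warpFactor"]
```

An application that depends on a required constraint can then fail early or fall back, instead of discovering later that the constraint was ignored.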

The following examples may help to understand how constraints work. The first shows a basic Constraint structure. Three constraints are given, each of which the [=User Agent=] will attempt to satisfy individually. Depending upon the resolutions available for this camera, it is possible that not all three constraints can be satisfied at the same time. If so, the [=User Agent=] will satisfy two if it can, or only one if not even two constraints can be satisfied together. Note that if not all three can be satisfied simultaneously, it is possible that there is more than one combination of two constraints that could be satisfied. If so, the [=User Agent=] will choose.

const stream = await navigator.mediaDevices.getUserMedia({
  video: {
    width: 1280,
    height: 720,
    aspectRatio: 3/2
  }
});

This next example adds a small bit of complexity. The ideal values are still given for width and height, but this time with minimum requirements on each as well as a minimum frameRate that must be satisfied. If the [=User Agent=] cannot satisfy the frameRate, width or height minimum it will [= reject =] the promise. Otherwise, it will try to satisfy the width, height, and aspectRatio target values as well and then [= resolve =] the promise.

try {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: {
      width: {min: 640, ideal: 1280},
      height: {min: 480, ideal: 720},
      aspectRatio: 3/2,
      frameRate: {min: 20}
    }
  });
} catch (error) {
  if (error.name != "OverconstrainedError") {
    throw error;
  }
  // Overconstrained. Try again with a different combination (no prompt was shown)
}

This example illustrates the full control possible with the Constraints structure by adding the 'advanced' property. In this case, the [=User Agent=] behaves the same way with respect to the required constraints, but before attempting to satisfy the ideal values it will process the 'advanced' list. In this example the 'advanced' list contains four ConstraintSets. The first specifies width and height constraints, the second specifies an aspectRatio constraint, and the last two specify frameRate constraints. Note that in the advanced list, these bare values are treated as 'exact' values. This example represents the following: "I need my video to be at least 640 pixels wide and at least 480 pixels high. My preference is for precisely 1920x1280, but if you can't give me that, give me an aspectRatio of 4x3 if at all possible. If even that is not possible, give me a resolution as close to 1280x720 as possible."

try {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: {
      width: {min: 640, ideal: 1280},
      height: {min: 480, ideal: 720},
      frameRate: {min: 30},
      advanced: [
        {width: 1920, height: 1280},
        {aspectRatio: 4/3},
        {frameRate: {min: 50}},
        {frameRate: {min: 40}}
      ]
    }
  });
} catch (error) {
  if (error.name != "OverconstrainedError") {
    throw error;
  }
  // Overconstrained. Try again with a different combination (no prompt was shown)
}

The ordering of advanced ConstraintSets is significant. In the preceding example it is impossible to satisfy both the 1920x1280 ConstraintSet and the 4x3 aspect ratio ConstraintSet at the same time. Since the 1920x1280 occurs first in the list, the [=User Agent=] will attempt to satisfy it first. Application authors can therefore implement a backoff strategy by specifying multiple advanced ConstraintSets for the same property. The application also specifies two more advanced ConstraintSets, the first asking for a frame rate of at least 50, the second asking for a frame rate of at least 40. If the [=User Agent=] is capable of setting a frame rate of at least 50, it will (and the subsequent ConstraintSet will be trivially satisfied). However, if the [=User Agent=] cannot set the frame rate to 50 or above, it will skip that ConstraintSet and attempt to set the frame rate to 40 or above. In case the [=User Agent=] cannot satisfy either of the two ConstraintSets, the 'min' value in the basic ConstraintSet insists on 30 as a lower bound. In other words, the [=User Agent=] would fail altogether if it couldn't get a value of at least 30, but would choose a value of at least 50 if possible, then try for a value of at least 40.

Note that, unlike basic constraints, the constraints within a ConstraintSet in the advanced list must be satisfied together or skipped together. Thus, {width: 1920, height: 1280} is a request for that specific resolution, not a request for that width or that height. One can think of the basic constraints as requesting an 'or' (non-exclusive) of the individual constraints, while each advanced ConstraintSet is requesting an 'and' of the individual constraints in the ConstraintSet. An application may inspect the full set of Constraints currently in effect via the getConstraints() accessor.

The specific value that the [=User Agent=] chooses for a constrainable property is referred to as a Setting. For example, if the application applies a ConstraintSet specifying that the frameRate must be at least 30 frames per second, and no greater than 40, the Setting can be any intermediate value, e.g., 32, 35, or 37 frames per second. The application can query the current settings of the object's constrainable properties via the {{MediaStreamTrack/getSettings()}} accessor.

Interface Definition

Although this specification formally defines ConstrainablePattern as a WebIDL interface, it is actually a template or pattern for other interfaces and cannot be inherited directly since the return values of the methods need to be extended, something WebIDL cannot do. Thus, each interface that wishes to make use of the functionality defined here will have to provide its own copy of the WebIDL for the functions and interfaces given here. However it can refer to the semantics defined here, which will not change. See MediaStreamTrack Interface Definition for an example of this.

This pattern relies on the constrainable object defining three internal slots:

A [[\Capabilities]] internal slot, initialized to a Capabilities dictionary describing the aggregate allowable values for each constrainable property exposed, as explained under Capabilities, or an empty dictionary if it has none.
A [[\Constraints]] internal slot, initialized to an empty Constraints dictionary.
A [[\Settings]] internal slot, initialized to a Settings dictionary describing the currently active settings values for each constrainable property exposed, as explained under Settings, or an empty dictionary if it has none.

Template:

[Exposed=Window]
interface ConstrainablePattern {
  Capabilities  getCapabilities();
  Constraints   getConstraints();
  Settings      getSettings();
  Promise<undefined> applyConstraints(optional Constraints constraints = {});
};

Methods

getCapabilities

The getCapabilities() method returns the dictionary of the names of the constrainable properties that the object supports. When invoked, the [=User Agent=] MUST return the value of the [[\Capabilities]] internal slot.

It is possible that the underlying hardware may not exactly map to the range defined for the constrainable property. Where this is possible, the entry SHOULD define how to translate and scale the hardware's setting onto the values defined for the property. For example, suppose that a hypothetical fluxCapacitance property ranges from -10 (min) to 10 (max), but there are common hardware devices that support only values of "off", "medium", and "full". The constrainable property definition might specify that for such hardware, the [=User Agent=] should map the range value of -10 to "off", 10 to "full", and 0 to "medium". It might also indicate that given a ConstraintSet imposing a strict value of 3, the [=User Agent=] should attempt to set the value of "medium" on the hardware, and that {{getSettings()}} should return a fluxCapacitance of 0, since that is the value defined as corresponding to "medium".
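
The fluxCapacitance example can be sketched in code. The nearest-value rule used here is an assumption made for illustration; a real property definition would have to state its own translation rule. The names FLUX_LEVELS and mapFluxCapacitance are hypothetical.

```javascript
// Hypothetical mapping for the fluxCapacitance example: the hardware offers
// three levels, each tied to one value in the property's -10..10 range.
const FLUX_LEVELS = [
  { hardware: "off", value: -10 },
  { hardware: "medium", value: 0 },
  { hardware: "full", value: 10 },
];

// Pick the hardware level whose defined value is nearest to the requested one
// (assumed rule; on a tie the earlier level in the list is kept).
function mapFluxCapacitance(requested) {
  return FLUX_LEVELS.reduce((best, level) =>
    Math.abs(level.value - requested) < Math.abs(best.value - requested) ? level : best
  );
}

const chosen = mapFluxCapacitance(3);
console.log(chosen.hardware); // "medium"
console.log(chosen.value);    // 0, the value getSettings() would report
```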

{{getConstraints}}

The getConstraints() method returns the Constraints that were the argument to the most recent successful invocation of the ApplyConstraints algorithm on the object, maintaining the order in which they were specified. Note that some of the advanced ConstraintSets returned may not be currently satisfied. To check which ConstraintSets are currently in effect, the application should use {{getSettings}}. Instead of returning the exact constraints as described above, the UA MAY return a constraint set that has the identical effect in all situations as the applied constraints. When invoked, the [=User Agent=] MUST return the value of the [[\Constraints]] internal slot.

getSettings

The getSettings() method returns the current settings of all the constrainable properties of the object, whether they are platform defaults or have been set by the ApplyConstraints algorithm. Note that a setting is a target value that complies with constraints, and therefore may differ from measured performance at times. When invoked, the User Agent MUST return the value of the [[\Settings]] internal slot.

applyConstraints

When the applyConstraints template method is invoked, the [=User Agent=] MUST run the following steps:

Let object be the object on which this method was invoked.
Let newConstraints be the argument to this method.
Let p be a new promise.
Run the following steps in parallel, maintaining the order of invocations if this method is called multiple times:
1. Let failedConstraint be the result of running the ApplyConstraints algorithm with newConstraints as the argument.
2. Let successfulSettings be the object's current settings after the algorithm in the above step has finished.
3. Queue a task that runs the following steps:
  1. If failedConstraint is not undefined, let message be either undefined or an informative human-readable message, [= reject =] p with a new OverconstrainedError created by calling OverconstrainedError(failedConstraint, message), and abort these steps. The existing constraints remain in effect in this case.
  2. Set object's [[\Constraints]] internal slot to newConstraints or a Constraints dictionary that has the identical effect in all situations as newConstraints.
  3. Set object's [[\Settings]] internal slot to successfulSettings.
  4. [= resolve =] p with undefined.
Return p.
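
The ordering and error behavior of these steps can be sketched with a promise chain. This is a simplified model, not a conforming implementation: the class name ConstrainableSketch is hypothetical, the runApplyConstraints callback stands in for the ApplyConstraints algorithm and is assumed to return an object of the form {failedConstraint, settings}, and a plain Error with matching name and constraint fields stands in for OverconstrainedError, which only exists in browsers.

```javascript
// Hypothetical object following the template.
class ConstrainableSketch {
  #constraints = {};
  #settings;
  #runApplyConstraints;
  #pending = Promise.resolve(); // serializes invocations in call order

  constructor(initialSettings, runApplyConstraints) {
    this.#settings = initialSettings;
    this.#runApplyConstraints = runApplyConstraints;
  }

  getConstraints() { return this.#constraints; }
  getSettings() { return this.#settings; }

  applyConstraints(newConstraints = {}) {
    const p = this.#pending.then(() => {
      const { failedConstraint, settings } = this.#runApplyConstraints(newConstraints);
      if (failedConstraint !== undefined) {
        // Stand-in for OverconstrainedError(failedConstraint, message).
        const error = new Error("constraint cannot be satisfied");
        error.name = "OverconstrainedError";
        error.constraint = failedConstraint;
        throw error; // existing constraints and settings stay in effect
      }
      // Success: update both internal slots, then resolve with undefined.
      this.#constraints = newConstraints;
      this.#settings = settings;
    });
    // A rejection must not block later invocations.
    this.#pending = p.catch(() => {});
    return p;
  }
}
```

Chaining each invocation onto the previous one is one way to maintain the order of invocations; the internal slots are only updated when the algorithm succeeds, which mirrors the rule that existing constraints remain in effect on failure.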

The [=ApplyConstraints algorithm=] for applying constraints is stated below. Here are some preliminary definitions that are used in the statement of the algorithm:

We use the term settings dictionary for the set of values that might be applied as settings to the object.

For string valued constraints, we define "==" below to be true if one of the values in the sequence is exactly the same as the value being compared against.

We define the fitness distance between a settings dictionary and a constraint set CS as the sum, for each member (represented by a constraintName and constraintValue pair) which [= map/exist =]s in CS, of the following values:

If constraintName is not supported by the [=User Agent=], the fitness distance is 0.
If the constraint is required (constraintValue either contains one or more members named 'min', 'max', or 'exact', or is itself a bare value and bare values are to be treated as 'exact'), and the settings dictionary's constraintName member's value does not satisfy the constraint or doesn't [= map/exist =], the fitness distance is positive infinity.
If the constraint does not apply for this type of object, the fitness distance is 0 (that is, the constraint does not influence the fitness distance).
If constraintValue is a boolean, but the constrainable property is not, then the fitness distance is based on whether the settings dictionary's constraintName member [= map/exist | exists =] or not, from the formula
```
(constraintValue == exists) ? 0 : 1
```
If the settings dictionary's constraintName member does [= map/exist | not exist=], the fitness distance is 1.
If no ideal value is specified (constraintValue either contains no member named 'ideal', or, if bare values are to be treated as 'ideal', isn't a bare value), the fitness distance is 0.
For all positive numeric constraints (such as height, width, frameRate, aspectRatio, sampleRate and sampleSize), the fitness distance is the result of the formula
```
(actual == ideal) ? 0 : |actual - ideal| / max(|actual|, |ideal|)
```
For all string, enum and boolean constraints (e.g. deviceId, groupId, facingMode, resizeMode, echoCancellation), the fitness distance is the result of the formula
```
(actual == ideal) ? 0 : 1
```
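
The rules above can be sketched as a single function. This is a simplified illustration under stated assumptions: it omits the rule for constraints that do not apply to the type of object and the rule for boolean constraint values on non-boolean properties, and it takes the set of supported names as an explicit supportedNames argument. The function and helper names are hypothetical.

```javascript
const isDictionary = (v) => typeof v === "object" && v !== null && !Array.isArray(v);

// "==" for sequences: true if any listed value equals the actual value.
const matchesAny = (actual, wanted) =>
  (Array.isArray(wanted) ? wanted : [wanted]).includes(actual);

function fitnessDistance(settings, constraintSet, supportedNames, bareValuesAre = "ideal") {
  let total = 0;
  for (const [name, raw] of Object.entries(constraintSet)) {
    if (name === "advanced") continue;                     // not a constraint
    if (!supportedNames.has(name)) continue;               // unsupported: adds 0
    if (Array.isArray(raw) && raw.length === 0) continue;  // empty list: no constraint
    // Normalize a bare value into {ideal: v} or {exact: v}.
    const c = isDictionary(raw) ? raw : { [bareValuesAre]: raw };
    const exists = name in settings;
    const actual = settings[name];
    if ("min" in c || "max" in c || "exact" in c) {
      const satisfied = exists &&
        (!("min" in c) || actual >= c.min) &&
        (!("max" in c) || actual <= c.max) &&
        (!("exact" in c) || matchesAny(actual, c.exact));
      if (!satisfied) return Infinity;                     // required and unmet
    }
    if (!exists) { total += 1; continue; }                 // missing member: adds 1
    if (!("ideal" in c)) continue;                         // no ideal: adds 0
    if (typeof actual === "number") {
      total += actual === c.ideal
        ? 0
        : Math.abs(actual - c.ideal) / Math.max(Math.abs(actual), Math.abs(c.ideal));
    } else {
      total += matchesAny(actual, c.ideal) ? 0 : 1;
    }
  }
  return total;
}

const current = { width: 1280, height: 720, frameRate: 30, facingMode: "user" };
const names = new Set(Object.keys(current));
console.log(fitnessDistance(current, { width: 1920 }, names));            // |1280 - 1920| / 1920
console.log(fitnessDistance(current, { frameRate: { min: 50 } }, names)); // Infinity
```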

More definitions:

We refer to each element of a ConstraintSet (other than the special term 'advanced') as a 'constraint' since it is intended to constrain the acceptable settings for the given property from the full list or range given in the corresponding Capability of the ConstrainablePattern object to a value that is within the range or list of values it specifies.
We refer to the "effective Capability" C of an object O as the possibly proper subset of the possible values of C (as returned by getCapabilities) taking into consideration environmental limitations and/or restrictions placed by other constraints. For example given a ConstraintSet that constrains the aspectRatio, height, and width properties, the values assigned to any two of the properties limit the effective Capability of the third. The set of effective Capabilities may be platform dependent. For example, on a resource-limited device it may not be possible to set properties P1 and P2 both to 'high', while on another less limited device, this may be possible.
A settings dictionary, which is a set of values for the constrainable properties of an object O, satisfies ConstraintSet CS if the fitness distance between the set and CS is less than infinity.
A set of ConstraintSets CS1...CSn (n >= 1) can be satisfied by an object O if it is possible to find a settings dictionary of O that satisfies CS1...CSn simultaneously.
To apply a set of ConstraintSets CS1...CSn to object O is to choose such a sequence of values that satisfy CS1...CSn and assign them as the settings for the properties of O.

We define the SelectSettings algorithm as follows:

Each constraint specifies one or more values (or a range of values) for its property. A property MAY appear more than once in the list of 'advanced' ConstraintSets. If an empty list has been given as the value for a constraint, it MUST be interpreted as if the constraint were not specified (in other words, an empty constraint == no constraint).
Note that unknown properties are discarded by WebIDL, which means that unknown/unsupported required constraints will silently disappear. To avoid this being a surprise, application authors are expected to first use the {{MediaDevices/getSupportedConstraints()}} method as shown in the Examples below.
Let object be the ConstrainablePattern object on which this algorithm is applied. Let copy be an unconstrained copy of object (i.e., copy should behave as if it were object with all ConstraintSets removed.)
For every possible settings dictionary of copy compute its fitness distance, treating bare values of properties as ideal values. Let candidates be the set of settings dictionaries for which the fitness distance is finite.
If candidates is empty, return undefined as the result of the SelectSettings algorithm.
Iterate over the 'advanced' ConstraintSets in newConstraints in the order in which they were specified. For each ConstraintSet:
1. compute the fitness distance between it and each settings dictionary in candidates, treating bare values of properties as exact.
2. If the fitness distance is finite for one or more settings dictionaries in candidates, keep those settings dictionaries in candidates, discarding others. If the fitness distance is infinite for all settings dictionaries in candidates, ignore this ConstraintSet.
Select one settings dictionary from candidates, and return it as the result of the SelectSettings algorithm. The [=User Agent=] MUST use one with the smallest fitness distance, as calculated in step 3. If more than one settings dictionary has the smallest fitness distance, the [=User Agent=] chooses one of them based on system default property values and [=User Agent=] default property values.
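
The overall shape of the algorithm can be sketched over a finite list of possible settings dictionaries. This is an illustration, not a conforming implementation: real devices may have continuous ranges, the tie-break here simply keeps the first candidate instead of consulting default values, and numericDistance is a simplified numeric-only stand-in for the fitness distance that assumes every settings dictionary contains every named property. All function names are hypothetical.

```javascript
// Simplified, numeric-only stand-in for the fitness distance.
function numericDistance(settings, constraintSet, bareValuesAre) {
  let total = 0;
  for (const [name, raw] of Object.entries(constraintSet)) {
    const c = typeof raw === "object" ? raw : { [bareValuesAre]: raw };
    const actual = settings[name];
    if (("min" in c && actual < c.min) || ("max" in c && actual > c.max) ||
        ("exact" in c && actual !== c.exact)) return Infinity;
    if ("ideal" in c && actual !== c.ideal) {
      total += Math.abs(actual - c.ideal) / Math.max(Math.abs(actual), Math.abs(c.ideal));
    }
  }
  return total;
}

function selectSettings(possibleSettings, constraints, distance = numericDistance) {
  const { advanced = [], ...basic } = constraints;
  // Basic ConstraintSet: bare values are ideal; drop infinite distances.
  let candidates = possibleSettings
    .map((settings) => ({ settings, basicDistance: distance(settings, basic, "ideal") }))
    .filter((c) => Number.isFinite(c.basicDistance));
  if (candidates.length === 0) return undefined;
  // Advanced ConstraintSets, in order: bare values are exact; a set that no
  // remaining candidate satisfies is skipped in its entirety.
  for (const constraintSet of advanced) {
    const kept = candidates.filter((c) =>
      Number.isFinite(distance(c.settings, constraintSet, "exact")));
    if (kept.length > 0) candidates = kept;
  }
  // Smallest basic fitness distance wins; the first candidate wins a tie.
  return candidates.reduce((best, c) =>
    (c.basicDistance < best.basicDistance ? c : best)).settings;
}

const modes = [
  { width: 640, height: 480, aspectRatio: 4 / 3, frameRate: 30 },
  { width: 1280, height: 720, aspectRatio: 16 / 9, frameRate: 30 },
  { width: 1280, height: 960, aspectRatio: 4 / 3, frameRate: 60 },
];
const picked = selectSettings(modes, {
  width: { min: 640, ideal: 1280 },
  height: { min: 480, ideal: 720 },
  frameRate: { min: 30 },
  advanced: [{ width: 1920, height: 1280 }, { aspectRatio: 4 / 3 }, { frameRate: { min: 50 } }],
});
console.log(`${picked.width}x${picked.height}`); // 1280x960
```

In this run the 1920x1280 ConstraintSet is skipped because no mode satisfies it, the aspectRatio ConstraintSet removes the 16:9 mode, and the frameRate ConstraintSet leaves only the 60 frames per second mode, even though 1280x720 has the smallest basic fitness distance.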

For any property with a system default value for the selected device, the system default value SHOULD be used if compatible with the above algorithm. This is usually the case for properties like sampleRate or sampleSize. Other properties, like echoCancellation or resizeMode do not usually have system default values. The [=User Agent=] defines its own default values for these properties. Implementors need to be cautious to select good default values since they will often have an impact on how media content is generated.

It is recommended to look at existing implementations to select meaningful default values. Note that default values may differ based on the system, for instance desktop vs. mobile. At time of writing, [=User Agent=] implementations tend to use the following default values, which were chosen for their suitability for using RTCPeerConnection as a sink:

width set to 640.
height set to 480.
frameRate set to 30.
echoCancellation set to true.

To apply the ApplyConstraints algorithm to an object, given newConstraints as an argument, the [=User Agent=] MUST run the following steps:

Let successfulSettings be the result of running the SelectSettings algorithm with newConstraints as the constraint set.
If successfulSettings is undefined, let failedConstraint be any required constraint whose fitness distance was infinity for all settings dictionaries examined while executing the SelectSettings algorithm, or "" if there isn't one, and then return failedConstraint and abort these steps.
In a single operation, remove the existing constraints from object, apply newConstraints, and apply successfulSettings as the current settings.
Return undefined.

Any implementation that has the same result as the algorithm above is an allowed implementation. For instance, the implementation may choose to keep track of the maximum and minimum values for a setting that are OK under the constraints considered, rather than keeping track of all possible values for the setting.
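
As an illustration of the range-tracking approach mentioned above, the following sketch narrows an allowed range for one numeric property using the required parts of a constraint. The function name narrowRange and the {min, max} range shape are assumptions made for this example.

```javascript
// Hypothetical helper: intersect an allowed {min, max} range with the required
// parts ('min', 'max', 'exact') of one numeric constraint.
// Returns undefined when no value can satisfy the constraint.
function narrowRange(range, constraint) {
  // Bare values and 'ideal' are not required, so they do not narrow the range.
  const c = typeof constraint === "object" && constraint !== null ? constraint : {};
  let { min, max } = range;
  if ("exact" in c) {
    min = Math.max(min, c.exact);
    max = Math.min(max, c.exact);
  }
  if ("min" in c) min = Math.max(min, c.min);
  if ("max" in c) max = Math.min(max, c.max);
  return min <= max ? { min, max } : undefined;
}

const frameRateRange = narrowRange({ min: 1, max: 60 }, { min: 30, max: 40 });
console.log(frameRateRange); // { min: 30, max: 40 }
```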

When picking a settings dictionary, the UA can use any information available to it. Examples of such information may be whether the selection is done as part of device selection in getUserMedia, whether the energy usage of the camera varies between the settings dictionaries, or whether using a settings dictionary will cause the device driver to apply resampling.

The [=User Agent=] MAY choose new settings for the constrainable properties of the object at any time. When it does so it MUST attempt to satisfy all current Constraints, in the manner described in the algorithm above, let successfulSettings be the resulting new settings, and queue a task to run the following steps:

Let object be the ConstrainablePattern object on which new settings for one or more constrainable properties have changed.
Set object's [[\Settings]] internal slot to successfulSettings.

An example of Constraints that could be passed into {{MediaStreamTrack/applyConstraints()}} or returned as a value of constraints is below. It uses the constrainable properties defined for camera-sourced {{MediaStreamTrack}}s. In this example, all constraints are ideal values, which means results are "best effort" based on the user's specific camera:

await track.applyConstraints({
  width: 1920,
  height: 1080,
  frameRate: 30,
});
const {width, height, frameRate} = track.getSettings();

console.log(`${width}x${height}x${frameRate}`); // 1920x1080x30, or it might be e.g.
                                                // 1280x720x30 as best effort

For finer control, an application can insist on an exact match, provided it's prepared to handle failure:

try {
  await track.applyConstraints({
    width: {exact: 1920},
    height: {exact: 1080},
    frameRate: {min: 25, ideal: 30, max: 30},
  });
  const {width, height, frameRate} = track.getSettings();

  console.log(`${width}x${height}x${frameRate}`); // 1920x1080x25-30!

} catch (error) {
  if (error.name != "OverconstrainedError") {
    throw error;
  }
  console.log(`This camera cannot produce the requested ${error.constraint}.`);
}

Constraints can also be passed into {{MediaDevices/getUserMedia}}, not just as an initialization convenience, but to influence device selection. In this case, [= list of inherent constrainable track properties | inherent constraints =] are also available.

Here's an example of using constraints to prefer a specific camera and microphone from a previous visit, with requirements on dimensions and a preference for stereo, to be applied once granted, and to help find suitable replacements in case the requested devices are no longer available (or, in some user agents, overridden by the user).

try {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: {
      deviceId: localStorage.camId,
      width: {min: 800, ideal: 1024, max: 1280},
      height: {min: 600}
    },
    audio: {
      deviceId: localStorage.micId,
      channelCount: 2
    }
  });

  // Granted. Store deviceIds for next time
  localStorage.camId = stream.getVideoTracks()[0].getSettings().deviceId;
  localStorage.micId = stream.getAudioTracks()[0].getSettings().deviceId;

} catch (error) {
  if (error.name != "OverconstrainedError") {
    throw error;
  }
  // Overconstrained. No suitable replacements found
}

The above example avoids using {exact: deviceId}, so that browsers can immediately offer a choice between different cameras if your preferred device is not available.

The example also stores the deviceIds on every grant, in case they represent a new choice.

In contrast, here's an example of using constraints to implement an in-content camera picker. In this case, we use exact and rely solely on a deviceId that comes from the user picking from a list of choices:

async function switchCameraTrack(freshlyChosenDeviceId, oldTrack) {
  if (isMobile) {
    oldTrack.stop(); // Some platforms can only open one camera at a time.
  }
  const stream = await navigator.mediaDevices.getUserMedia({
    video: {
      deviceId: {exact: freshlyChosenDeviceId}
    }
  });
  const [track] = stream.getVideoTracks();
  localStorage.camId = track.getSettings().deviceId;
  return track;
}

Here's an example asking for a back camera on a phone, ideally in 720p, but accepting anything close to that. Note how constraints on dimensions are specified in landscape mode:

async function getBackCamera() {
  return await navigator.mediaDevices.getUserMedia({
    video: {
      facingMode: {exact: 'environment'},
      width: 1280,
      height: 720
    }
  });
}

Here's an example of "I want a native 16:9 resolution near 720p, but with an exact frame rate of 10 even if not natively available". This needs to be done in two steps: one to discover the native mode, and a second to apply the custom frame rate. This also shows how to derive constraints from current settings, which may be rotated:

async function nativeResolutionButDecimatedFrameRate() {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: {
      resizeMode: 'none', // means native resolution and frame rate
      width: 1280,
      height: 720,
      aspectRatio: 16 / 9 // aspect ratios may not be exactly accurate
    }
  });
  const [track] = stream.getVideoTracks();
  let {width, height, aspectRatio} = track.getSettings(); // let: may be swapped below

  // Constraints are in landscape, while settings may be rotated (portrait)
  if (width < height) {
    [width, height] = [height, width];
    aspectRatio = 1 / aspectRatio;
  }

  await track.applyConstraints({
    resizeMode: 'crop-and-scale',
    width: {exact: width},
    height: {exact: height},
    frameRate: {exact: 10},
    aspectRatio,
  });

  return stream;
}

The above example assumes the primary orientation is landscape.
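
The rotation handling in the example above can also be isolated into a small pure function, which makes the landscape assumption explicit and easy to test. The helper name toLandscape is hypothetical and is not defined by this specification.

```javascript
// Hypothetical helper: given settings that may be rotated to portrait,
// returns landscape-oriented values suitable for use as constraints.
function toLandscape({width, height, aspectRatio}) {
  if (width < height) {
    // Portrait: swap the dimensions and invert the aspect ratio.
    return {width: height, height: width, aspectRatio: 1 / aspectRatio};
  }
  return {width, height, aspectRatio};
}

const rotatedSettings = {width: 720, height: 1280, aspectRatio: 720 / 1280};
const landscapeValues = toLandscape(rotatedSettings);
// landscapeValues has width 1280, height 720 and an aspect ratio near 16/9.
```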

Here's an example showing how to use {{MediaDevices/getSupportedConstraints}}, for cases where a constraint being ignored due to lack of support in a user agent is not tolerated by the application:

async function getFrontCameraRes() {
  const supports = navigator.mediaDevices.getSupportedConstraints();

  for (const constraint of ["facingMode", "aspectRatio", "resizeMode"]) {
    if (!(constraint in supports)) {
      throw new OverconstrainedError(constraint, "Not supported");
    }
  }
  return await navigator.mediaDevices.getUserMedia({
    video: {
      facingMode: {exact: 'user'},
      advanced: [
        {aspectRatio: 16/9, height: 1080, resizeMode: "none"},
        {aspectRatio: 4/3, width: 1280, resizeMode: "none"}
      ]
    }
  });
}

Constraint Types

The syntax for the specification of the set of valid inputs depends on the type of the values. In addition to the standard atomic types (boolean, long, double, DOMString), valid values include lists of any of the atomic types, plus min-max ranges, as defined below.

List values MUST be interpreted as disjunctions. For example, if a property 'facingMode' for a camera is defined as having valid values ["left", "right", "user", "environment"], this means that 'facingMode' can have any one of the values "left", "right", "environment", or "user". Similarly, Constraints restricting 'facingMode' to ["user", "left", "right"] would mean that the [=User Agent=] should select a camera (or point the camera, if that is possible) so that "facingMode" is either "user", "left", or "right". This Constraint would thus request that the camera not be facing away from the user, but would leave the [=User Agent=] free to let the user choose among the other directions.
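
A minimal sketch of the disjunction rule, assuming a required constraint expressed either as a single string or as a list of strings. The helper name matchesAny is hypothetical and is not defined by this specification.

```javascript
// Hypothetical helper: true when `value` satisfies a required string
// constraint given either as one string or as a list of strings.
function matchesAny(value, allowed) {
  const list = Array.isArray(allowed) ? allowed : [allowed];
  // A list is a disjunction: matching any one entry is sufficient.
  return list.includes(value);
}

const notFacingAway = ["user", "left", "right"];
const leftIsAllowed = matchesAny("left", notFacingAway);               // true
const environmentIsAllowed = matchesAny("environment", notFacingAway); // false
const singleValueMatches = matchesAny("user", "user");                 // true
```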

dictionary DoubleRange {
  double max;
  double min;
};

Dictionary DoubleRange Members

max of type {{double}}: The maximum valid value of this property.
min of type {{double}}: The minimum value of this property.

dictionary ConstrainDoubleRange : DoubleRange {
  double exact;
  double ideal;
};

Dictionary ConstrainDoubleRange Members

exact of type {{double}}: The exact required value for this property.
ideal of type {{double}}: The ideal (target) value for this property.

dictionary ULongRange {
  [Clamp] unsigned long max;
  [Clamp] unsigned long min;
};

Dictionary ULongRange Members

max of type {{unsigned long}}: The maximum valid value of this property.
min of type {{unsigned long}}: The minimum value of this property.

dictionary ConstrainULongRange : ULongRange {
  [Clamp] unsigned long exact;
  [Clamp] unsigned long ideal;
};

Dictionary ConstrainULongRange Members

exact of type {{unsigned long}}: The exact required value for this property.
ideal of type {{unsigned long}}: The ideal (target) value for this property.

dictionary ConstrainBooleanParameters {
  boolean exact;
  boolean ideal;
};

Dictionary ConstrainBooleanParameters Members

exact of type {{boolean}}: The exact required value for this property.
ideal of type {{boolean}}: The ideal (target) value for this property.

dictionary ConstrainDOMStringParameters {
  (DOMString or sequence<DOMString>) exact;
  (DOMString or sequence<DOMString>) ideal;
};

Dictionary ConstrainDOMStringParameters Members

exact of type ({{DOMString}} or sequence<{{DOMString}}>): The exact required value for this property.
ideal of type ({{DOMString}} or sequence<{{DOMString}}>): The ideal (target) value for this property.

typedef ([Clamp] unsigned long or ConstrainULongRange) ConstrainULong;

Throughout this specification, the identifier ConstrainULong is used to refer to the ([Clamp] unsigned long or ConstrainULongRange) type.

typedef (double or ConstrainDoubleRange) ConstrainDouble;

Throughout this specification, the identifier ConstrainDouble is used to refer to the (double or ConstrainDoubleRange) type.

typedef (boolean or ConstrainBooleanParameters) ConstrainBoolean;

Throughout this specification, the identifier ConstrainBoolean is used to refer to the (boolean or ConstrainBooleanParameters) type.

typedef (DOMString or
         sequence<DOMString> or
         ConstrainDOMStringParameters) ConstrainDOMString;

Throughout this specification, the identifier ConstrainDOMString is used to refer to the (DOMString or sequence<DOMString> or ConstrainDOMStringParameters) type.
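
Because each of these union types accepts either a bare value or a dictionary, code that inspects constraints often normalizes them to the dictionary form first. The sketch below assumes the constraint appears in the basic constraint set, where a bare value acts as an ideal value, as in the applyConstraints example earlier in this section. The helper name toDictionaryForm is hypothetical and is not defined by this specification.

```javascript
// Hypothetical helper: converts a bare value (number, boolean, string, or
// list of strings) into the dictionary form, treating it as `ideal`.
// Values that are already dictionaries are copied unchanged.
function toDictionaryForm(constraint) {
  const isDictionary =
    typeof constraint === "object" &&
    constraint !== null &&
    !Array.isArray(constraint);
  return isDictionary ? {...constraint} : {ideal: constraint};
}

const widthConstraint = toDictionaryForm(1920);               // {ideal: 1920}
const heightConstraint = toDictionaryForm({exact: 1080});     // {exact: 1080}
const facingConstraint = toDictionaryForm(["user", "left"]);  // {ideal: ["user", "left"]}
```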

Capabilities

Capabilities is a dictionary containing one or more key-value pairs, where each key MUST be a constrainable property, and each value MUST be a subset of the set of values allowed for that property. The exact syntax of the value expression depends on the type of the property. The Capabilities dictionary specifies which constrainable properties can be applied, as constraints, to the constrainable object. Note that the Capabilities of a constrainable object MAY be a subset of the properties defined in the Web platform, with a subset of the set of values for those properties. Capabilities are returned from the [=User Agent=] to the application, and cannot be specified by the application. However, the application can control the Settings that the [=User Agent=] chooses for constrainable properties by means of Constraints.

An example of a Capabilities dictionary is shown below. In this case, the constrainable object is a video source with a very limited set of Capabilities.

{
  frameRate: {min: 1.0, max: 60.0},
  facingMode: ['user', 'left']
}

The next example shows that capabilities for range values provide ranges for individual constrainable properties, not for combinations of them. This is particularly relevant for video width and height, since the ranges for width and height are reported separately. In the example, if the constrainable object can only provide 640x480 and 800x600 resolutions, the relevant capabilities returned would be:

{
  width: {min: 640, max: 800},
  height: {min: 480, max: 600},
  aspectRatio: {min: 4/3, max: 4/3}
}

Note in the example above that the aspectRatio would make clear that arbitrary combinations of widths and heights are not possible, although it would still suggest that more than two resolutions were available.
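
To see why per-property ranges lose information about combinations, the following sketch derives the capability ranges above from a list of supported modes and then tests a resolution that the source cannot actually produce. The modes list and the helper names rangeOf and inRange are hypothetical and are used only for illustration.

```javascript
// Hypothetical list of the only resolutions the source can produce.
const modes = [
  {width: 640, height: 480},
  {width: 800, height: 600},
];

// Hypothetical helper: reports the min and max of one property on its own.
function rangeOf(values) {
  return {min: Math.min(...values), max: Math.max(...values)};
}

const derivedCapabilities = {
  width: rangeOf(modes.map(m => m.width)),
  height: rangeOf(modes.map(m => m.height)),
  aspectRatio: rangeOf(modes.map(m => m.width / m.height)),
};

const inRange = (value, range) => value >= range.min && value <= range.max;

// 640x600 is not a supported mode, yet each dimension is within its range.
const fitsWidthAndHeight =
  inRange(640, derivedCapabilities.width) &&
  inRange(600, derivedCapabilities.height);
// Only the aspectRatio range reveals that the combination is unavailable.
const fitsAspectRatio = inRange(640 / 600, derivedCapabilities.aspectRatio);
```

Here fitsWidthAndHeight is true while fitsAspectRatio is false, which is the situation the note above describes.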

A specification using the Constrainable Pattern should not subclass the below dictionary, but instead provide its own definition. See {{MediaTrackCapabilities}} for an example.

Template:

dictionary Capabilities {};

Settings

Settings is a dictionary containing one or more key-value pairs. It MUST contain each key returned in getCapabilities() for which the property is defined on the object type it's returned on; for instance, an audio {{MediaStreamTrack}} has no "width" property. There MUST be a single value for each key and the value MUST be a member of the set defined for that property by getCapabilities(). The Settings dictionary contains the actual values that the [=User Agent=] has chosen for the object's constrainable properties. The exact syntax of the value depends on the type of the property.

A conforming [=User Agent=] MUST support all the constrainable properties defined in this specification.

An example of a Settings dictionary is shown below. This example is not very realistic in that a [=User Agent=] would actually be required to support more constrainable properties than just these.

{
  frameRate: 30.0,
  facingMode: 'user'
}
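
The relationship between Settings and Capabilities can be checked mechanically. The sketch below assumes capabilities are expressed either as {min, max} ranges, as lists of allowed values, or as single values, and that every capability key applies to the object being checked. The helper names isWithinCapability and settingsMatchCapabilities are hypothetical and are not defined by this specification.

```javascript
// Hypothetical helper: checks one setting against its capability, which may
// be a {min, max} range, a list of allowed values, or a single value.
function isWithinCapability(value, capability) {
  if (Array.isArray(capability)) return capability.includes(value);
  if (typeof capability === "object" && capability !== null) {
    return value >= capability.min && value <= capability.max;
  }
  return value === capability;
}

// Hypothetical helper: every capability key must be present in the settings
// with a single value drawn from the capability's set.
function settingsMatchCapabilities(settings, capabilities) {
  return Object.keys(capabilities).every(
    key => key in settings && isWithinCapability(settings[key], capabilities[key])
  );
}

const exampleCapabilities = {
  frameRate: {min: 1.0, max: 60.0},
  facingMode: ["user", "left"],
};
const exampleSettings = {frameRate: 30.0, facingMode: "user"};
const settingsAreValid =
  settingsMatchCapabilities(exampleSettings, exampleCapabilities); // true
```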

A specification using the Constrainable Pattern should not subclass the below dictionary, but instead provide its own definition. See {{MediaTrackSettings}} for an example.

Template:

dictionary Settings {};

Constraints and ConstraintSet

Due to the limitations of WebIDL, interfaces implementing the Constrainable Pattern cannot simply subclass Constraints and ConstraintSet as they are defined here. Instead they must provide their own definitions that follow this pattern. See {{MediaTrackConstraints}} for an example of this.

Template:

dictionary ConstraintSet {};

Each member of a ConstraintSet corresponds to a constrainable property and specifies a subset of the property's valid Capability values. Applying a ConstraintSet instructs the [=User Agent=] to restrict the settings of the corresponding constrainable properties to the specified values or ranges of values. A given property MAY occur both in the basic Constraints set and in the advanced ConstraintSets list, and MAY occur at most once in each ConstraintSet in the advanced list.

Template:

dictionary Constraints : ConstraintSet {
  sequence<ConstraintSet> advanced;
};

Dictionary Constraints Members

advanced of type sequence<{{ConstraintSet}}>: This is the list of ConstraintSets that the [=User Agent=] MUST attempt to satisfy, in order, skipping only those that cannot be satisfied. The order of these ConstraintSets is significant. In particular, when they are passed as an argument to applyConstraints, the [=User Agent=] MUST try to satisfy them in the order that is specified. Thus if advanced ConstraintSets C1 and C2 can be satisfied individually, but not together, then whichever of C1 and C2 is first in this list will be satisfied, and the other will not. The [=User Agent=] MUST attempt to satisfy all ConstraintSets in the list, even if some cannot be satisfied. Thus, in the preceding example, if constraint C3 is specified after C1 and C2, the [=User Agent=] will attempt to satisfy C3 even though C2 cannot be satisfied. Note that a given property name may occur only once in each ConstraintSet but may occur in more than one ConstraintSet.
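
The ordering rule for advanced ConstraintSets can be sketched as a filter over candidate settings dictionaries, where a set that would leave no candidates is skipped rather than applied. This is an illustrative simplification, not the selection algorithm of this specification: the helper names satisfies and applyAdvanced and the candidate list are hypothetical, and bare values in the sets are treated as exact.

```javascript
// Hypothetical predicate: does a candidate settings dictionary satisfy every
// member of a constraint set? Members are {min, max} ranges or bare values,
// which this sketch treats as exact.
function satisfies(candidate, constraintSet) {
  return Object.entries(constraintSet).every(([name, constraint]) => {
    const value = candidate[name];
    if (typeof constraint === "object" && constraint !== null) {
      return (constraint.min === undefined || value >= constraint.min) &&
             (constraint.max === undefined || value <= constraint.max);
    }
    return value === constraint;
  });
}

// Applies advanced ConstraintSets in order, skipping any set that cannot be
// satisfied together with the sets already applied.
function applyAdvanced(candidates, advanced) {
  let remaining = candidates;
  for (const constraintSet of advanced) {
    const narrowed = remaining.filter(c => satisfies(c, constraintSet));
    if (narrowed.length > 0) remaining = narrowed; // otherwise skip this set
  }
  return remaining;
}

const candidateSettings = [
  {width: 1920, height: 1080, frameRate: 30},
  {width: 1280, height: 720, frameRate: 60},
  {width: 1280, height: 720, frameRate: 30},
];
const C1 = {width: 1280};
const C2 = {height: 1080};          // satisfiable alone, but not after C1
const C3 = {frameRate: {min: 50}};  // still attempted although C2 was skipped
const selected = applyAdvanced(candidateSettings, [C1, C2, C3]);
// selected holds only the 1280x720 candidate at 60 frames per second.
```

Reversing the order of C1 and C2 in this sketch would keep the 1920x1080 candidate instead, and C3 would then be the set that is skipped.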
