Initial Author of this Specification was Ian Hickson, Google Inc., with the following copyright statement:
© Copyright 2004-2011 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA. You are granted a license to use, reproduce and create derivative works of this document.
All subsequent changes since 26 July 2011 done by the W3C WebRTC Working Group are under the following Copyright:
© 2011 W3C® (MIT, ERCIM, Keio), All Rights Reserved. Document use rules apply.
For the entire publication on the W3C site the liability and trademark rules apply.
This document defines a set of APIs that allow local media, including audio and video, to be requested from a platform.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is not complete. It is subject to major changes and, while early experimentations are encouraged, it is therefore not intended for implementation. The API is based on preliminary work done in the WHATWG. The Media Capture Task Force expects this specification to evolve significantly based on:
This document was published by the Media Capture TF as an Editor's Draft. If you wish to make comments regarding this document, please send them to public-media-capture@w3.org (subscribe, archives). All feedback is welcome.
Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This section is non-normative.
Access to multimedia streams (video, audio, or both) from local devices (video cameras, microphones, Web cams) can have a number of uses, such as real-time communication, recording, and surveillance.
This document defines the APIs used to get access to local devices (video cameras, microphones, Web cams) that can generate multimedia stream data. This document also defines the stream API by which JavaScript is able to manipulate the stream data or otherwise process it.
The MediaStream interface is used to represent streams of media data, typically (but not necessarily) of audio and/or video content, e.g. from a local camera. The data from a MediaStream object does not necessarily have a canonical binary form; for example, it could just be "the video currently coming from the user’s video camera". This allows user agents to manipulate media streams in whatever fashion is most suitable on the user’s platform.
Each MediaStream object can contain zero or more tracks, in particular audio and video tracks. All tracks in a MediaStream are intended to be synchronized when rendered. Different MediaStreams do not need to be synchronized.
Each track in a MediaStream object has a corresponding MediaStreamTrack object.
A MediaStreamTrack represents content comprising one or more channels, where the channels have a defined, well-known relationship to each other (such as a stereo or 5.1 audio signal).
A channel is the smallest unit considered in this API specification.
A MediaStream object has an input and an output. The input depends on how the object was created: a LocalMediaStream object generated by a getUserMedia() call (which is described later in this document), for instance, might take its input from the user’s local camera. The output of the object controls how the object is used, e.g. what is saved if the object is written to a file, what is displayed if the object is used in a video element.
Each track in a MediaStream object can be disabled, meaning that it is muted in the object’s output. All tracks are initially enabled.
A MediaStream can be finished, indicating that its inputs have forever stopped providing data.
The output of a MediaStream object must correspond to the tracks in its input. Muted audio tracks must be replaced with silence. Muted video tracks must be replaced with blackness.
A new MediaStream object can be created from existing MediaStreamTrack objects using the MediaStream() constructor. The constructor takes two lists of MediaStreamTrack objects as arguments; one for audio tracks and one for video tracks. The lists can either be the track lists of another stream, subsets of such lists, or compositions of MediaStreamTrack objects from different MediaStream objects.
The ability to duplicate a MediaStream, i.e. create a new MediaStream object from the track lists of an existing stream, allows for greater control since separate MediaStream instances can be manipulated and consumed individually.
The LocalMediaStream interface is used when the user agent is generating the stream’s data (e.g. from a camera or streaming it from a local video file).
When a LocalMediaStream object is being generated from a local file (as opposed to a live audio/video source), the user agent should stream the data from the file in real time, not all at once. The MediaStream object is also used in contexts outside getUserMedia, such as [WEBRTC10]. Hence, ensuring a real-time stream in both cases reduces the ease with which pages can distinguish live video from pre-recorded video, which can help protect the user’s privacy.
The MediaStream() constructor takes two arguments. The arguments are two lists with MediaStreamTrack objects which will be used to construct the audio and video track lists of the new MediaStream object. When the constructor is invoked, the UA must run the following steps:
Let audioTracks be the constructor’s first argument.
Let videoTracks be the constructor’s second argument.
Let stream be a newly constructed MediaStream object.
Set stream’s label attribute to a newly generated value.
If audioTracks is not null, then run the following sub steps for each element track in audioTracks:
If track is of any other kind than "audio", then throw a SyntaxError exception.
If track has the same underlying source as another element in stream’s audio track list, then abort these steps.
Add track to stream’s audio track list.
If videoTracks is not null, then run the following sub steps for each element track in videoTracks:
If track is of any other kind than "video", then throw a SyntaxError exception.
If track has the same underlying source as another element in stream’s video track list, then abort these steps.
Add track to stream’s video track list.
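The constructor steps above can be sketched in plain JavaScript. This is only an illustration under stated assumptions, not the user agent's implementation: makeTrack, constructStream, and fillTrackList are hypothetical helpers, tracks are modelled as plain objects with a kind and a source, and "abort these steps" is read as skipping only the current track.

```javascript
// Hypothetical stand-in for a MediaStreamTrack: just a kind and a source.
function makeTrack(kind, source) {
  return { kind: kind, source: source };
}

let labelCounter = 0; // placeholder for "a newly generated value"

// Sketch of the MediaStream() constructor steps.
function constructStream(audioTracks, videoTracks) {
  const stream = {
    label: "stream-" + (++labelCounter),
    audioTracks: [],
    videoTracks: [],
  };
  fillTrackList(stream.audioTracks, audioTracks, "audio");
  fillTrackList(stream.videoTracks, videoTracks, "video");
  return stream;
}

function fillTrackList(list, tracks, expectedKind) {
  if (tracks === null || tracks === undefined) return;
  for (const track of tracks) {
    if (track.kind !== expectedKind) {
      throw new SyntaxError("expected a track of kind " + expectedKind);
    }
    // Assumption: a duplicate source skips this track only.
    if (list.some((other) => other.source === track.source)) continue;
    list.push(track);
  }
}
```

In this sketch, passing two audio tracks that share one microphone yields a stream with a single audio track, and passing a video track in the audio list throws a SyntaxError.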
A MediaStream can have multiple audio and video sources (e.g. because the user has multiple microphones, or because the real source of the stream is a media resource with many media tracks). The stream represented by a MediaStream thus has zero or more tracks.
The tracks of a MediaStream are stored in two track lists represented by MediaStreamTrackList objects; one for audio tracks and one for video tracks. The two track lists must contain the MediaStreamTrack objects that correspond to the tracks of the stream. The relative order of all tracks in a user agent must be stable. Tracks that come from a media resource whose format defines an order must be in the order defined by the format; tracks that come from a media resource whose format does not define an order must be in the relative order in which the tracks are declared in that media resource. Within these constraints, the order is user-agent defined.
An object that reads data from the output of a MediaStream is referred to as a MediaStream consumer. The list of MediaStream consumers currently includes the media elements and PeerConnection (specified in [WEBRTC10]). MediaStream consumers must be able to handle tracks being added and removed. This behavior is specified per consumer.
A MediaStream object is said to be finished when all tracks belonging to the stream have ended. When this happens for any reason other than the stop() method being invoked, the user agent must queue a task that runs the following steps:
If the object’s ended attribute has the value true already, then abort these steps. (The stop() method was probably called just before the stream stopped for other reasons, e.g. the user clicked an in-page stop button and then the user-agent-provided stop button.)
Set the object’s ended attribute to true.
Fire a simple event named ended at the object.
If the end of the stream was reached due to a user request, the task source for this task is the user interaction task source. Otherwise the task source for this task is the networking task source.
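As a rough illustration of the "finished" steps above, the following sketch models a stream as a plain object. All names (makeStreamModel, isFinished, runFinishedTask, notifyTrackEnded) are hypothetical, the firedEvents array stands in for real event dispatch, and the task runs synchronously here rather than being queued on a task source.

```javascript
// Hypothetical model: a stream is a list of tracks plus an ended flag.
function makeStreamModel(tracks) {
  return { tracks: tracks, ended: false, firedEvents: [] };
}

// A stream is finished when all of its tracks have ended.
function isFinished(stream) {
  return stream.tracks.every((track) => track.ended);
}

// The three steps of the task described above.
function runFinishedTask(stream) {
  if (stream.ended) return;          // already ended: abort
  stream.ended = true;               // set the ended attribute
  stream.firedEvents.push("ended");  // stand-in for firing the event
}

// Called when one track stops providing data for good.
function notifyTrackEnded(stream, track) {
  track.ended = true;
  if (isFinished(stream)) runFinishedTask(stream);
}
```

The early return in runFinishedTask is what keeps the ended event from firing twice when the stream stops for two reasons in quick succession.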
[Constructor (MediaStreamTrackList? audioTracks, MediaStreamTrackList? videoTracks)]
interface MediaStream {
readonly attribute DOMString label;
readonly attribute MediaStreamTrackList audioTracks;
readonly attribute MediaStreamTrackList videoTracks;
attribute boolean ended;
attribute Function? onended;
};
audioTracks of type MediaStreamTrackList, readonly
Returns a MediaStreamTrackList object representing the audio tracks that can be enabled and disabled.
The audioTracks attribute must return an array host object for objects of type MediaStreamTrack that is fixed length and read only. The same object must be returned each time the attribute is accessed.
ended of type boolean
The MediaStream.ended attribute must return true if the MediaStream has finished, and false otherwise.
When a MediaStream object is created, its ended attribute must be set to false, unless it is being created using the MediaStream() constructor whose arguments are lists of MediaStreamTrack objects that are all ended, in which case the MediaStream object must be created with its ended attribute set to true.
label of type DOMString, readonly
When a LocalMediaStream object is created, the user agent must generate a globally unique identifier string, and must initialize the object’s label attribute to that string. Such strings must only use characters in the ranges U+0021, U+0023 to U+0027, U+002A to U+002B, U+002D to U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005E to U+007E, and must be 36 characters long.
When a MediaStream is created from another using the MediaStream() constructor, the label attribute is initialized to a newly generated value.
The label attribute must return the value to which it was initialized when the object was created.
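The character ranges and length required for a label can be checked mechanically. The sketch below is an illustration only: generateLabel and isValidLabel are hypothetical helpers, and Math.random is used purely to show the permitted alphabet; it does not by itself provide the global uniqueness that the text requires.

```javascript
// Code point ranges permitted in a label, as listed above.
const LABEL_RANGES = [
  [0x21, 0x21], [0x23, 0x27], [0x2a, 0x2b], [0x2d, 0x2e],
  [0x30, 0x39], [0x41, 0x5a], [0x5e, 0x7e],
];

// Expand the ranges into the list of permitted characters.
const LABEL_ALPHABET = LABEL_RANGES.flatMap(([low, high]) => {
  const chars = [];
  for (let cp = low; cp <= high; cp++) chars.push(String.fromCharCode(cp));
  return chars;
});

// Assumption: random choice only illustrates the alphabet, not uniqueness.
function generateLabel() {
  let label = "";
  for (let i = 0; i < 36; i++) {
    const index = Math.floor(Math.random() * LABEL_ALPHABET.length);
    label += LABEL_ALPHABET[index];
  }
  return label;
}

// True if the string is 36 characters long and uses only permitted characters.
function isValidLabel(label) {
  return label.length === 36 &&
    [...label].every((ch) => LABEL_ALPHABET.includes(ch));
}
```

Note that the permitted set excludes characters such as the double quote (U+0022), parentheses, comma, and the lowercase-adjacent brackets between U+005B and U+005D.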
onended of type Function, nullable
This event handler, of event type ended, must be supported by all objects implementing the MediaStream interface.
videoTracks of type MediaStreamTrackList, readonly
Returns a MediaStreamTrackList object representing the video tracks that can be enabled and disabled.
The videoTracks attribute must return an array host object for objects of type MediaStreamTrack that is fixed length and read only. The same object must be returned each time the attribute is accessed.
MediaStream implements EventTarget;
Before the web application can access the user’s media input devices it must let getUserMedia() create a LocalMediaStream. Once the application is done using, e.g., a webcam and a microphone, it may revoke its own access by calling stop() on the LocalMediaStream.
A web application may, once it has access to a LocalMediaStream, use the MediaStream() constructor to construct additional MediaStream objects. Since a derived MediaStream object is created from the tracks of an existing stream, it cannot use any media input devices that have not been approved by the user.
interface LocalMediaStream : MediaStream
{
void stop ();
};
stop
When a LocalMediaStream object’s stop() method is invoked, the user agent must queue a task that runs the following steps on every track:
Let track be the current MediaStreamTrack object.
End track. The track starts outputting only silence and/or blackness, as appropriate.
Dereference track’s underlying media source.
If the reference count of track’s underlying media source is greater than zero, then abort these steps.
Permanently stop the generation of data for track’s source. If the data is being generated from a live source (e.g. a microphone or camera), then the user agent should remove any active "on-air" indicator for that source. If the data is being generated from a prerecorded source (e.g. a video file), any remaining content in the file is ignored.
The task source for the tasks queued for the stop() method is the DOM manipulation task source.
Return type: void
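The per-track stop() steps amount to reference counting on the underlying media source. The following is a minimal sketch under stated assumptions: makeSource, openTrack, and stopLocalStream are hypothetical names, each track created by openTrack holds one counted reference, and an already-ended track is assumed to hold no reference.

```javascript
// Hypothetical media source with a reference count.
function makeSource(name) {
  return { name: name, refCount: 0, generating: true };
}

// Creating a track takes one reference on its source.
function openTrack(source) {
  source.refCount += 1;
  return { source: source, ended: false };
}

// Sketch of the steps run on every track when stop() is invoked.
function stopLocalStream(stream) {
  for (const track of stream.tracks) {
    if (track.ended) continue;       // assumption: nothing left to release
    track.ended = true;              // end the track
    track.source.refCount -= 1;      // dereference the source
    if (track.source.refCount > 0) continue; // still used elsewhere
    track.source.generating = false; // permanently stop the source
  }
}
```

With two streams sharing one camera, stopping the first stream leaves the camera running; only stopping the second brings the count to zero and stops data generation.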
A MediaStreamTrack object represents a media source in the user agent. Several MediaStreamTrack objects can represent the same media source, e.g., when the user chooses the same camera in the UI shown by two consecutive calls to getUserMedia().
A MediaStreamTrack object can reference its media source in two ways, either with a strong or a weak reference, depending on how the track was created. For example, a track in a MediaStream, derived from a LocalMediaStream with the MediaStream() constructor, has a weak reference to a local media source, while a track in a LocalMediaStream has a strong reference. This means that a track in a MediaStream, derived from a LocalMediaStream, will end if there is no non-ended track in a LocalMediaStream which references the same local media source.
The concept of strong and weak references to media sources allows the web application to derive new MediaStream objects from LocalMediaStream objects (created via getUserMedia()), and still be able to revoke all given permissions with LocalMediaStream.stop().
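The effect of strong and weak references can be illustrated with a small model. This is a sketch, not the user agent's mechanism: makeLocalTrack, deriveTrack, and endOrphanedTracks are hypothetical names, and the strong flag simply records whether the track belongs to a LocalMediaStream.

```javascript
// A track in a LocalMediaStream holds a strong reference to its source.
function makeLocalTrack(source) {
  return { source: source, strong: true, ended: false };
}

// A track derived via the MediaStream() constructor holds a weak reference.
function deriveTrack(track) {
  return { source: track.source, strong: false, ended: false };
}

// End every weak track whose source has no live strong track left.
function endOrphanedTracks(allTracks) {
  for (const track of allTracks) {
    if (track.strong || track.ended) continue;
    const stillHeld = allTracks.some(
      (other) => other.strong && !other.ended && other.source === track.source
    );
    if (!stillHeld) track.ended = true;
  }
}
```

In this model, ending the local track (as LocalMediaStream.stop() would) causes every derived track on the same source to end as well, which is how stopping the original stream revokes access for all derived streams.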
A MediaStreamTrack object is said to end when the user agent learns that no more data will ever be forthcoming for this track.
When a MediaStreamTrack object ends for any reason (e.g. because the user rescinds the permission for the page to use the local camera, or because the data comes from a finite file and the file’s end has been reached and the user has not requested that it be looped, or because the UA has instructed the track to end for any reason, or because the reference count of the track’s underlying media source has reached zero), it is said to be ended. When track instance track ends for any reason other than the stop() method being invoked on the LocalMediaStream object that represents track, the user agent must queue a task that runs the following steps:
If the track’s readyState attribute has the value ENDED (2) already, then abort these steps.
Set track’s readyState attribute to ENDED (2).
Fire a simple event named ended at the object.
If the end of the stream was reached due to a user request, the event source for this event is the user interaction event source.
interface MediaStreamTrack {
readonly attribute DOMString kind;
readonly attribute DOMString label;
attribute boolean enabled;
const unsigned short LIVE = 0;
const unsigned short MUTED = 1;
const unsigned short ENDED = 2;
readonly attribute unsigned short readyState;
attribute Function? onmute;
attribute Function? onunmute;
attribute Function? onended;
};
enabled of type boolean
The MediaStreamTrack.enabled attribute, on getting, must return the last value to which it was set. On setting, it must be set to the new value, and then, if the MediaStreamTrack object is still associated with a track, must enable the track if the new value is true, and disable it otherwise.
Thus, after a MediaStreamTrack is disassociated from its track, its enabled attribute still changes value when set; it just doesn’t do anything with that new value.
kind of type DOMString, readonly
The MediaStreamTrack.kind attribute must return the string "audio" if the object’s corresponding track is or was an audio track, "video" if the corresponding track is or was a video track, and a user-agent defined string otherwise.
label of type DOMString, readonly
User agents may label audio and video sources (e.g. "Internal microphone" or "External USB Webcam"). The MediaStreamTrack.label attribute must return the label of the object’s corresponding track, if any. If the corresponding track has or had no label, the attribute must instead return the empty string.
Thus the kind and label attributes do not change value, even if the MediaStreamTrack object is disassociated from its corresponding track.
onended of type Function, nullable
This event handler, of event type ended, must be supported by all objects implementing the MediaStreamTrack interface.
onmute of type Function, nullable
This event handler, of event type muted, must be supported by all objects implementing the MediaStreamTrack interface.
onunmute of type Function, nullable
This event handler, of event type unmuted, must be supported by all objects implementing the MediaStreamTrack interface.
readyState of type unsigned short, readonly
The readyState attribute represents the state of the track. It must return the value to which the user agent last set it (as defined below). It can have the following values: LIVE, MUTED or ENDED.
When a MediaStreamTrack object is created, its readyState is either LIVE (0) or MUTED (1), depending on the state of the track’s underlying media source. For example, a track in a LocalMediaStream, created with getUserMedia(), must initially have its readyState attribute set to LIVE (0).
ENDED of type unsigned short
The track has ended (the track’s underlying media source is no longer providing data, and will never provide more data for this track).
For example, a video track in a LocalMediaStream finishes if the user unplugs the USB web camera that acts as the track’s media source.
LIVE of type unsigned short
The track is active (the track’s underlying media source is making a best-effort attempt to provide data in real time).
The output of a track in the LIVE state can be switched on and off with the enabled attribute.
MUTED of type unsigned short
The track is muted (the track’s underlying media source is temporarily unable to provide data).
A MediaStreamTrack in a LocalMediaStream may be muted if the user temporarily revokes the web application’s permission to use a media input device.
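The three readyState values can be read as a small state machine in which ENDED is terminal. The sketch below is illustrative only: newTrackState, setSourceState, and producesMedia are hypothetical names, the events array stands in for event dispatch, and the pairing of the muted and unmuted events with the LIVE/MUTED transitions is an assumption based on the event handler names.

```javascript
const LIVE = 0;
const MUTED = 1;
const ENDED = 2;

// Hypothetical track model: a readyState, an enabled flag, and an event log.
function newTrackState(initialState) {
  return { readyState: initialState, enabled: true, events: [] };
}

// Move the track to a new state; ENDED can never be left.
function setSourceState(track, next) {
  if (track.readyState === ENDED || track.readyState === next) return;
  track.readyState = next;
  // Assumption: each transition fires the correspondingly named event.
  const eventName =
    next === LIVE ? "unmuted" : next === MUTED ? "muted" : "ended";
  track.events.push(eventName);
}

// A track contributes media only while it is LIVE and enabled.
function producesMedia(track) {
  return track.readyState === LIVE && track.enabled;
}
```

The separation between readyState and enabled mirrors the text: the user agent controls readyState, while the application toggles enabled on a LIVE track.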
partial interface URL {
static DOMString createObjectURL (MediaStream stream);
};
createObjectURL
Mints a Blob URL to refer to the given MediaStream.
When the createObjectURL() method is called with a MediaStream argument, the user agent must return a unique Blob URL for the given MediaStream. [FILE-API]
For audio and video streams, the data exposed on that stream must be in a format supported by the user agent for use in audio and video elements.
A Blob URL is the same as what the File API specification calls a Blob URI, except that anything in the definition of that feature that refers to File and Blob objects is hereby extended to also apply to MediaStream and LocalMediaStream objects.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
stream | MediaStream | ✘ | ✘ | |
Return type: static DOMString
A MediaStreamTrackList object’s corresponding MediaStream refers to the MediaStream object which the current MediaStreamTrackList object is a property of.
interface MediaStreamTrackList {
readonly attribute unsigned long length;
MediaStreamTrack item (unsigned long index);
void add (MediaStreamTrack track);
void remove (MediaStreamTrack track);
attribute Function? onaddtrack;
attribute Function? onremovetrack;
};
length of type unsigned long, readonly
onaddtrack of type Function, nullable
This event handler, of event type addtrack, must be supported by all objects implementing the MediaStreamTrackList interface.
onremovetrack of type Function, nullable
This event handler, of event type removetrack, must be supported by all objects implementing the MediaStreamTrackList interface.
add
Adds the given MediaStreamTrack to this MediaStreamTrackList according to the ordering rules for tracks.
When the add() method is invoked, the user agent must run the following steps:
Let track be the MediaStreamTrack argument.
Let stream be the MediaStreamTrackList object’s corresponding MediaStream object.
If stream is finished, throw an INVALID_STATE_ERR exception.
If track is already in the MediaStreamTrackList object’s internal list, then abort these steps.
Add track to the end of the MediaStreamTrackList object’s internal list.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
track | MediaStreamTrack | ✘ | ✘ | |
Return type: void
item
Returns the MediaStreamTrack object at the specified index.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
index | unsigned long | ✘ | ✘ | |
Return type: MediaStreamTrack
remove
Removes the given MediaStreamTrack from this MediaStreamTrackList.
When the remove() method is invoked, the user agent must run the following steps:
Let track be the MediaStreamTrack argument.
Let stream be the MediaStreamTrackList object’s corresponding MediaStream object.
If stream is finished, throw an INVALID_STATE_ERR exception.
If track is not in the MediaStreamTrackList object’s internal list, then abort these steps.
Remove track from the MediaStreamTrackList object’s internal list.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
track | MediaStreamTrack | ✘ | ✘ | |
Return type: void
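The add() and remove() steps can be sketched as a small class. This is a model under stated assumptions rather than the interface itself: TrackListSketch is a hypothetical name, the stream is a plain object whose ended flag stands for "finished", and a generic Error carrying the name INVALID_STATE_ERR stands in for the DOM exception.

```javascript
// Hypothetical model of a MediaStreamTrackList bound to one stream.
class TrackListSketch {
  constructor(stream) {
    this.stream = stream; // the corresponding stream
    this.items = [];      // the internal list
  }

  get length() {
    return this.items.length;
  }

  item(index) {
    return this.items[index];
  }

  add(track) {
    this.throwIfFinished();
    if (this.items.includes(track)) return; // already present: abort
    this.items.push(track);                 // append to the end
  }

  remove(track) {
    this.throwIfFinished();
    const index = this.items.indexOf(track);
    if (index === -1) return;               // not present: abort
    this.items.splice(index, 1);
  }

  throwIfFinished() {
    if (this.stream.ended) {
      const error = new Error("stream is finished");
      error.name = "INVALID_STATE_ERR";     // stand-in for the DOM exception
      throw error;
    }
  }
}
```

Both methods are idempotent in this sketch: adding a track twice or removing an absent track leaves the list unchanged, and either call on a finished stream throws.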
dictionary MediaStreamConstraints {
(DOMString or dictionary MediaTrackConstraints) video;
(DOMString or dictionary MediaTrackConstraints) audio;
};
MediaStreamConstraints Members
audio of type (DOMString or dictionary MediaTrackConstraints)
video of type (DOMString or dictionary MediaTrackConstraints)
dictionary MediaTrackConstraints {
dictionary MediaTrackConstraintSet? mandatory;
sequence<MediaTrackConstraint>? optional;
};
MediaTrackConstraints Members
mandatory of type dictionary MediaTrackConstraintSet, nullable
optional of type sequence<MediaTrackConstraint>, nullable
A MediaTrackConstraintSet is a dictionary containing one or more key-value pairs, where each key must be a valid registered constraint name in the IANA-hosted RTCWeb Media Constraints registry [RTCWEB-CONSTRAINTS] and its value should be as defined in the associated reference[s] given in the registry.
A MediaTrackConstraint is a dictionary containing exactly one key-value pair, where the key must be a valid registered constraint name in the IANA-hosted RTCWeb Media Constraints registry [RTCWEB-CONSTRAINTS] and the value should be as defined in the associated reference[s] given in the registry.
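The shape of these dictionaries can be illustrated with a small validator. Everything here is a sketch: the constraint names minWidth, minHeight, and maxFramerate are hypothetical placeholders for whatever the registry actually defines, and the helper functions are not part of the API.

```javascript
// Hypothetical constraint names; the real ones are defined by the registry.
const REGISTERED_NAMES = new Set(["minWidth", "minHeight", "maxFramerate"]);

function hasOnlyRegisteredKeys(obj) {
  return Object.keys(obj).every((key) => REGISTERED_NAMES.has(key));
}

// MediaTrackConstraintSet: one or more registered key-value pairs.
function isConstraintSet(obj) {
  return Object.keys(obj).length >= 1 && hasOnlyRegisteredKeys(obj);
}

// MediaTrackConstraint: exactly one registered key-value pair.
function isConstraint(obj) {
  return Object.keys(obj).length === 1 && hasOnlyRegisteredKeys(obj);
}

// MediaTrackConstraints: nullable mandatory set plus nullable optional list.
function isTrackConstraints(constraints) {
  const mandatoryOk =
    constraints.mandatory == null || isConstraintSet(constraints.mandatory);
  const optionalOk =
    constraints.optional == null ||
    (Array.isArray(constraints.optional) &&
      constraints.optional.every((entry) => isConstraint(entry)));
  return mandatoryOk && optionalOk;
}

// Example value: two mandatory constraints and an ordered optional list.
const videoConstraints = {
  mandatory: { minWidth: 640, minHeight: 480 },
  optional: [{ maxFramerate: 30 }, { minWidth: 1280 }],
};
```

The optional member is a sequence of single-entry dictionaries rather than one dictionary, which lets the list carry an order that a plain dictionary would not express.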
TBD - adjust MediaTrackConstraints to also allow true/false values.
This sample code exposes a button. When clicked, the button is disabled and the user is prompted to offer a stream. The user can cause the button to be re-enabled by providing a stream (e.g. giving the page access to the local camera) and then disabling the stream (e.g. revoking that access).
<input type="button" value="Start" onclick="start()" id="startBtn">
<script>
 var startBtn = document.getElementById('startBtn');
 function start() {
   navigator.getUserMedia({audio:true, video:true}, gotStream);
   startBtn.disabled = true;
 }
 function gotStream(stream) {
   stream.onended = function () {
     startBtn.disabled = false;
   };
 }
</script>
This example allows people to take photos of themselves from the local video camera.
<article>
 <style scoped>
  video { transform: scaleX(-1); }
  p { text-align: center; }
 </style>
 <h1>Snapshot Kiosk</h1>
 <section id="splash">
  <p id="errorMessage">Loading...</p>
 </section>
 <section id="app" hidden>
  <p><video id="monitor" autoplay></video> <canvas id="photo"></canvas>
  <p><input type=button value="📷" onclick="snapshot()">
 </section>
 <script>
  navigator.getUserMedia({video:true}, gotStream, noStream);
  var video = document.getElementById('monitor');
  var canvas = document.getElementById('photo');
  function gotStream(stream) {
    video.src = URL.createObjectURL(stream);
    video.onerror = function () {
      stream.stop();
    };
    stream.onended = noStream;
    video.onloadedmetadata = function () {
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      document.getElementById('splash').hidden = true;
      document.getElementById('app').hidden = false;
    };
  }
  function noStream() {
    document.getElementById('errorMessage').textContent = 'No camera available.';
  }
  function snapshot() {
    canvas.getContext('2d').drawImage(video, 0, 0);
  }
 </script>
</article>
This section is non-normative.
IANA is requested to register the following constraints as specified in [RTCWEB-CONSTRAINTS]:
This section will be removed before publication.
No informative references.