Initial Author of this Specification was Ian Hickson, Google Inc., with the following copyright statement:
© Copyright 2004-2011 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA. You are granted a license to use, reproduce and create derivative works of this document.
All subsequent changes since 26 July 2011 done by the W3C WebRTC Working Group are under the following Copyright:
© 2011 W3C® (MIT, ERCIM, Keio), All Rights Reserved. Document use rules apply.
For the entire publication on the W3C site the liability and trademark rules apply.
This document defines a set of APIs to represent streaming media, including audio and video, in JavaScript, to allow media to be sent over the network to another browser or device implementing the appropriate set of real-time protocols, and media received from another browser or device to be processed and displayed locally. This specification is being developed in conjunction with a protocol specification developed by the IETF RTCWEB group and an API specification to get access to local media devices developed by the Media Capture Task Force.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is not complete. It is subject to major changes and, while early experimentations are encouraged, it is therefore not intended for implementation. The API is based on preliminary work done in the WHATWG. The Web Real-Time Communications Working Group expects this specification to evolve significantly based on:
As the specification matures, the group hopes to strike the right balance between a low-level API that would enable interested parties to tweak potentially complex system parameters, and a more high-level API that Web developers can use without a priori technical knowledge about real-time communications.
This document was published by the Web Real-Time Communications Working Group as an Editor's Draft. If you wish to make comments regarding this document, please send them to public-webrtc@w3.org (subscribe, archives). All feedback is welcome.
Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words must, must not, required, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC2119].
This section is non-normative.
There are a number of facets to video-conferencing in HTML covered by this specification:
video or audio elements.
This document defines the APIs used for these features. This specification is being developed in conjunction with a protocol specification developed by the IETF RTCWEB group and an API specification to get access to local media devices developed by the Media Capture Task Force.
The MediaStream interface is used to represent streams of media data, typically (but not necessarily) of audio and/or video content, e.g. from a local camera or a remote site. The data from a MediaStream object does not necessarily have a canonical binary form; for example, it could just be "the video currently coming from the user's video camera". This allows user agents to manipulate media streams in whatever fashion is most suitable on the user's platform.
Each MediaStream object can represent zero or more tracks, in particular audio and video tracks. Tracks can contain multiple channels of parallel data; for example a single audio track could have nine channels of audio data to represent a 7.2 surround sound audio track.
Each track represented by a MediaStream object has a corresponding MediaStreamTrack object.
A MediaStream object has an input and an output. The input depends on how the object was created: a LocalMediaStream object generated by a getUserMedia() call, for instance, might take its input from the user's local camera, while a MediaStream created by a PeerConnection object will take as input the data received from a remote peer. The output of the object controls how the object is used, e.g. what is saved if the object is written to a file, what is displayed if the object is used in a video element, or indeed what is transmitted to a remote peer if the object is used with a PeerConnection object.
Each track in a MediaStream object can be disabled, meaning that it is muted in the object's output. All tracks are initially enabled.
A MediaStream can be finished, indicating that its inputs have forever stopped providing data.
The output of a MediaStream object must correspond to the tracks in its input. Muted audio tracks must be replaced with silence. Muted video tracks must be replaced with blackness.
A new MediaStream object can be created from existing MediaStreamTrack objects using the MediaStream() constructor. The constructor takes two lists of MediaStreamTrack objects as arguments; one for audio tracks and one for video tracks. The lists can either be the track lists of another stream, subsets of such lists, or compositions of MediaStreamTrack objects from different MediaStream objects.
The ability to duplicate a MediaStream, i.e. create a new MediaStream object from the track lists of an existing stream, allows for greater control since separate MediaStream instances can be manipulated and consumed individually. This can be used, for instance, in a video-conferencing scenario to display the local video from the user's camera and microphone in a local monitor, while only transmitting the audio to the remote peer (e.g. in response to the user using a "video mute" feature). Combining tracks from different MediaStream objects into a new MediaStream makes it possible to, e.g., record selected tracks from a conversation involving several MediaStream objects with a single MediaStreamRecorder.
The LocalMediaStream interface is used when the user agent is generating the stream's data (e.g. from a camera or streaming it from a local video file).
When a LocalMediaStream object is being generated from a local file (as opposed to a live audio/video source), the user agent should stream the data from the file in real time, not all at once. This reduces the ease with which pages can distinguish live video from pre-recorded video, which can help protect the user's privacy.
The MediaStream() constructor takes two arguments. The arguments are two lists with MediaStreamTrack objects which will be used to construct the audio and video track lists of the new MediaStream object. When the constructor is invoked, the UA must run the following steps:
Let audioTracks be the constructor's first argument.
Let videoTracks be the constructor's second argument.
Let stream be a newly constructed
object.MediaStream
Set stream's label attribute to a newly generated value.
If audioTracks is not null, then run the following sub steps for each element track in audioTracks:
If track is of any other kind than "audio", then throw a SyntaxError exception.
If track has the same underlying source as another element in stream's audio track list, then abort these steps.
Add track to stream's audio track list.
If videoTracks is not null, then run the following sub steps for each element track in videoTracks:
If track is of any other kind than "video", then throw a SyntaxError exception.
If track has the same underlying source as another element in stream's video track list, then abort these steps.
Add track to stream's video track list.
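This example is non-normative. The constructor steps above can be sketched with plain objects, as below. The names makeTrack, constructMediaStream and the source field are hypothetical stand-ins for the host objects, the label is only a placeholder for "a newly generated value", and the sketch assumes that "abort these steps" ends the sub steps for the current track only.

```javascript
// Hypothetical stand-in for a MediaStreamTrack; `source` models the
// track's underlying source.
function makeTrack(kind, source) {
  return { kind: kind, source: source };
}

var labelCounter = 0; // placeholder for "a newly generated value"

function constructMediaStream(audioTracks, videoTracks) {
  var stream = {
    label: 'stream-' + (++labelCounter),
    audioTracks: [],
    videoTracks: []
  };

  function addAll(tracks, kind, list) {
    if (tracks === null || tracks === undefined) return; // null list: skip
    tracks.forEach(function (track) {
      if (track.kind !== kind) {
        throw new SyntaxError('expected a track of kind "' + kind + '"');
      }
      // Assumption: a track whose source is already present is skipped.
      var duplicate = list.some(function (t) {
        return t.source === track.source;
      });
      if (!duplicate) list.push(track);
    });
  }

  addAll(audioTracks, 'audio', stream.audioTracks);
  addAll(videoTracks, 'video', stream.videoTracks);
  return stream;
}
```

Passing the same microphone twice yields a single audio track, and passing a video track in the audio list throws.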
A MediaStream can have multiple audio and video sources (e.g. because the user has multiple microphones, or because the real source of the stream is a media resource with many media tracks). The stream represented by a MediaStream thus has zero or more tracks.
The tracks of a MediaStream are stored in two arrays; one for audio tracks and one for video tracks. The two arrays must contain the MediaStreamTrack objects that correspond to the tracks of the stream. The relative order of all tracks in a user agent must be stable. Tracks that come from a media resource whose format defines an order must be in the order defined by the format; tracks that come from a media resource whose format does not define an order must be in the relative order in which the tracks are declared in that media resource. Within these constraints, the order is user-agent defined.
A MediaStream object is said to be finished when all tracks belonging to the stream have ended. When this happens for any reason other than the stop() method being invoked, the user agent must queue a task that runs the following steps:
If the object's ended attribute has the value true already, then abort these steps. (The stop() method was probably called just before the stream stopped for other reasons, e.g. the user clicked an in-page stop button and then the user-agent-provided stop button.)
Set the object's ended attribute to true.
Fire a simple event named ended at the object.
If the end of the stream was reached due to a user request, the task source for this task is the user interaction task source. Otherwise the task source for this task is the networking task source.
[Constructor (MediaStreamTrackList? audioTracks, MediaStreamTrackList? videoTracks)]
interface MediaStream {
readonly attribute DOMString label;
readonly attribute MediaStreamTrackList audioTracks;
readonly attribute MediaStreamTrackList videoTracks;
MediaStreamRecorder record ();
attribute boolean ended;
attribute Function? onended;
};
audioTracks of type MediaStreamTrackList, readonly
Returns a MediaStreamTrackList object representing the audio tracks that can be enabled and disabled.
The audioTracks attribute must return an array host object for objects of type MediaStreamTrack that is fixed length and read only. The same object must be returned each time the attribute is accessed. [WEBIDL]
ended of type boolean
The MediaStream.ended attribute must return true if the MediaStream has finished, and false otherwise.
When a MediaStream object is created, its ended attribute must be set to false, unless it is being created using the MediaStream() constructor whose arguments are lists of MediaStreamTrack objects that are all ended, in which case the MediaStream object must be created with its ended attribute set to true.
label of type DOMString, readonly
Returns a label that is unique to this stream, so that streams can be recognized after they are sent through the PeerConnection API.
When a LocalMediaStream object is created, the user agent must generate a globally unique identifier string, and must initialize the object's label attribute to that string. Such strings must only use characters in the ranges U+0021, U+0023 to U+0027, U+002A to U+002B, U+002D to U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005E to U+007E, and must be 36 characters long.
When a MediaStream is created to represent a stream obtained from a remote peer, the label attribute is initialized from information provided by the remote source.
When a MediaStream is created from another using the MediaStream() constructor, the label attribute is initialized to a newly generated value.
The label attribute must return the value to which it was initialized when the object was created.
The label of a MediaStream object is unique to the source of the stream, but that does not mean it is not possible to end up with duplicates. For example, a locally generated stream could be sent from one user to a remote peer using PeerConnection, and then sent back to the original user in the same manner, in which case the original user will have multiple streams with the same label (the locally-generated one and the one received from the remote peer).
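This example is non-normative. The character and length rules for the label of a LocalMediaStream can be checked with a single regular expression, as sketched below. The names LABEL_PATTERN, isValidLabel and generateLabel are hypothetical, and the use of a version 4 UUID as the "globally unique identifier string" is an assumption; this specification does not mandate any particular generation scheme. The sketch uses the Node.js crypto module so that it can run outside a browser.

```javascript
// Node.js module; named nodeCrypto to avoid clashing with any global `crypto`.
var nodeCrypto = require('crypto');

// One entry per permitted range: U+0021, U+0023-0027, U+002A-002B,
// U+002D-002E, U+0030-0039, U+0041-005A, U+005E-007E; exactly 36 characters.
var LABEL_PATTERN = /^[!#-'*+\-.0-9A-Z^-~]{36}$/;

function isValidLabel(label) {
  return LABEL_PATTERN.test(label);
}

// Assumption: a version 4 UUID (lowercase hex digits and hyphens, 36
// characters) is one acceptable way to produce a conforming label.
function generateLabel() {
  return nodeCrypto.randomUUID();
}
```

A UUID satisfies the rules because hyphens (U+002D), digits and lowercase letters all fall inside the permitted ranges, while characters such as the space (U+0020) and the quotation mark (U+0022) do not.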
onended of type Function, nullable
This event handler, of type ended, must be supported by all objects implementing the MediaStream interface.
videoTracks of type MediaStreamTrackList, readonly
Returns a MediaStreamTrackList object representing the video tracks that can be enabled and disabled.
The videoTracks attribute must return an array host object for objects of type MediaStreamTrack that is fixed length and read only. The same object must be returned each time the attribute is accessed. [WEBIDL]
record
Begins recording the stream. The returned MediaStreamRecorder object provides access to the recorded data.
When the record() method is invoked, the user agent must return a new MediaStreamRecorder object associated with the stream.
Return type: MediaStreamRecorder
MediaStream implements EventTarget;
All instances of the MediaStream type are defined to also implement the EventTarget interface.
Before the web application can access the user's media input devices it must let getUserMedia() create a LocalMediaStream. Once the application is done using, e.g., a webcam and a microphone, it may revoke its own access by calling stop() on the LocalMediaStream.
A web application may, once it has access to a LocalMediaStream, use the MediaStream() constructor to construct additional MediaStream objects. Since a derived MediaStream object is created from the tracks of an existing stream, it cannot use any media input devices that haven't been approved by the user.
interface LocalMediaStream : MediaStream
{
void stop ();
};
stop
When a LocalMediaStream object's stop() method is invoked, the user agent must queue a task that runs the following steps on every track:
Let track be the current MediaStreamTrack object.
End track. The track starts outputting only silence and/or blackness, as appropriate.
Dereference track's underlying media source.
If the reference count of track's underlying media source is greater than zero, then abort these steps.
Permanently stop the generation of data for track's source. If the data is being generated from a live source (e.g. a microphone or camera), then the user agent should remove any active "on-air" indicator for that source. If the data is being generated from a prerecorded source (e.g. a video file), any remaining content in the file is ignored.
The task source for the tasks queued for the stop() method is the DOM manipulation task source.
Return type: void
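This example is non-normative. The per-track steps run by stop() amount to reference counting on the underlying media source, which the sketch below models with plain objects. The names makeSource, makeTrack, stopStream and the refCount and generating fields are hypothetical, and the task queue is omitted; the sketch only illustrates that a source keeps producing data until the last track that references it has ended.

```javascript
// Hypothetical model: a source counts the tracks that still reference it.
function makeSource(name) {
  return { name: name, refCount: 0, generating: true };
}

function makeTrack(source) {
  source.refCount += 1;
  return { source: source, ended: false };
}

// Mirrors the steps run on every track when stop() is invoked.
function stopStream(stream) {
  stream.tracks.forEach(function (track) {
    if (track.ended) return;               // assumption: ending twice is a no-op
    track.ended = true;                    // End track.
    track.source.refCount -= 1;            // Dereference the underlying source.
    if (track.source.refCount > 0) return; // Another track still uses the source.
    track.source.generating = false;       // Permanently stop generating data.
  });
}
```

If two streams obtained from consecutive getUserMedia() calls share one camera, stopping the first leaves the camera running; stopping the second shuts it down.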
A MediaStreamTrack object represents a media source in the user agent. Several MediaStreamTrack objects can represent the same media source, e.g., when the user chooses the same camera in the UI shown by two consecutive calls to getUserMedia().
A MediaStreamTrack object can reference its media source in two ways, either with a strong or a weak reference, depending on how the track was created. For example, a track in a MediaStream, derived from a LocalMediaStream with the MediaStream() constructor, has a weak reference to a local media source, while a track in a LocalMediaStream has a strong reference. This means that a track in a MediaStream, derived from a LocalMediaStream, will end if there is no non-ended track in a LocalMediaStream which references the same local media source. A reference to a non-local media source as, e.g., an RTP source, is always strong.
The concept with strong and weak references to media sources allows the web application to derive new MediaStream objects from LocalMediaStream objects (created via getUserMedia()), and still be able to revoke all given permissions with LocalMediaStream.stop().
A MediaStreamTrack object is said to end when the user agent learns that no more data will ever be forthcoming for this track.
When a MediaStreamTrack object ends for any reason (e.g. because the user rescinds the permission for the page to use the local camera, or because the data comes from a finite file and the file's end has been reached and the user has not requested that it be looped, or because the track belongs to a MediaStream that comes from a remote peer and the remote peer has permanently stopped sending data, or because the UA has instructed the track to end for any reason, or because the reference count of the track's underlying media source has reached zero), it is said to be ended.
When track instance track ends for any reason other than the stop() method being invoked on the LocalMediaStream object that represents track, the user agent must queue a task that runs the following steps:
If the track's readyState attribute has the value ENDED (2) already, then abort these steps.
Set track's readyState attribute to ENDED (2).
Fire a simple event named ended at the object.
typedef MediaStreamTrack[] MediaStreamTrackList;
The identifier MediaStreamTrackList refers to an array of the MediaStreamTrack type.
interface MediaStreamTrack {
readonly attribute DOMString kind;
readonly attribute DOMString label;
attribute boolean enabled;
const unsigned short LIVE = 0;
const unsigned short MUTED = 1;
const unsigned short ENDED = 2;
readonly attribute unsigned short readyState;
attribute Function? onmute;
attribute Function? onunmute;
attribute Function? onended;
};
enabled of type boolean
The MediaStreamTrack.enabled attribute, on getting, must return the last value to which it was set. On setting, it must be set to the new value, and then, if the MediaStreamTrack object is still associated with a track, must enable the track if the new value is true, and disable it otherwise.
Thus, after a MediaStreamTrack is disassociated from its track, its enabled attribute still changes value when set, it just doesn't do anything with that new value.
kind of type DOMString, readonly
The MediaStreamTrack.kind attribute must return the string "audio" if the object's corresponding track is or was an audio track, "video" if the corresponding track is or was a video track, and a user-agent defined string otherwise.
label of type DOMString, readonly
User agents may label audio and video sources (e.g. "Internal microphone" or "External USB Webcam"). The MediaStreamTrack.label attribute must return the label of the object's corresponding track, if any. If the corresponding track has or had no label, the attribute must instead return the empty string.
Thus the kind and label attributes do not change value, even if the MediaStreamTrack object is disassociated from its corresponding track.
onended of type Function, nullable
This event handler, of type ended, must be supported by all objects implementing the MediaStreamTrack interface.
onmute of type Function, nullable
This event handler, of type mute, must be supported by all objects implementing the MediaStreamTrack interface.
onunmute of type Function, nullable
This event handler, of type unmute, must be supported by all objects implementing the MediaStreamTrack interface.
of type unsigned short, readonlyThe readyState
attribute represents the
state of the track. It must return the value to which the user agent last set it
(as defined below). It can have the following values: LIVE, MUTED or
ENDED.
When a
object is created, its MediaStreamTrack
readyState
is either
LIVE
(0) or
MUTED
(1), depending on the state
of the track's underlying media source. For example, a track in a
, created with
LocalMediaStream
getUserMedia()
, must initially have its readyState
attribute set to LIVE
(1), while a track in a
, received with a MediaStream
, must have its PeerConnection
readyState
attribute set to
MUTED
(1) until media
data arrives.
ENDED of type unsigned short
The track has ended (the track's underlying media source is no longer providing data, and will never provide more data for this track).
For example, a video track in a LocalMediaStream finishes if the user unplugs the USB web camera that acts as the track's media source.
LIVE of type unsigned short
The track is active (the track's underlying media source is making a best-effort attempt to provide data in real time).
The output of a track in the LIVE state can be switched on and off with the enabled attribute.
MUTED of type unsigned short
The track is muted (the track's underlying media source is temporarily unable to provide data).
For example, a track is muted on the B-side if the A-side disables the corresponding MediaStreamTrack in the MediaStream that is being sent. A MediaStreamTrack in a LocalMediaStream may be muted if the user temporarily revokes the web application's permission to use a media input device.
When the addstream event triggers on a PeerConnection, all MediaStreamTrack objects in the resulting MediaStream are muted until media data can be read from the RTP source.
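This example is non-normative. The three readyState values form a small state machine, sketched below with plain objects. The names makeTrack, setSourceAvailable and endTrack are hypothetical, and firing mute and unmute at the moment the state changes is an assumption made for illustration; the text above only requires that the corresponding event handlers be supported.

```javascript
var LIVE = 0, MUTED = 1, ENDED = 2; // values from the MediaStreamTrack interface

// Hypothetical track model; `events` records the events that would be fired.
function makeTrack(initialState) {
  return { readyState: initialState, events: [] };
}

// Called when the underlying source starts or stops delivering data.
function setSourceAvailable(track, available) {
  if (track.readyState === ENDED) return; // an ended track never recovers
  var next = available ? LIVE : MUTED;
  if (next === track.readyState) return;  // no state change, no event
  track.readyState = next;
  track.events.push(available ? 'unmute' : 'mute');
}

// Mirrors the steps queued when a track ends.
function endTrack(track) {
  if (track.readyState === ENDED) return; // already ENDED (2): abort
  track.readyState = ENDED;
  track.events.push('ended');
}
```

A track received through a PeerConnection would start as makeTrack(MUTED) and become LIVE once media data arrives; a track created by getUserMedia() would start as makeTrack(LIVE).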
interface MediaStreamRecorder {
void getRecordedData (BlobCallback? callback);
};
getRecordedData
Creates a Blob of the recorded data, and invokes the provided callback with that Blob.
When the getRecordedData() method is called, the user agent must run the following steps:
Let callback be the callback indicated by the method's first argument.
If callback is null, abort these steps.
Let data be the data that was streamed by the MediaStream object from which the MediaStreamRecorder was created since the creation of the MediaStreamRecorder object.
Return, and run the remaining steps asynchronously.
Generate a file containing data in a format supported by the user agent for use in audio and video elements.
Let blob be a Blob object representing the contents of the file generated in the previous step. [FILE-API]
Queue a task to invoke callback with blob as its argument.
The getRecordedData() method can be called multiple times on one MediaStreamRecorder object; each time, it will create a new file as if this was the first time the method was being called. In particular, the method does not stop or reset the recording when the method is called.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
callback | BlobCallback | ✔ | ✘ | |
Return type: void
[Callback, NoInterfaceObject]
interface BlobCallback {
void handleEvent (Blob blob);
};
partial interface URL {
static DOMString createObjectURL (MediaStream stream);
};
createObjectURL
Mints a Blob URL to refer to the given MediaStream.
When the createObjectURL() method is called with a MediaStream argument, the user agent must return a unique Blob URL for the given MediaStream. [FILE-API]
For audio and video streams, the data exposed on that stream must be in a format supported by the user agent for use in audio and video elements.
A Blob URL is the same as what the File API specification calls a Blob URI, except that anything in the definition of that feature that refers to File and Blob objects is hereby extended to also apply to MediaStream and LocalMediaStream objects.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
stream | MediaStream | ✘ | ✘ | |
Return type: static DOMString
This sample code exposes a button. When clicked, the button is disabled and the user is prompted to offer a stream. The user can cause the button to be re-enabled by providing a stream (e.g. giving the page access to the local camera) and then disabling the stream (e.g. revoking that access).
<input type="button" value="Start" onclick="start()" id="startBtn">
<script>
 var startBtn = document.getElementById('startBtn');
 function start() {
   navigator.getUserMedia({audio:true, video:true}, gotStream);
   startBtn.disabled = true;
 }
 function gotStream(stream) {
   stream.onended = function () {
     startBtn.disabled = false;
   }
 }
</script>
This example allows people to record a short audio message and upload it to the server. This example even shows rudimentary error handling.
<input type="button" value="⚫" onclick="msgRecord()" id="recBtn">
<input type="button" value="◼" onclick="msgStop()" id="stopBtn" disabled>
<p id="status">To start recording, press the ⚫ button.</p>
<script>
 var recBtn = document.getElementById('recBtn');
 var stopBtn = document.getElementById('stopBtn');
 function report(s) {
   document.getElementById('status').textContent = s;
 }
 function msgRecord() {
   report('Attempting to access microphone...');
   navigator.getUserMedia({audio:true}, gotStream, noStream);
   recBtn.disabled = true;
 }
 var msgStream, msgStreamRecorder;
 function gotStream(stream) {
   report('Recording... To stop, press the ◼ button.');
   msgStream = stream;
   msgStreamRecorder = stream.record();
   stopBtn.disabled = false;
   stream.onended = function () {
     msgStop();
   }
 }
 function msgStop() {
   report('Creating file...');
   stopBtn.disabled = true;
   msgStream.onended = null;
   msgStream.stop();
   msgStreamRecorder.getRecordedData(msgSave);
 }
 function msgSave(blob) {
   report('Uploading file...');
   var x = new XMLHttpRequest();
   x.open('POST', 'uploadMessage');
   x.send(blob);
   x.onload = function () {
     report('Done! To record a new message, press the ⚫ button.');
     recBtn.disabled = false;
   };
   x.onerror = function () {
     report('Failed to upload message. To try recording a message again, press the ⚫ button.');
     recBtn.disabled = false;
   };
 }
 function noStream() {
   report('Could not obtain access to your microphone. To try again, press the ⚫ button.');
   recBtn.disabled = false;
 }
</script>
This example allows people to take photos of themselves from the local video camera.
<article>
 <style scoped>
  video { transform: scaleX(-1); }
  p { text-align: center; }
 </style>
 <h1>Snapshot Kiosk</h1>
 <section id="splash">
  <p id="errorMessage">Loading...</p>
 </section>
 <section id="app" hidden>
  <p><video id="monitor" autoplay></video> <canvas id="photo"></canvas>
  <p><input type=button value="📷" onclick="snapshot()">
 </section>
 <script>
  navigator.getUserMedia({video:true}, gotStream, noStream);
  var video = document.getElementById('monitor');
  var canvas = document.getElementById('photo');
  function gotStream(stream) {
    video.src = URL.createObjectURL(stream);
    video.onerror = function () {
      stream.stop();
    };
    stream.onended = noStream;
    video.onloadedmetadata = function () {
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      document.getElementById('splash').hidden = true;
      document.getElementById('app').hidden = false;
    };
  }
  function noStream() {
    document.getElementById('errorMessage').textContent = 'No camera available.';
  }
  function snapshot() {
    canvas.getContext('2d').drawImage(video, 0, 0);
  }
 </script>
</article>
A PeerConnection allows two users to communicate directly, browser-to-browser. Communications are coordinated via a signaling channel provided by script in the page via the server, e.g. using XMLHttpRequest.
Calling new PeerConnection(configuration, signalingCallback) creates a PeerConnection object.
The configuration string gives the address of a STUN or TURN server to use to establish the connection. [STUN] [TURN]
The allowed formats for this string are:
"TYPE 203.0.113.2:3478"
Indicates a specific IP address and port for the server.
"TYPE relay.example.net:3478"
Indicates a specific host and port for the server; the user agent will look up the IP address in DNS.
"TYPE example.net"
Indicates a specific domain for the server; the user agent will look up the IP address and port in DNS.
The "TYPE" is one of:
STUN
STUNS
TURN
TURNS
The signalingCallback argument is a method that will be invoked when the user agent needs to send a message to the other host over the signaling channel. When the callback is invoked, convey its first argument (a string) to the other peer using whatever method is being used by the Web application to relay signaling messages. (Messages returned from the other peer are provided back to the user agent using the processSignalingMessage() method.)
A PeerConnection object has an associated PeerConnection signaling callback, a PeerConnection ICE Agent, a PeerConnection readiness state and an SDP Agent. These are initialized when the object is created.
When the PeerConnection() constructor is invoked, the user agent must run the following steps. This algorithm has a synchronous section (which is triggered as part of the event loop algorithm). Steps in the synchronous section are marked with ⌛.
Let serverConfiguration be the constructor's first argument.
Let signalingCallback be the constructor's second argument.
Let connection be a newly created PeerConnection object.
Create an ICE Agent and let connection's PeerConnection ICE Agent be that ICE Agent. [ICE]
If serverConfiguration contains a U+000A LINE FEED (LF) character or a U+000D CARRIAGE RETURN (CR) character (or both), remove all characters from serverConfiguration after the first such character.
Split serverConfiguration on spaces to obtain configuration components.
If configuration components has two or more components, and the first component is a case-sensitive match for one of the following strings:
"STUN"
"STUNS"
"TURN"
"TURNS"
...then run the following substeps:
Let server type be STUN if the first component of configuration components is "STUN" or "STUNS", and TURN otherwise (the first component of configuration components is "TURN" or "TURNS").
Let secure be true if the first component of configuration components is "STUNS" or "TURNS", and false otherwise.
Let host be the contents of the second component of configuration components up to the character before the first U+003A COLON character (:), if any, or the entire string otherwise.
Let port be the contents of the second component of configuration components from the character after the first U+003A COLON character (:) up to the end, if any, or the empty string otherwise.
Configure the PeerConnection ICE Agent's STUN or TURN server as follows:
If the given IP address, host name, domain name, or port are invalid, then the user agent must act as if no STUN or TURN server is configured.
Let the connection's PeerConnection signaling callback be signalingCallback.
Set connection's PeerConnection readiness state to NEW (0).
Set connection's PeerConnection ice state to NEW (0).
Set connection's PeerConnection sdp state to NEW (0).
Let connection's localStreams attribute be an empty read-only MediaStream array. [WEBIDL]
Let connection's remoteStreams attribute be an empty read-only MediaStream array. [WEBIDL]
Return connection, but continue these steps asynchronously.
Await a stable state. The synchronous section consists of the remaining steps of this algorithm. (Steps in synchronous sections are marked with ⌛.)
⌛ If the ice state is set to NEW, it must queue a task to start gathering ICE addresses and set the ice state to ICEGATHERING.
⌛ Once the ICE address gathering is complete, if there are any streams in localStreams, the SDP Agent will send the initial SDP offer. The initial SDP offer must contain both the ICE candidate information as well as the SDP to represent the media descriptions for all the streams in localStreams.
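This example is non-normative. The configuration parsing steps of the constructor algorithm above can be sketched as follows. The function name parseServerConfiguration and the shape of the returned object are hypothetical, returning null is an assumption standing in for "no STUN or TURN server is configured", and validation of the host and port is not modeled.

```javascript
var SERVER_TYPES = ['STUN', 'STUNS', 'TURN', 'TURNS'];

function parseServerConfiguration(serverConfiguration) {
  // Keep only the text before the first LF or CR character. (The sketch
  // also drops that character itself, which does not affect the result.)
  var firstLine = serverConfiguration.split(/[\n\r]/)[0];
  // Split on spaces to obtain the configuration components.
  var components = firstLine.trim().split(/\s+/);
  if (components.length < 2 || SERVER_TYPES.indexOf(components[0]) === -1) {
    return null; // assumption: null stands for "no server configured"
  }
  var type = components[0]; // case-sensitive match already checked
  var address = components[1];
  var colon = address.indexOf(':');
  return {
    serverType: type.slice(0, 4), // 'STUN' or 'TURN'
    secure: type === 'STUNS' || type === 'TURNS',
    host: colon === -1 ? address : address.slice(0, colon),
    port: colon === -1 ? '' : address.slice(colon + 1)
  };
}
```

For instance, "STUN 203.0.113.2:3478" yields the host "203.0.113.2" and the port "3478", while "TURNS example.net" yields a secure TURN server with an empty port, leaving the port to be found through DNS.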
During the lifetime of the peerConnection object, the following procedures are followed:
If a local media stream has been added and an SDP offer needs to be sent, and the ICE state is not NEW or ICEGATHERING, and the SDP Agent state is NEW or SDPIDLE, then send and queue a task to send an SDP offer and change the SPD state to SDP Waiting.
If an SDP offer has been received, and the SDP state is NEW or SDPIDLE, pass the ICE candidates from the SDP offer to the ICE Agent and change it state to ICECHECKING. Construct an appropriate SDP answer, update the remote streams, queue a task to send the SDP offer, and set the SDPAgent state to SDPIDLE.
At the point the sdpState changes from NEW to some other state, the readyState changes to NEGOTIATING.
If the ICE Agent finds a candidates that froms a valid connection, the ICE state is changed to ICECONNECTED
If the ICE Agent finishes checking all candidates, if a connection has been found, the ice state is changed to ICECOMPLETED and if not connection has been found it is changed to ICEFAILED.
If the iceState is ICECONNECTED or ICECOMPLETED and the SDP stat is SDPIDLE, the readyState is set to ACTIVE.
If the iceState is ICEFAILED, a task is queued to calls the close method.
The close method will cause the system to wait until the sdpStat is SDPIDLE then it will send an SDP offer terminating all media and change the readyState to CLOSING as well as stop all ICE process and change the iceState to ICE_CLOSED. Once an SDP anser to this offer is received, the readyState will be changed to CLOSED.
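The relationship between the three states described above can be sketched as a small pure function. This is only an illustration: the helper names nextReadyState and shouldQueueClose are hypothetical and not part of the API, and the rule that CLOSING and CLOSED are never left by this function is an assumption made for the sketch.

```javascript
// Constants copied from the PeerConnection interface in this document.
const NEW = 0, NEGOTIATING = 1, ACTIVE = 2, CLOSED = 3, CLOSING = 4;
const ICE_CONNECTED = 0x400, ICE_COMPLETED = 0x500, ICE_FAILED = 0x600;
const SDP_IDLE = 0x1000, SDP_WAITING = 0x2000;

// Hypothetical helper: derive the next readyState from the three states.
function nextReadyState(readyState, iceState, sdpState) {
  // Assumption: CLOSING and CLOSED are only changed by the close procedure.
  if (readyState === CLOSING || readyState === CLOSED) return readyState;
  const iceUp = iceState === ICE_CONNECTED || iceState === ICE_COMPLETED;
  // ICE has a working connection and the SDP Agent is idle: ACTIVE.
  if (iceUp && sdpState === SDP_IDLE) return ACTIVE;
  // The sdpState has left NEW: NEGOTIATING.
  if (readyState === NEW && sdpState !== NEW) return NEGOTIATING;
  return readyState;
}

// Hypothetical helper: does this ICE state require a task that calls close()?
function shouldQueueClose(iceState) {
  return iceState === ICE_FAILED;
}
```

For example, nextReadyState(NEW, NEW, SDP_WAITING) yields NEGOTIATING, because the sdpState has moved away from NEW while ICE has not yet connected.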
User agents may negotiate any codec and any resolution, bitrate, or other quality metric. User agents are encouraged to initially negotiate for the native resolution of the stream. For streams that are then rendered (using a video element), user agents are encouraged to renegotiate for a resolution that matches the rendered display size.
Starting with the native resolution means that if the Web application notifies its peer of the native resolution as it starts sending data, and the peer prepares its video element accordingly, there will be no need for a renegotiation once the stream is flowing.
All SDP media descriptions for streams represented by MediaStream objects must include a label attribute ("a=label:") whose value is the value of the MediaStream object's label attribute. [SDP] [SDPLABEL]
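As an illustration of the label requirement, the following sketch scans an SDP string and reports the "a=label:" value of each media description. The helper name mediaLabels and the example SDP fragment (ports, payload types, label value) are made up for this example; a real SDP parser would need to handle far more of the grammar.

```javascript
// Hypothetical helper: return the a=label value of each media description
// in an SDP string, or null for a description without a label attribute.
function mediaLabels(sdp) {
  const labels = [];
  let inMedia = false;
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      labels.push(null); // a new media description starts here
      inMedia = true;
    } else if (inMedia && line.startsWith("a=label:") &&
               labels[labels.length - 1] === null) {
      // keep the first label found for the current media description
      labels[labels.length - 1] = line.slice("a=label:".length);
    }
  }
  return labels;
}

// Example SDP fragment; all values are invented for illustration.
const exampleSdp = [
  "v=0",
  "m=audio 49170 RTP/AVP 0",
  "a=label:stream-1",
  "m=video 51372 RTP/AVP 31",
].join("\r\n");
```

Here the audio description carries a label and the video description does not, so a check built on this helper would flag the second media description.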
PeerConnections must not generate any candidates for media streams whose media descriptions do not have a label attribute ("a=label:"). [ICE] [SDP] [SDPLABEL]
When a user agent starts receiving media for a component and a candidate was provided for that component by a PeerConnection, the user agent must follow these steps:
Let connection be the PeerConnection expecting this media.
If there is already a MediaStream object for the media stream to which this component belongs, then associate the component with that media stream and abort these steps. (Some media streams have multiple components; this API does not expose the role of these individual components in ICE.)
Create a MediaStream object to represent the media stream. Set its label attribute to the value of the SDP Label attribute for that component's media stream.
Queue a task to run the following substeps:
If the connection's PeerConnection readiness state is CLOSED (3), abort these steps.
Add the newly created MediaStream object to the end of connection's remoteStreams array.
Fire a stream event named addstream with the newly created MediaStream object at the connection object.
When a PeerConnection finds that a stream from the remote peer has been removed (its port has been set to zero in a media description sent on the signaling channel), the user agent must follow these steps:
Let connection be the PeerConnection associated with the stream being removed.
Let stream be the MediaStream object that represents the media stream being removed, if any. If there isn't one, then abort these steps.
By definition, stream is now finished.
A task is thus queued to update stream and fire an event.
Queue a task to run the following substeps:
If the connection's PeerConnection readiness state is CLOSED (3), abort these steps.
Remove stream from connection's remoteStreams array.
Fire a stream event named removestream with stream at the connection object.
The task source for the tasks listed in this section is the networking task source.
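The queued substeps for remote stream addition and removal can be sketched against a plain object. Everything here is a stand-in: makeConnection, remoteStreamAdded and remoteStreamRemoved are hypothetical names, and the fired array merely records which event would be fired rather than dispatching a real event.

```javascript
const CLOSED = 3; // readyState value from the PeerConnection interface

// Hypothetical stand-in for a PeerConnection: only the fields the substeps use.
function makeConnection() {
  return { readyState: 0, remoteStreams: [], fired: [] };
}

// Substeps queued when media for a new remote stream starts arriving.
function remoteStreamAdded(connection, stream) {
  if (connection.readyState === CLOSED) return; // abort these steps
  connection.remoteStreams.push(stream);        // add to the end of the array
  connection.fired.push({ type: "addstream", stream: stream });
}

// Substeps queued when the remote peer has set the stream's port to zero.
function remoteStreamRemoved(connection, stream) {
  if (connection.readyState === CLOSED) return; // abort these steps
  const index = connection.remoteStreams.indexOf(stream);
  if (index !== -1) connection.remoteStreams.splice(index, 1);
  connection.fired.push({ type: "removestream", stream: stream });
}
```

In both cases the array is updated before the event is recorded, matching the order of the substeps, and a connection whose readiness state is CLOSED is left untouched.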
To prevent network sniffing from allowing a fourth party to establish a connection to a peer using the information sent out-of-band to the other peer and thus spoofing the client, the configuration information should always be transmitted using an encrypted connection.
[Constructor (DOMString configuration, SignalingCallback signalingCallback)]
interface PeerConnection {
    void processSignalingMessage (DOMString message);
    const unsigned short NEW = 0;
    const unsigned short NEGOTIATING = 1;
    const unsigned short ACTIVE = 2;
    const unsigned short CLOSING = 4;
    const unsigned short CLOSED = 3;
    readonly attribute unsigned short readyState;
    const unsigned short ICE_GATHERING = 0x100;
    const unsigned short ICE_WAITING = 0x200;
    const unsigned short ICE_CHECKING = 0x300;
    const unsigned short ICE_CONNECTED = 0x400;
    const unsigned short ICE_COMPLETED = 0x500;
    const unsigned short ICE_FAILED = 0x600;
    const unsigned short ICE_CLOSED = 0x700;
    readonly attribute unsigned short iceState;
    const unsigned short SDP_IDLE = 0x1000;
    const unsigned short SDP_WAITING = 0x2000;
    const unsigned short SDP_GLARE = 0x3000;
    readonly attribute unsigned short sdpState;
    void addStream (MediaStream stream, MediaStreamHints hints);
    void removeStream (MediaStream stream);
    readonly attribute MediaStream[] localStreams;
    readonly attribute MediaStream[] remoteStreams;
    void close ();
    attribute Function? onconnecting;
    attribute Function? onopen;
    attribute Function? onstatechange;
    attribute Function? onaddstream;
    attribute Function? onremovestream;
};
iceState of type unsigned short, readonly
The iceState attribute must return the state of the PeerConnection ICE Agent, represented by a number from the following list:
PeerConnection.NEW (0)
PeerConnection.ICE_GATHERING (0x100)
PeerConnection.ICE_WAITING (0x200)
PeerConnection.ICE_CHECKING (0x300)
PeerConnection.ICE_CONNECTED (0x400)
PeerConnection.ICE_COMPLETED (0x500)
PeerConnection.ICE_FAILED (0x600)
PeerConnection.ICE_CLOSED (0x700)
localStreams of type array of MediaStream, readonly
Returns a live array containing the streams that the user agent is currently attempting to transmit to the remote peer (those that were added with addStream()).
Specifically, it must return the read-only MediaStream array that the attribute was set to when the PeerConnection's constructor ran.
onaddstream of type Function, nullable
This event handler, of event handler event type addstream, must be supported by all objects implementing the PeerConnection interface.
onconnecting of type Function, nullable
This event handler, of event handler event type connecting, must be supported by all objects implementing the PeerConnection interface.
onopen of type Function, nullable
This event handler, of event handler event type open, must be supported by all objects implementing the PeerConnection interface.
onremovestream of type Function, nullable
This event handler, of event handler event type removestream, must be supported by all objects implementing the PeerConnection interface.
onstatechange of type Function, nullable
This event handler, of event handler event type statechange, must be supported by all objects implementing the PeerConnection interface. It is called any time the readyState, iceState, or sdpState changes.
readyState of type unsigned short, readonly
The readyState attribute must return the PeerConnection object's PeerConnection readiness state, represented by a number from the following list:
PeerConnection.NEW (0)
PeerConnection.NEGOTIATING (1)
PeerConnection.ACTIVE (2)
PeerConnection.CLOSING (4)
The PeerConnection object is terminating all media and is in the process of closing the ICE Agent and SDP Agent.
PeerConnection.CLOSED (3)
remoteStreams of type array of MediaStream, readonly
Returns a live array containing the streams that the user agent is currently receiving from the remote peer.
Specifically, it must return the read-only MediaStream array that the attribute was set to when the PeerConnection's constructor ran.
This array is updated when addstream and removestream events are fired.
sdpState of type unsigned short, readonly
The sdpState attribute must return the state of the PeerConnection SDP Agent, represented by a number from the following list:
PeerConnection.NEW (0)
PeerConnection.SDP_IDLE (0x1000)
PeerConnection.SDP_WAITING (0x2000)
PeerConnection.SDP_GLARE (0x3000)
addStream
Attempts to start sending the given stream to the remote peer. The format for the MediaStreamHints objects is currently undefined by the specification.
When the other peer starts sending a stream in this manner, an addstream event is fired at the PeerConnection object.
When the addStream() method is invoked, the user agent must run the following steps:
Let stream be the method's first argument.
Let hints be the method's second argument.
If the PeerConnection object's PeerConnection readiness state is CLOSED (3), throw an INVALID_STATE_ERR exception.
If stream is already in the PeerConnection object's localStreams object, then abort these steps.
Add stream to the end of the PeerConnection object's localStreams object.
Return from the method.
Parse the hints provided by the application and apply them to the MediaStream, if possible.
Have the PeerConnection add a media stream for stream the next time the user agent provides a stable state. Any other pending stream additions and removals must be processed at the same time.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
stream | MediaStream | ✘ | ✘ | |
hints | MediaStreamHints | ✘ | ✘ | |
Return type: void
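The addStream() steps can be sketched with a mock object. MockPeerConnection and its pending list are hypothetical; the pending list only records the work that the steps defer until the next stable state, and a plain Error stands in for the INVALID_STATE_ERR exception.

```javascript
const CLOSED = 3; // readyState value from the PeerConnection interface

// Hypothetical mock that follows the addStream() steps listed above.
class MockPeerConnection {
  constructor() {
    this.readyState = 0;    // NEW
    this.localStreams = [];
    this.pending = [];      // stream changes awaiting the next stable state
  }

  addStream(stream, hints) {
    if (this.readyState === CLOSED) {
      // stand-in for the INVALID_STATE_ERR exception
      throw new Error("INVALID_STATE_ERR");
    }
    if (this.localStreams.includes(stream)) return; // already in localStreams
    this.localStreams.push(stream);
    // The remaining steps run after the method returns; here they are only recorded.
    this.pending.push({ op: "add", stream: stream, hints: hints });
  }
}
```

Calling addStream() twice with the same stream leaves a single entry in localStreams and a single pending addition, which mirrors the "abort these steps" branch.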
close
When the close() method is invoked, the user agent must run the following steps:
If the PeerConnection object's PeerConnection readiness state is CLOSED (3), throw an INVALID_STATE_ERR exception.
Destroy the PeerConnection ICE Agent, abruptly ending any active ICE processing and any active streaming, and releasing any relevant resources (e.g. TURN permissions).
Set the object's PeerConnection readiness state to CLOSED (3).
The localStreams and remoteStreams objects remain in the state they were in when the object was closed.
Return type: void
processSignalingMessage
When a message relayed from the remote peer over the signaling channel is received by the Web application, pass it to the user agent by calling the processSignalingMessage() method.
The order of messages is important. Passing messages to the user agent in a different order than they were generated by the remote peer's user agent can prevent a successful connection from being established or degrade the connection's quality if one is established.
When the processSignalingMessage() method is invoked, the user agent must run the following steps:
Let message be the method's argument.
Let connection be the PeerConnection object on which the method was invoked.
If connection's PeerConnection readiness state is CLOSED (3), throw an INVALID_STATE_ERR exception.
If the first four characters of message are not "SDP" followed by a U+000A LINE FEED (LF) character, then abort these steps. (This indicates an error in the signaling channel implementation. User agents may report such errors to their developer consoles to aid debugging.)
Future extensions to the PeerConnection interface might use other prefix values to implement additional features.
Let sdp be the string consisting of all but the first four characters of message.
Pass the sdp to the PeerConnection SDP Agent as a subsequent offer or answer, to be interpreted as appropriate given the current state of the SDP Agent. [ICE]
When a PeerConnection ICE Agent forms a connection to the far side and enters the state ICE_CONNECTED, the user agent must queue a task that sets the PeerConnection object's PeerConnection readiness state to ACTIVE (2) and then fires a simple event named open at the PeerConnection object.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
message | DOMString | ✘ | ✘ | |
Return type: void
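The prefix check in the processSignalingMessage() steps can be illustrated with a small function. The name extractSdp is hypothetical, and returning null stands in for "abort these steps".

```javascript
// Hypothetical helper mirroring the prefix check: returns the SDP payload,
// or null when the message does not start with "SDP" followed by LF.
function extractSdp(message) {
  const prefix = "SDP\n"; // three letters plus U+000A, four characters in total
  if (!message.startsWith(prefix)) return null;
  // all but the first four characters of the message
  return message.slice(prefix.length);
}
```

A signaling channel implementation could use a check like this on its own side to log malformed messages before handing them to the user agent.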
removeStream
Stops sending the given stream to the remote peer.
When the other peer stops sending a stream in this manner, a removestream event is fired at the PeerConnection object.
When the removeStream() method is invoked, the user agent must run the following steps:
Let stream be the method's argument.
If the PeerConnection object's PeerConnection readiness state is CLOSED (3), throw an INVALID_STATE_ERR exception.
If stream is not in the PeerConnection object's localStreams object, then abort these steps.
Remove stream from the PeerConnection object's localStreams object.
Return from the method.
Have the PeerConnection remove the media stream for stream the next time the user agent provides a stable state. Any other pending stream additions and removals must be processed at the same time.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
stream | MediaStream | ✘ | ✘ | |
Return type: void
ACTIVE of type unsigned short
CLOSED of type unsigned short
The close() method has been invoked.
CLOSING of type unsigned short
The close() method has been invoked.
ICE_CHECKING of type unsigned short
ICE_CLOSED of type unsigned short
ICE_COMPLETED of type unsigned short
ICE_CONNECTED of type unsigned short
ICE_FAILED of type unsigned short
ICE_GATHERING of type unsigned short
ICE_WAITING of type unsigned short
NEGOTIATING of type unsigned short
NEW of type unsigned short
SDP_GLARE of type unsigned short
SDP_IDLE of type unsigned short
SDP_WAITING of type unsigned short
PeerConnection implements EventTarget;
All instances of the PeerConnection type are defined to also implement the EventTarget interface.
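Because the three state attributes return bare numbers, a lookup table can make logs easier to read. The sketch below copies the values from the interface definition; the table names and the describeStates helper are hypothetical and exist only for this illustration.

```javascript
// Values copied from the PeerConnection interface definition.
const READY_STATES = { 0: "NEW", 1: "NEGOTIATING", 2: "ACTIVE", 3: "CLOSED", 4: "CLOSING" };
const ICE_STATES = {
  0: "NEW",
  0x100: "ICE_GATHERING",
  0x200: "ICE_WAITING",
  0x300: "ICE_CHECKING",
  0x400: "ICE_CONNECTED",
  0x500: "ICE_COMPLETED",
  0x600: "ICE_FAILED",
  0x700: "ICE_CLOSED",
};
const SDP_STATES = { 0: "NEW", 0x1000: "SDP_IDLE", 0x2000: "SDP_WAITING", 0x3000: "SDP_GLARE" };

// Hypothetical helper: readable summary of an object exposing the three attributes.
function describeStates(pc) {
  return [
    READY_STATES[pc.readyState],
    ICE_STATES[pc.iceState],
    SDP_STATES[pc.sdpState],
  ].join(" / ");
}
```

Note that CLOSING is 4 and CLOSED is 3 in the interface, so the numeric order of the readiness values does not follow the order in which the states are reached.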
[Callback, NoInterfaceObject]
interface SignalingCallback {
    void handleEvent (DOMString message, PeerConnection source);
};
handleEvent
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
message | DOMString | ✘ | ✘ | |
source | PeerConnection | ✘ | ✘ | |
Return type: void
When two peers decide they are going to set up a connection to each other, they both go through these steps. The STUN/TURN server configuration describes a server they can use to get things like their public IP address or to set up NAT traversal. They also have to send data for the signaling channel to each other using the same out-of-band mechanism they used to establish that they were going to communicate in the first place.
// the first argument describes the STUN/TURN server configuration
var local = new PeerConnection('TURNS example.net', sendSignalingChannel);
local.processSignalingMessage(...); // if we have a message from the other side, pass it along here

// (aLocalStream is some LocalMediaStream object)
local.addStream(aLocalStream); // start sending video

function sendSignalingChannel(message) {
  ... // send message to the other side via the signaling channel
}

function receiveSignalingChannel (message) {
  // call this whenever we get a message on the signaling channel
  local.processSignalingMessage(message);
}

local.onaddstream = function (event) {
  // (videoElement is some <video> element)
  videoElement.src = URL.createObjectURL(event.stream);
};
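The out-of-band mechanism itself is left to the application. As a rough sketch of one property it needs, namely delivering messages in the order they were generated, the following in-memory channel queues messages and hands them to the receiver first in, first out. makeSignalingChannel and its methods are hypothetical; a real channel would run over a network transport.

```javascript
// Hypothetical in-memory signaling channel: delivers messages to the other
// side strictly in the order they were sent.
function makeSignalingChannel() {
  const queue = [];
  let receiver = null;
  return {
    // called by the sending side, e.g. from its SignalingCallback
    send(message) { queue.push(message); },
    // registers the function that passes messages to processSignalingMessage()
    onmessage(handler) { receiver = handler; },
    // delivers everything queued so far, oldest message first
    flush() {
      while (queue.length > 0 && receiver) receiver(queue.shift());
    },
  };
}
```

Messages sent before a receiver is registered stay queued until flush() is called with a receiver in place, so nothing is dropped or reordered.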
Although progress is being made, there is currently not enough agreement on the data channel to write it up. This section will be filled in as rough consensus is reached.
A Window object has a strong reference to any PeerConnection objects created from the constructor whose global object is that Window object.
The addstream and removestream events use the MediaStreamEvent interface:
Firing a stream event named e with a MediaStream stream means that an event with the name e, which does not bubble (except where otherwise stated) and is not cancelable (except where otherwise stated), and which uses the MediaStreamEvent interface with the stream attribute set to stream, must be created and dispatched at the given target.
[Constructor(DOMString type, optional MediaStreamEventInit eventInitDict)]
interface MediaStreamEvent : Event {
    readonly attribute MediaStream? stream;
};
dictionary MediaStreamEventInit : EventInit {
    MediaStream? stream;
};
stream of type MediaStream, readonly, nullable
The stream attribute represents the MediaStream object associated with the event.
MediaStreamEventInit Members
stream of type MediaStream, nullable
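Firing a stream event can be sketched on top of the standard Event and EventTarget classes. MediaStreamEventSketch and fireStreamEvent are hypothetical names used only here; the sketch assumes an environment where Event and EventTarget are available as globals, and it uses a plain writable property where the interface specifies a readonly attribute.

```javascript
// Hypothetical stand-in for MediaStreamEvent, built on the standard Event class.
class MediaStreamEventSketch extends Event {
  constructor(type, eventInitDict = {}) {
    super(type, eventInitDict);
    // stream is nullable, so default to null when no stream is supplied
    this.stream = eventInitDict.stream === undefined ? null : eventInitDict.stream;
  }
}

// Sketch of "fire a stream event named name with stream at target".
function fireStreamEvent(target, name, stream) {
  const event = new MediaStreamEventSketch(name, {
    bubbles: false,    // does not bubble
    cancelable: false, // is not cancelable
    stream: stream,
  });
  target.dispatchEvent(event);
}
```

A listener registered for "addstream" on the target then receives an event whose stream property is the stream passed to fireStreamEvent.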
This section is non-normative.
The following event fires on MediaStream objects:
Event name | Interface | Fired when... |
---|---|---|
ended | Event | The MediaStream finished as a result of all tracks in the MediaStream ending. |
The following events fire on MediaStreamTrack objects:
Event name | Interface | Fired when... |
---|---|---|
muted | Event | The MediaStreamTrack object's source is temporarily unable to provide data. |
unmuted | Event | The MediaStreamTrack object's source is live again after having been temporarily unable to provide data. |
ended | Event | The MediaStreamTrack object's source will no longer provide any data, either because the user revoked the permissions, or because the source device has been ejected, or because the remote peer stopped sending data, or because the stop() method was invoked. |
The following events fire on PeerConnection objects:
Event name | Interface | Fired when... |
---|---|---|
connecting | Event | The ICE Agent has begun negotiating with the peer. This can happen multiple times during the lifetime of the PeerConnection object. |
open | Event | The ICE Agent has finished negotiating with the peer. |
message | MessageEvent | A data UDP media stream message was received. |
addstream | MediaStreamEvent | A new stream has been added to the remoteStreams array. |
removestream | MediaStreamEvent | A stream has been removed from the remoteStreams array. |
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
This format is used for encoding UDP packets transmitted by potentially hostile Web page content via a trusted user agent to a destination selected by a potentially hostile remote server. To prevent this mechanism from being abused for cross-protocol attacks, all the data in these packets is masked so as to appear to be random noise. The intent of this masking is to reduce the potential attack scenarios to those already possible previously.
However, this feature still allows random data to be sent to destinations that might not normally have been able to receive them, such as to hosts within the victim's intranet. If a service within such an intranet cannot handle receiving UDP packets containing random noise, it might be vulnerable to attack from this feature.
Fragment identifiers cannot be used with application/html-peer-connection-data as URLs cannot be used to identify streams that use this format.
This section will be removed before publication.
Need a way to indicate the type of the SDP when passing SDP strings.
The editors wish to thank the Working Group chairs, Harald Alvestrand and Stefan Håkansson, for their support.
No informative references.