Initial Author of this Specification was Ian Hickson, Google Inc., with the following copyright statement:
© Copyright 2004-2011 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA. You are granted a license to use, reproduce and create derivative works of this document.
All subsequent changes since 26 July 2011 done by the W3C WebRTC Working Group are under the following Copyright:
© 2011 W3C® (MIT, ERCIM, Keio), All Rights Reserved. Document use rules apply.
For the entire publication on the W3C site the liability and trademark rules apply.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document was published by the Web Real-Time Communications Working Group as an Editor's Draft. If you wish to make comments regarding this document, please send them to public-webrtc@w3.org (subscribe, archives). All feedback is welcome.
Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This section is non-normative.
There are a number of facets to video-conferencing in HTML:
video or audio elements.
This document defines the APIs used for these features.
A voice chat feature in a game could attempt to get access to the user's microphone by calling the API as follows:
<script>
  navigator.getUserMedia('audio', gotAudio);
  function gotAudio(stream) {
    // ... use 'stream' ...
  }
</script>
A video-conferencing system would ask for both audio and video:
<script>
  function beginCall() {
    navigator.getUserMedia('audio,video user', gotStream);
  }
  function gotStream(stream) {
    // ... use 'stream' ...
  }
</script>
The MediaStream
interface is used to represent
streams of media data, typically (but not necessarily) of audio
and/or video content, e.g. from a local camera or a remote site. The
data from a MediaStream
object does not necessarily
have a canonical binary form; for example, it could just be "the
video currently coming from the user's video camera". This allows
user agents to manipulate media streams in whatever fashion is most
suitable on the user's platform.
Each MediaStream
object can represent zero or more
tracks, in particular audio and video tracks. Tracks can contain
multiple channels of parallel data; for example a single audio track
could have nine channels of audio data to represent a 7.2 surround
sound audio track.
Each track represented by a MediaStream
object has a
corresponding MediaStreamTrack
object.
A MediaStream
object has an input and an output. The
input depends on how the object was created: a
LocalMediaStream
object generated by a getUserMedia()
call, for
instance, might take its input from the user's local camera, while a
MediaStream
created by a PeerConnection
object will take as input the data received from a remote peer. The
output of the object controls how the object is used, e.g. what is
saved if the object is written to a file, what is displayed if the
object is used in a video
element, or indeed what is
transmitted to a remote peer if the object is used with a
PeerConnection
object.
Each track in a MediaStream
object can be disabled,
meaning that it is muted in the object's output. All tracks are
initially enabled.
A MediaStream
can be finished, indicating that its
inputs have forever stopped providing data. When a
MediaStream
object is finished, all its tracks are
muted regardless of whether they are enabled or disabled.
The output of a MediaStream
object must correspond
to the tracks in its input. Muted audio tracks must be replaced with
silence. Muted video tracks must be replaced with blackness.
A MediaStream
object's output can be "forked" by
creating a new MediaStream
object from it using the
MediaStream()
constructor. The
new MediaStream
object's input is the output of the
object from which it was created, with any disabled tracks removed,
and its output is therefore at most a subset of that "parent"
object. (Merely muted tracks are not removed, so the tracks do not
change when the parent is finished.) When such a fork's parent
finishes, the fork is also said to have finished.
This can be used, for instance, in a video-conferencing scenario to display the local video from the user's camera and microphone in a local monitor, while only transmitting the audio to the remote peer (e.g. in response to the user using a "video mute" feature).
When a track in a MediaStream
parent is disabled, any MediaStreamTrack
objects corresponding to the tracks in any MediaStream
objects that were created from parent are
disassociated from any track, and must not be reused for tracks
again. If a disabled track in a MediaStream
parent is re-enabled, from the perspective of any
MediaStream
objects that were created from parent it is a new track and thus new
MediaStreamTrack
objects must be created for the tracks
that correspond to the re-enabled track.
The LocalMediaStream
interface is used when the user
agent is generating the stream's data (e.g. from a camera or
streaming it from a local video file). It allows authors to control
individual tracks during the generation of the content, e.g. to
allow the user to temporarily disable a local camera during a
video-conference chat.
When a LocalMediaStream
object is being generated
from a local file (as opposed to a live audio/video source), the
user agent should stream the data from the file in real time, not
all at once. This reduces the ease with which pages can distinguish
live video from pre-recorded video, which can help protect the
user's privacy.
The MediaStream(parentStream)
constructor must return a
new MediaStream
object whose tracks at any moment in
time are the enabled tracks of parentStream at
that moment, and whose label
is equal to the parentStream's.
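As a rough illustration of this rule, the following sketch models a forked stream whose track list is computed from its parent on demand. The ToyTrack and ForkableStream classes are hypothetical stand-ins written for this example, not the real interfaces. For brevity the fork shares track objects with its parent, whereas the text above calls for separate MediaStreamTrack objects per stream.

```javascript
// Toy model only: ToyTrack and ForkableStream are hypothetical names.
class ToyTrack {
  constructor(kind) {
    this.kind = kind;
    this.enabled = true; // all tracks are initially enabled
  }
}

class ForkableStream {
  // `source` is either a parent ForkableStream or a plain { label, tracks } object.
  constructor(source) {
    this.parent = source instanceof ForkableStream ? source : null;
    this.label = source.label; // a fork takes its parent's label
    this.ownTracks = this.parent ? null : source.tracks;
  }

  // For a fork, the tracks are whichever parent tracks are enabled right now.
  get tracks() {
    if (this.parent === null) return this.ownTracks.slice();
    return this.parent.tracks.filter((track) => track.enabled);
  }
}

const microphone = new ToyTrack("audio");
const camera = new ToyTrack("video");
const local = new ForkableStream({ label: "local-stream", tracks: [microphone, camera] });
const fork = new ForkableStream(local);

const before = fork.tracks.length; // both parent tracks are enabled
camera.enabled = false;            // disable the video track on the parent
const after = fork.tracks.length;  // the fork now exposes only the audio track
console.log(before, after, fork.label);
```

Computing the list on access is one simple way to get the "at any moment in time" behaviour; a real user agent would more likely update the fork when the parent changes.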
A MediaStream
object is said to end when the
user agent learns that no more data will ever be forthcoming for
this stream.
When a MediaStream
object ends for any reason (e.g.
because the user rescinds the permission for the page to use the
local camera, or because the data comes from a finite file and the
file's end has been reached and the user has not requested that it
be looped, or because the stream comes from a remote peer and the
remote peer has permanently stopped sending data, or because the
MediaStream
was created from another
MediaStream
and that stream has just itself ended), it
is said to be finished.
When this happens for any reason other than the stop()
method being invoked, the
user agent must queue a task that runs the following
steps:
If the object's readyState
attribute has the
value ENDED
(2) already, then
abort these steps. (The stop()
method was probably called just before the stream stopped for other
reasons, e.g. the user clicked an in-page stop button and then the
user-agent-provided stop button.)
Set the object's readyState
attribute to ENDED
(2).
Fire a simple event named ended
at the object.
As soon as a MediaStream
object is finished, the stream's tracks
start outputting only silence and/or blackness, as appropriate, as defined earlier.
If the end of the stream was reached due to a user request, the task source for this task is the user interaction task source. Otherwise the task source for this task is the networking task source.
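The steps above can be sketched with a toy model in which an explicit array stands in for the user agent's task queue. EndableStream, sourceFinished() and runTasks() are hypothetical names introduced for this illustration only.

```javascript
const LIVE = 1;
const ENDED = 2;
const taskQueue = []; // hypothetical stand-in for the event loop's task queue

class EndableStream {
  constructor() {
    this.readyState = LIVE;
    this.onended = null;
  }

  // Called when the input stops for any reason other than stop().
  sourceFinished() {
    taskQueue.push(() => {
      if (this.readyState === ENDED) return; // step 1: already ended, abort
      this.readyState = ENDED;               // step 2
      if (typeof this.onended === "function") this.onended(); // step 3
    });
  }
}

function runTasks() {
  while (taskQueue.length > 0) taskQueue.shift()();
}

const stream = new EndableStream();
let endedCount = 0;
stream.onended = () => { endedCount += 1; };
stream.sourceFinished();
stream.sourceFinished(); // a second notification is absorbed by step 1
runTasks();
console.log(stream.readyState, endedCount);
```

Because the state check happens inside the queued task rather than when the task is queued, the ended event fires at most once even if several end conditions are detected before the queue is drained.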
[Constructor (in MediaStream parentStream)]
interface MediaStream {
readonly attribute DOMString label;
readonly attribute MediaStreamTrackList tracks;
MediaStreamRecorder record ();
const unsigned short LIVE = 1;
const unsigned short ENDED = 2;
readonly attribute unsigned short readyState;
attribute Function? onended;
};
label
of type DOMString, readonly
PeerConnection
API.
onended
of type Function, nullable
This event handler, of type ended, must be supported by all objects implementing the MediaStream
interface.
readyState
of type unsigned short, readonly
The readyState
attribute represents the state of the stream. It must return the
value to which the user agent last set it (as defined below). It can
have the following values: LIVE or ENDED.
When a MediaStream
object is created, its readyState
attribute must
be set to LIVE
(1), unless
it is being created using the MediaStream()
constructor whose
argument is a MediaStream
object whose readyState
attribute has
the value ENDED
(2), in
which case the MediaStream
object must be created with
its readyState
attribute set to ENDED
(2).
tracks
of type MediaStreamTrackList, readonlyReturns a MediaStreamTrackList
object representing
the tracks that can be enabled and disabled.
A MediaStream
can have multiple audio and video
sources (e.g. because the user has multiple microphones, or because
the real source of the stream is a media resource with
many media tracks). The stream represented by a
MediaStream
thus has zero or more tracks.
The tracks
attribute must return an array host
object for objects of type MediaStreamTrack
that is
fixed length and read only. The same object must be
returned each time the attribute is accessed. [WEBIDL]
The array must contain the MediaStreamTrack
objects that
correspond to the tracks of the stream. The relative order of
all tracks in a user agent must be stable. All audio tracks must
precede all video tracks. Tracks that come from a media
resource whose format defines an order must be in the order
defined by the format; tracks that come from a media
resource whose format does not define an order must be in the
relative order in which the tracks are declared in that media
resource. Within these constraints, the order is user-agent
defined.
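One way to satisfy the "audio before video" constraint while keeping the declared order otherwise intact is a stable sort on track kind. This is only a sketch: orderTracks is a hypothetical helper, the track objects are plain stand-ins, and placing user-agent defined kinds alongside video is an assumption, since the text only orders audio relative to video.

```javascript
// Sketch: place audio tracks first while preserving the existing relative order.
function orderTracks(tracks) {
  // Assumption: kinds other than "audio" sort together with video.
  const rank = (track) => (track.kind === "audio" ? 0 : 1);
  // Array.prototype.sort is stable (ES2019), so ties keep their input order.
  return tracks.slice().sort((a, b) => rank(a) - rank(b));
}

const declared = [
  { kind: "video", label: "camera A" },
  { kind: "audio", label: "mic A" },
  { kind: "video", label: "camera B" },
  { kind: "audio", label: "mic B" },
];
const ordered = orderTracks(declared).map((track) => track.label);
console.log(ordered.join(", "));
```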
record
Begins recording the stream. The returned
MediaStreamRecorder
object provides access to the
recorded data.
When the record()
method is
invoked, the user agent must return a new
MediaStreamRecorder
object associated with the stream.
Return type: MediaStreamRecorder
ENDED
of type unsigned short
LIVE
of type unsigned short
MediaStream implements EventTarget;
All instances of the MediaStream
type are defined to also implement the EventTarget interface.
interface LocalMediaStream : MediaStream
{
void stop ();
};
stop
When a LocalMediaStream
object's stop()
method is
invoked, the user agent must queue a task that runs the
following steps:
If the object's readyState
attribute is
in the ENDED
(2) state,
then abort these steps.
Permanently stop the generation of data for the stream. If the data is being generated from a live source (e.g. a microphone or camera), and no other stream is being generated from a live source, then the user agent should remove any active "on-air" indicator. If the data is being generated from a prerecorded source (e.g. a video file), any remaining content in the file is ignored. The stream is finished. The stream's tracks start outputting only silence and/or blackness, as appropriate, as defined earlier.
Set the object's readyState
attribute to
ENDED
(2).
Fire a simple event named ended
at the object.
The task source for the tasks queued for the stop()
method is the DOM
manipulation task source.
Return type: void
typedef MediaStreamTrack[] MediaStreamTrackList;
MediaStreamTrack
type.
interface MediaStreamTrack {
readonly attribute DOMString kind;
readonly attribute DOMString label;
attribute boolean enabled;
};
enabled
of type boolean
The MediaStreamTrack.enabled
attribute, on getting, must return the last value to which it was
set. On setting, it must be set to the new value, and then, if the
MediaStreamTrack
object is still associated with a track,
must enable the track if the new value is true, and disable it
otherwise.
Thus, after a MediaStreamTrack
is
disassociated from its track, its enabled
attribute still
changes value when set, it just doesn't do anything with that new
value.
kind
of type DOMString, readonly
The MediaStreamTrack.kind
attribute must return the string "audio" if
the object's corresponding track is or was an audio track, "video"
if the corresponding track is or was a video
track, and a user-agent defined string otherwise.
label
of type DOMString, readonly
When a LocalMediaStream
object is created, the user
agent must generate a globally unique identifier string, and must
initialize the object's label
attribute to that string. Such strings must only use characters in
the ranges U+0021, U+0023 to U+0027, U+002A to U+002B, U+002D to
U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005E to U+007E, and
must be 36 characters long.
When a MediaStream
is created to represent a stream
obtained from a remote peer, the label
attribute is initialized from
information provided by the remote source.
When a MediaStream
is created from another using the
MediaStream()
constructor, the
label
attribute is
initialized from the original.
The label
attribute must return the value to which it was initialized when the
object was created.
The label of a MediaStream
object is
unique to the source of the stream, but that does not mean it is not
possible to end up with duplicates. For example, when a
MediaStream
object is created from another using the
MediaStream()
constructor, the
fork has the same label as the original. Similarly, a locally
generated stream could be sent from one user to a remote peer using
PeerConnection
, and then sent back to the original user
in the same manner, in which case the original user will have
multiple streams with the same label (the locally-generated one and
the one received from the remote peer).
User agents may label audio and video sources (e.g. "Internal
microphone" or "External USB Webcam"). The MediaStreamTrack.label
attribute must return the label of the object's corresponding track,
if any. If the corresponding track has or had no label, the
attribute must instead return the empty string.
Thus the kind
and label
attributes do not change
value, even if the MediaStreamTrack
object is disassociated
from its corresponding track.
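The character ranges and length required above for LocalMediaStream labels can be checked with a single regular expression. The sketch below is illustrative: isValidLocalLabel is a hypothetical helper, not part of the API. It also shows that a UUID in its usual lowercase textual form happens to meet the constraints, which is one plausible way for a user agent to generate such labels.

```javascript
// U+0021, U+0023-U+0027, U+002A-U+002B, U+002D-U+002E, U+0030-U+0039,
// U+0041-U+005A, U+005E-U+007E, exactly 36 characters.
const LOCAL_LABEL_PATTERN =
  /^[\x21\x23-\x27\x2A-\x2B\x2D-\x2E\x30-\x39\x41-\x5A\x5E-\x7E]{36}$/;

// Hypothetical helper: true if `label` fits the constraints on generated labels.
function isValidLocalLabel(label) {
  return LOCAL_LABEL_PATTERN.test(label);
}

// Sample value for illustration only (hex digits and hyphens, 36 characters).
const uuidLabel = "123e4567-e89b-12d3-a456-426614174000";
console.log(isValidLocalLabel(uuidLabel));
console.log(isValidLocalLabel(uuidLabel.replace(/-/g, " "))); // spaces are outside the ranges
```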
interface MediaStreamRecorder {
void getRecordedData (in BlobCallback? callback);
};
getRecordedData
Creates a Blob
of the recorded data, and invokes
the provided callback with that Blob
.
When the getRecordedData()
method is called, the user agent must run the following steps:
Let callback be the callback indicated by the method's first argument.
If callback is null, abort these steps.
Let data be the data that was streamed
by the MediaStream
object from which the
MediaStreamRecorder
was created since the creation of the
MediaStreamRecorder
object.
Return, and run the remaining steps asynchronously.
Generate a file containing data in
a format supported by the user agent for use in audio
and video
elements.
Let blob be a Blob
object
representing the contents of the file generated in the previous
step. [FILE-API]
Queue a task to invoke callback with blob as its argument.
The getRecordedData()
method can be called multiple times on one
MediaStreamRecorder
object; each time, it will create a new
file as if this was the first time the method was being called. In
particular, the method does not stop or reset the recording when the
method is called.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
callback | BlobCallback | ✔ | ✘ | |
Return type: void
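The behaviour of repeated getRecordedData() calls can be modelled with a toy recorder. Everything here is a stand-in chosen for illustration: ToyRecorder, append() and runPendingTasks() are hypothetical, an array plays the role of the task queue, and a joined string plays the role of the Blob.

```javascript
const pendingTasks = []; // hypothetical stand-in for the user agent's task queue

class ToyRecorder {
  constructor() {
    this.chunks = []; // everything streamed since the recorder was created
  }

  // Hypothetical hook through which the stream feeds data to the recorder.
  append(chunk) {
    this.chunks.push(chunk);
  }

  getRecordedData(callback) {
    if (callback === null || callback === undefined) return; // null callback: abort
    const data = this.chunks.slice(); // snapshot; the recording itself is untouched
    // A string stands in for the Blob that a user agent would generate.
    pendingTasks.push(() => callback(data.join("")));
  }
}

function runPendingTasks() {
  while (pendingTasks.length > 0) pendingTasks.shift()();
}

const recorder = new ToyRecorder();
const results = [];
recorder.append("a");
recorder.append("b");
recorder.getRecordedData((blob) => results.push(blob));
recorder.append("c");
recorder.getRecordedData((blob) => results.push(blob));
recorder.getRecordedData(null); // aborts at the null check, queues nothing
runPendingTasks();
console.log(results);
```

Each call delivers all data recorded since the recorder was created, so the second result contains the first plus whatever arrived in between.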
[Callback=FunctionOnly, NoInterfaceObject]
interface BlobCallback {
void handleEvent (in Blob blob);
};
Note that the following is actually only a partial interface, but ReSpec does not yet support that.
interface URL {
static DOMString createObjectURL (in MediaStream stream);
};
createObjectURL
Mints a Blob URL to refer to the given MediaStream
.
When the createObjectURL()
method is called with a MediaStream
argument, the user agent
must return a unique Blob URL for the given
MediaStream
. [FILE-API]
For audio and video streams, the data exposed on that stream must
be in a format supported by the user agent for use in
audio
and video
elements.
A Blob URL is the same as what the
File API specification calls a Blob URI, except that
anything in the definition of that feature that refers to
File
and Blob
objects is hereby extended
to also apply to MediaStream
and
LocalMediaStream
objects.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
stream | MediaStream | ✘ | ✘ | |
Return type: static DOMString
This sample code exposes a button. When clicked, the button is disabled and the user is prompted to offer a stream. The user can cause the button to be re-enabled by providing a stream (e.g. giving the page access to the local camera) and then disabling the stream (e.g. revoking that access).
<input type="button" value="Start" onclick="start()" id="startBtn">
<script>
  var startBtn = document.getElementById('startBtn');
  function start() {
    navigator.getUserMedia('audio,video', gotStream);
    startBtn.disabled = true;
  }
  function gotStream(stream) {
    stream.onended = function () {
      startBtn.disabled = false;
    }
  }
</script>
This example allows people to record a short audio message and upload it to the server. This example even shows rudimentary error handling.
<input type="button" value="⚫" onclick="msgRecord()" id="recBtn">
<input type="button" value="◼" onclick="msgStop()" id="stopBtn" disabled>
<p id="status">To start recording, press the ⚫ button.</p>
<script>
  var recBtn = document.getElementById('recBtn');
  var stopBtn = document.getElementById('stopBtn');
  function report(s) {
    document.getElementById('status').textContent = s;
  }
  function msgRecord() {
    report('Attempting to access microphone...');
    navigator.getUserMedia('audio', gotStream, noStream);
    recBtn.disabled = true;
  }
  var msgStream, msgStreamRecorder;
  function gotStream(stream) {
    report('Recording... To stop, press the ◼ button.');
    msgStream = stream;
    msgStreamRecorder = stream.record();
    stopBtn.disabled = false;
    stream.onended = function () {
      msgStop();
    }
  }
  function msgStop() {
    report('Creating file...');
    stopBtn.disabled = true;
    msgStream.onended = null;
    msgStream.stop();
    msgStreamRecorder.getRecordedData(msgSave);
  }
  function msgSave(blob) {
    report('Uploading file...');
    var x = new XMLHttpRequest();
    x.open('POST', 'uploadMessage');
    x.send(blob);
    x.onload = function () {
      report('Done! To record a new message, press the ⚫ button.');
      recBtn.disabled = false;
    };
    x.onerror = function () {
      report('Failed to upload message. To try recording a message again, press the ⚫ button.');
      recBtn.disabled = false;
    };
  }
  function noStream() {
    report('Could not obtain access to your microphone. To try again, press the ⚫ button.');
    recBtn.disabled = false;
  }
</script>
This example allows people to take photos of themselves from the local video camera.
<article>
  <style scoped>
    video { transform: scaleX(-1); }
    p { text-align: center; }
  </style>
  <h1>Snapshot Kiosk</h1>
  <section id="splash">
    <p id="errorMessage">Loading...</p>
  </section>
  <section id="app" hidden>
    <p><video id="monitor" autoplay></video> <canvas id="photo"></canvas>
    <p><input type=button value="📷" onclick="snapshot()">
  </section>
  <script>
    navigator.getUserMedia('video user', gotStream, noStream);
    var video = document.getElementById('monitor');
    var canvas = document.getElementById('photo');
    function gotStream(stream) {
      // Mint a Blob URL for the stream using the createObjectURL() method defined above.
      video.src = URL.createObjectURL(stream);
      video.onerror = function () {
        stream.stop();
      };
      stream.onended = noStream;
      video.onloadedmetadata = function () {
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        document.getElementById('splash').hidden = true;
        document.getElementById('app').hidden = false;
      };
    }
    function noStream() {
      document.getElementById('errorMessage').textContent = 'No camera available.';
    }
    function snapshot() {
      canvas.getContext('2d').drawImage(video, 0, 0);
    }
  </script>
</article>
A PeerConnection
allows two users to communicate
directly, browser-to-browser. Communications are coordinated via a
signaling channel provided by script in the page via the server,
e.g. using XMLHttpRequest
.
Calling "new PeerConnection(configuration, signalingCallback)" creates a PeerConnection
object.
The configuration string gives the address of a STUN or TURN server to use to establish the connection. [STUN] [TURN]
The allowed formats for this string are:
"TYPE 203.0.113.2:3478"
Indicates a specific IP address and port for the server.
"TYPE relay.example.net:3478"
Indicates a specific host and port for the server; the user agent will look up the IP address in DNS.
"TYPE example.net"
Indicates a specific domain for the server; the user agent will look up the IP address and port in DNS.
The "TYPE" is one of:
STUN
STUNS
TURN
TURNS
The signalingCallback argument is a method
that will be invoked when the user agent needs to send a message
to the other host over the signaling channel. When the callback is
invoked, convey its first argument (a string) to the other peer
using whatever method is being used by the Web application to
relay signaling messages. (Messages returned from the other peer
are provided back to the user agent using the processSignalingMessage()
method.)
A PeerConnection
object has an associated
PeerConnection
signaling callback, a
PeerConnection
ICE Agent, a
PeerConnection
data UDP media stream, a
PeerConnection
readiness state and an
ICE started flag. These are initialized when the object
is created.
When the PeerConnection()
constructor is invoked, the user agent must run the following steps.
This algorithm has a synchronous section (which is
triggered as part of the event loop algorithm). Steps
in the synchronous section are marked with
⌛.
Let serverConfiguration be the constructor's first argument.
Let signalingCallback be the constructor's second argument.
Let connection be a newly created
PeerConnection
object.
Create an ICE Agent and let connection's
PeerConnection
ICE Agent be that ICE
Agent. [ICE]
If serverConfiguration contains a U+000A LINE FEED (LF) character or a U+000D CARRIAGE RETURN (CR) character (or both), remove all characters from serverConfiguration after the first such character.
Split serverConfiguration on spaces to obtain configuration components.
If configuration components has two or more components, and the first component is a case-sensitive match for one of the following strings:
"STUN"
"STUNS"
"TURN"
"TURNS"
...then run the following substeps:
Let server type be STUN if the first
component of configuration components is
"STUN" or "STUNS",
and TURN otherwise (the first component of configuration components is "TURN" or "TURNS").
Let secure be true if the first
component of configuration components is
"STUNS" or "TURNS",
and false otherwise.
Let host be the contents of the second component of configuration components up to the character before the first U+003A COLON character (:), if any, or the entire string otherwise.
Let port be the contents of the second component of configuration components from the character after the first U+003A COLON character (:) up to the end, if any, or the empty string otherwise.
Configure the PeerConnection
ICE
Agent's STUN or TURN server as follows:
If the given IP address, host name, domain name, or port are invalid, then the user agent must act as if no STUN or TURN server is configured.
Let the connection's
PeerConnection
signaling callback be
signalingCallback.
Set connection's
PeerConnection
readiness state to NEW
(0).
Set connection's ICE started flag to false.
Let connection's
PeerConnection
data UDP media stream be a
new data UDP media stream.
Let connection's localStreams
attribute be an empty read-only MediaStream
array. [WEBIDL]
Let connection's remoteStreams
attribute be an empty read-only MediaStream
array. [WEBIDL]
Return connection, but continue these steps asynchronously.
Await a stable state. The synchronous section consists of the remaining steps of this algorithm. (Steps in synchronous sections are marked with ⌛.)
⌛ If connection's ICE
started flag is still false, start the
PeerConnection
ICE Agent and send the
initial offer. The initial offer must include a media description
for the PeerConnection
data UDP media
stream, marked as "sendrecv", and for all the streams in
localStreams
(marked as "sendonly"). [ICE] [SDPOFFERANSWER]
⌛ Let connection's ICE started flag be true.
⌛ If connection's
PeerConnection
readiness state is still
NEW
(0), then
queue a task that sets it to NEGOTIATING
(1) and
then fires a simple event
named connecting
at the
PeerConnection
object.
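The configuration-parsing steps of the constructor algorithm above can be sketched as a small pure function. This is an illustration under stated assumptions rather than a reference implementation: parseServerConfiguration is a hypothetical name, a null return value stands for "no STUN or TURN server is configured", the line-break character itself is dropped along with what follows it, and no validation of the host or port is attempted.

```javascript
// Hypothetical helper mirroring the configuration-string steps of the constructor.
function parseServerConfiguration(serverConfiguration) {
  // Keep only the text before the first LF or CR.
  const firstLine = serverConfiguration.split(/[\n\r]/, 1)[0];
  // Split on spaces, ignoring empty components.
  const components = firstLine.split(/[ \t\f]+/).filter((c) => c.length > 0);

  const knownTypes = ["STUN", "STUNS", "TURN", "TURNS"]; // case-sensitive match
  if (components.length < 2 || !knownTypes.includes(components[0])) {
    return null; // substeps are skipped: no server configured
  }

  const type = components[0];
  const address = components[1];
  const colon = address.indexOf(":"); // first U+003A COLON, if any
  return {
    serverType: type.startsWith("STUN") ? "STUN" : "TURN",
    secure: type.endsWith("S"), // true for "STUNS" and "TURNS" only
    host: colon === -1 ? address : address.slice(0, colon),
    port: colon === -1 ? "" : address.slice(colon + 1),
  };
}

console.log(parseServerConfiguration("STUN 203.0.113.2:3478"));
console.log(parseServerConfiguration("TURNS example.net"));
console.log(parseServerConfiguration("stun example.net")); // lowercase type is not recognised
```

An empty port corresponds to the "TYPE example.net" form, where the user agent is expected to look up the port in DNS.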
When a PeerConnection
ICE Agent is
required to send SDP offers or answers, the user agent must follow
these steps:
Let sdp be the SDP offer or answer to be sent. [SDPOFFERANSWER]
Let message be the concatenation of the
string "SDP", a U+000A LINE FEED (LF)
character, and sdp, in that order.
Queue a task to invoke that
PeerConnection
ICE Agent's
PeerConnection
signaling callback with
message as its first argument and the
PeerConnection
as its second argument.
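The message framing described in these steps, together with the matching prefix check that processSignalingMessage() performs on the receiving side, can be sketched as two small helpers. The function names are hypothetical and the SDP text is a shortened placeholder, not a complete session description.

```javascript
const SDP_PREFIX = "SDP\n"; // the string "SDP" followed by U+000A LINE FEED

// Sender side: build the message handed to the signaling callback.
function frameSdp(sdp) {
  return SDP_PREFIX + sdp;
}

// Receiver side: return the SDP payload, or null if the prefix is wrong.
function unframeSignalingMessage(message) {
  if (!message.startsWith(SDP_PREFIX)) return null; // signaling channel error
  return message.slice(SDP_PREFIX.length); // all but the first four characters
}

const offer = "v=0\r\ns=-\r\n"; // placeholder payload for illustration
const framed = frameSdp(offer);
console.log(JSON.stringify(framed));
console.log(unframeSignalingMessage(framed) === offer);
console.log(unframeSignalingMessage("XYZ\nv=0"));
```

The Web application is only expected to carry the framed string to the other peer unchanged and in order; it does not need to parse it.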
All streams represented by MediaStream
objects must be
marked as "sendonly" by the peer that initially adds the stream to
the session. The PeerConnection
API does not support
bidirectional ("sendrecv") audio or video media streams. [SDPOFFERANSWER]
User agents may negotiate any codec and any resolution, bitrate,
or other quality metric. User agents are encouraged to initially
negotiate for the native resolution of the stream. For streams that
are then rendered (using a video
element), user agents
are encouraged to renegotiate for a resolution that matches the
rendered display size.
Starting with the native resolution means that if
the Web application notifies its peer of the native resolution as it
starts sending data, and the peer prepares its video
element accordingly, there will be no need for a renegotiation once
the stream is flowing.
All SDP media descriptions for streams represented by
MediaStream
objects must include a label attribute ("a=label:") whose value is the value of the
MediaStream
object's label
attribute. [SDP] [SDPLABEL]
PeerConnection
ICE Agents must not
generate any candidates for media streams whose media descriptions
do not have a label attribute ("a=label:"). [ICE] [SDP] [SDPLABEL]
When a user agent starts receiving media for a component and a
candidate was provided for that component by a
PeerConnection
ICE Agent, the user agent
must follow these steps:
Let connection be the
PeerConnection
whose ICE Agent is expecting this
media.
If there is already a MediaStream
object for the
media stream to which this component belongs, then associate the
component with that media stream and abort these steps. (Some media
streams have multiple components; this API does not expose the
role of these individual components in ICE.)
Create a MediaStream
object to represent the
media stream. Set its label
attribute to the value
of the SDP Label attribute for that component's media
stream.
Queue a task to run the following substeps:
If the connection's
PeerConnection
readiness state is CLOSED
(3), abort these
steps.
Add the newly created MediaStream
object to the
end of connection's remoteStreams
array.
Fire a stream event named addstream
with the newly
created MediaStream
object at the connection object.
When a PeerConnection
ICE Agent finds
that a stream from the remote peer has been removed (its port has
been set to zero in a media description sent on the signaling
channel), the user agent must follow these steps:
Let connection be the
PeerConnection
whose PeerConnection
ICE Agent has determined that a stream is being removed.
Let stream be the MediaStream
object that represents the media stream being removed, if any. If
there isn't one, then abort these steps.
By definition, stream is now finished.
A task is thus queued to update stream and fire an event.
Queue a task to run the following substeps:
If the connection's
PeerConnection
readiness state is CLOSED
(3), abort these
steps.
Remove stream from connection's remoteStreams
array.
Fire a stream event named removestream
with stream at the connection
object.
The task source for the tasks listed in this section is the networking task source.
To prevent network sniffing from allowing a fourth party to establish a connection to a peer using the information sent out-of-band to the other peer and thus spoofing the client, the configuration information should always be transmitted using an encrypted connection.
[Constructor (in DOMString configuration, in SignalingCallback signalingCallback)]
interface PeerConnection {
void processSignalingMessage (in DOMString message);
const unsigned short NEW = 0;
const unsigned short NEGOTIATING = 1;
const unsigned short ACTIVE = 2;
const unsigned short CLOSED = 3;
readonly attribute unsigned short readyState;
void send (in DOMString text);
void addStream (in MediaStream stream);
void removeStream (in MediaStream stream);
readonly attribute MediaStream[] localStreams;
readonly attribute MediaStream[] remoteStreams;
void close ();
attribute Function? onconnecting;
attribute Function? onopen;
attribute Function? onmessage;
attribute Function? onaddstream;
attribute Function? onremovestream;
};
localStreams
of type array of MediaStream, readonly
Returns a live array containing the streams that the user agent
is currently attempting to transmit to the remote peer (those that
were added with addStream()
).
Specifically, it must return the read-only MediaStream
array that the attribute was set to when the
PeerConnection
's constructor ran.
onaddstream
of type Function, nullable
This event handler, of type addstream, must be supported by
all objects implementing the PeerConnection
interface.
onconnecting
of type Function, nullable
This event handler, of type connecting, must be supported by
all objects implementing the PeerConnection
interface.
onmessage
of type Function, nullable
This event handler, of type message, must be supported by
all objects implementing the PeerConnection
interface.
onopen
of type Function, nullable
This event handler, of type open, must be supported by
all objects implementing the PeerConnection
interface.
onremovestream
of type Function, nullable
This event handler, of type removestream, must be supported by
all objects implementing the PeerConnection
interface.
readyState
of type unsigned short, readonly
The readyState
attribute must return the PeerConnection
object's
PeerConnection
readiness state, represented by a number from the following list:
PeerConnection.NEW (0)
PeerConnection.NEGOTIATING (1)
PeerConnection.ACTIVE (2)
PeerConnection.CLOSED (3)
remoteStreams
of type array of MediaStream, readonly
Returns a live array containing the streams that the user agent is currently receiving from the remote peer.
Specifically, it must return the read-only MediaStream
array that the attribute was set to when the
PeerConnection
's constructor ran.
This array is updated when addstream
and removestream
events are fired.
addStream
Attempts to start sending the given stream to the remote peer.
When the other peer starts sending a stream in this manner, an
addstream
event is fired at the PeerConnection
object.
When the addStream()
method is invoked, the user agent
must run the following steps:
Let stream be the method's argument.
If the PeerConnection
object's
PeerConnection
readiness state is CLOSED
(3), throw an
INVALID_STATE_ERR
exception.
If stream is already in the
PeerConnection
object's localStreams
object,
then abort these steps.
Add stream to the end of the
PeerConnection
object's localStreams
object.
Return from the method.
If the PeerConnection
's ICE
started flag is false, then abort these steps.
Have the PeerConnection
's
PeerConnection
ICE Agent add a media
stream for stream the next time the user agent
provides a stable
state. Any other pending stream additions and removals must
be processed at the same time. [ICE]
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
stream | MediaStream | ✘ | ✘ | |
void
close
When the close()
method is invoked, the user agent must
run the following steps:
If the PeerConnection
object's
PeerConnection
readiness state is CLOSED
(3), throw an
INVALID_STATE_ERR
exception.
Destroy the PeerConnection
ICE
Agent, abruptly ending any active ICE processing and any
active streaming, and releasing any relevant resources (e.g. TURN
permissions).
Set the object's PeerConnection
readiness
state to CLOSED
(3).
The localStreams
and
remoteStreams
objects remain in the state they were in when the object was
closed.
void
processSignalingMessage
When a message relayed from the remote peer over the
signaling channel is received by the Web application, the application passes it to
the user agent by calling the processSignalingMessage()
method.
The order of messages is important. Passing messages to the user agent in a different order than they were generated by the remote peer's user agent can prevent a successful connection from being established or degrade the connection's quality if one is established.
When the processSignalingMessage()
method is invoked, the
user agent must run the following steps:
Let message be the method's argument.
Let connection be the
PeerConnection
object on which the method was
invoked.
If connection's
PeerConnection
readiness state is CLOSED
(3), throw an
INVALID_STATE_ERR
exception.
If the first four characters of message are
not "SDP
" followed by a U+000A LINE FEED
(LF) character, then abort these steps. (This indicates an error
in the signaling channel implementation. User agents may report
such errors to their developer consoles to aid debugging.)
Future extensions to the
PeerConnection
interface might use other prefix
values to implement additional features.
Let sdp be the string consisting of all but the first four characters of message.
If connection's ICE started
flag is true, then pass sdp to the
PeerConnection
ICE Agent as a subsequent
offer or answer, to be interpreted as appropriate given the current
state of the ICE Agent, and abort these steps. [ICE]
The ICE started flag is false. Start the
PeerConnection
ICE Agent and pass it
sdp as the initial offer from the other peer;
the ICE Agent will then (asynchronously) construct the initial
answer and transmit it as described above.
If there is a remotely-initiated data UDP media
stream in the initial offer, and it has an encryption key
advertised in its media description that is 16 bytes long, then
that is the PeerConnection
data UDP media
stream.
After the initial answer has been sent, the ICE Agent must add
all the streams in localStreams
to the
session, as described above. [ICE]
Let connection's ICE started flag be true.
Queue a task that sets connection's PeerConnection
readiness state to NEGOTIATING
(1) and
then fires a simple event
named connecting
at the
PeerConnection
object.
When a PeerConnection
ICE Agent
completes ICE processing (even if there are no active streams), the
user agent must queue a task that sets the
PeerConnection
object's
PeerConnection
readiness state to ACTIVE
(2) and then fires a simple event named open
at the
PeerConnection
object.
When a PeerConnection
ICE Agent
restarts ICE processing for any reason (e.g. because a peer is
adding or removing a stream), the user agent must queue a
task that sets the PeerConnection
object's
PeerConnection
readiness state to NEGOTIATING
(1) and
then fires a simple event
named connecting
at the
PeerConnection
object.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
message | DOMString | ✘ | ✘ |
void
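The steps above can be sketched against a plain object that stands in for the user agent's internal state. This is a minimal illustration, not the user agent's implementation: `ReadyState`, `createConnectionModel`, the `iceStarted` property and the `log` array are hypothetical names introduced here, and the queued task that changes the readiness state is applied synchronously for brevity.

```javascript
// Readiness states, numbered as in the readyState list.
const ReadyState = { NEW: 0, NEGOTIATING: 1, ACTIVE: 2, CLOSED: 3 };

// Hypothetical model of the per-connection state (not part of the API).
function createConnectionModel() {
  return { readyState: ReadyState.NEW, iceStarted: false, log: [] };
}

// Mirrors the processSignalingMessage() steps on the model object.
function processSignalingMessage(connection, message) {
  if (connection.readyState === ReadyState.CLOSED) {
    throw new Error("INVALID_STATE_ERR");
  }
  // The message must start with "SDP" followed by U+000A LINE FEED.
  if (message.slice(0, 4) !== "SDP\n") {
    return; // signaling channel error: the message is ignored
  }
  const sdp = message.slice(4);
  if (connection.iceStarted) {
    // Later messages are subsequent offers or answers for the ICE Agent.
    connection.log.push({ kind: "subsequent", sdp });
    return;
  }
  // First message: start the ICE Agent with the peer's initial offer.
  connection.log.push({ kind: "initial-offer", sdp });
  connection.iceStarted = true;
  // The specification queues a task here; the model applies it directly.
  connection.readyState = ReadyState.NEGOTIATING;
}
```

In this model a message with any other prefix leaves the connection untouched, which matches the "abort these steps" wording for unrecognized prefixes.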
removeStream
Stops sending the given stream to the remote peer.
When the other peer stops sending a stream in this manner, a
removestream
event is fired at the PeerConnection
object.
When the removeStream()
method is invoked, the user agent
must run the following steps:
Let stream be the method's argument.
If the PeerConnection
object's
PeerConnection
readiness state is CLOSED
(3), throw an
INVALID_STATE_ERR
exception.
If stream is not in the
PeerConnection
object's localStreams
object,
then abort these steps.
Remove stream from the
PeerConnection
object's localStreams
object.
Return from the method.
If the PeerConnection
's ICE
started flag is false, then abort these steps.
Have the PeerConnection
's
PeerConnection
ICE Agent remove the media
stream for stream the next time the user agent
provides a stable
state. Any other pending stream additions and removals must
be processed at the same time. [ICE]
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
stream | MediaStream | ✘ | ✘ | |
void
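The bookkeeping that addStream() and removeStream() perform on localStreams can be sketched as follows. This is an illustrative model only: `createModel`, `pendingChanges` and `flushPendingChanges` are hypothetical names, and streams are represented by arbitrary values rather than MediaStream objects.

```javascript
const CLOSED_STATE = 3; // PeerConnection.CLOSED

// Hypothetical stand-in for the state a user agent keeps per PeerConnection.
function createModel() {
  return { readyState: 0, iceStarted: false, localStreams: [], pendingChanges: [] };
}

function addStream(pc, stream) {
  if (pc.readyState === CLOSED_STATE) throw new Error("INVALID_STATE_ERR");
  if (pc.localStreams.includes(stream)) return; // already in localStreams
  pc.localStreams.push(stream);
  // The remaining steps run after the method returns; only record the work.
  if (pc.iceStarted) pc.pendingChanges.push({ op: "add", stream });
}

function removeStream(pc, stream) {
  if (pc.readyState === CLOSED_STATE) throw new Error("INVALID_STATE_ERR");
  const index = pc.localStreams.indexOf(stream);
  if (index === -1) return; // not in localStreams
  pc.localStreams.splice(index, 1);
  if (pc.iceStarted) pc.pendingChanges.push({ op: "remove", stream });
}

// Called when the user agent provides a stable state: every pending addition
// and removal is handed to the ICE Agent in one batch.
function flushPendingChanges(pc) {
  const batch = pc.pendingChanges;
  pc.pendingChanges = [];
  return batch;
}
```

Adding a stream twice, or removing one that was never added, is a silent no-op in this model, as in the steps above.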
send
Attempts to send the given text to the remote peer. This uses UDP, which is inherently unreliable; there is no guarantee that every message will be received.
When a message sent in this manner from the other peer is
received, a message
event is fired at the PeerConnection
object.
The maximum length of text is 504 bytes
after encoding the string as UTF-8; attempting to send a payload
greater than 504 bytes results in an
INVALID_ACCESS_ERR
exception.
When the send()
method is invoked, the
user agent must run the following steps:
Let message be the method's first argument.
If the PeerConnection
object's
PeerConnection
readiness state is CLOSED
(3), throw an
INVALID_STATE_ERR
exception.
Let data be message encoded as UTF-8. [UTF-8]
If data is longer than 504 bytes,
throw an INVALID_ACCESS_ERR
exception and abort these
steps.
If the PeerConnection
's
PeerConnection
data UDP media stream is
not an active data UDP media stream, abort these
steps. No message is sent.
If the user agent is rate-limiting packets sent using this API, and sending the data packet at this time would exceed the limit, then abort these steps. User agents may report this to the user, e.g. in a development console.
Transmit a data packet to a peer using the
PeerConnection
's PeerConnection
data UDP media stream with data as the
message.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
text | DOMString | ✘ | ✘ |
void
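Because the 504-byte limit applies to the UTF-8 encoding and not to the number of characters, an application may want to check a message before calling send(). The helper below is a sketch of that check; `encodeForSend` and `MAX_PAYLOAD_BYTES` are names chosen for this example and are not part of the API.

```javascript
const MAX_PAYLOAD_BYTES = 504;

// Returns the UTF-8 bytes of text, or throws if they exceed the limit.
function encodeForSend(text) {
  const data = new TextEncoder().encode(text); // TextEncoder always emits UTF-8
  if (data.length > MAX_PAYLOAD_BYTES) {
    throw new Error("INVALID_ACCESS_ERR");
  }
  return data;
}
```

For instance, 504 ASCII characters fit exactly, while 253 copies of "é" (two bytes each in UTF-8) do not.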
ACTIVE
of type unsigned short
CLOSED
of type unsigned short
The close() method has been invoked.
NEGOTIATING
of type unsigned short
NEW
of type unsigned short
PeerConnection implements EventTarget;
All instances of the PeerConnection type are defined to also implement the EventTarget interface.
[Callback=FunctionOnly, NoInterfaceObject]
interface SignalingCallback {
void handleEvent (in DOMString message, in PeerConnection
source);
};
handleEvent
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
message | DOMString | ✘ | ✘ | |
source | PeerConnection | ✘ | ✘ | |
void
When two peers decide they are going to set up a connection to each other, they both go through these steps. The STUN/TURN server configuration describes a server they can use to get things like their public IP address or to set up NAT traversal. They also have to send data for the signaling channel to each other using the same out-of-band mechanism they used to establish that they were going to communicate in the first place.
// the first argument describes the STUN/TURN server configuration
var local = new PeerConnection('TURNS example.net', sendSignalingChannel);
local.processSignalingMessage(...); // if we have a message from the other side, pass it along here

// (aLocalStream is some LocalMediaStream object)
local.addStream(aLocalStream); // start sending video

function sendSignalingChannel(message) {
  ... // send message to the other side via the signaling channel
}

function receiveSignalingChannel(message) {
  // call this whenever we get a message on the signaling channel
  local.processSignalingMessage(message);
}

local.onaddstream = function (event) {
  // (videoElement is some <video> element)
  videoElement.src = URL.getObjectURL(event.stream);
};
All PeerConnection
connections include a data
UDP media stream, which is used to send data packets
peer-to-peer, for instance game control packets. This data channel
is unreliable (packets are not guaranteed to be delivered), and
packets received out of order are discarded.
SDP media descriptions for data UDP media streams must use the "application" media type, the "udp" transport protocol, and the "application/html-peer-connection-data" media format description. [SDP]
All SDP media descriptions for data UDP media streams must include a label attribute ("a=label:") whose value is the string "data". [SDP] [SDPLABEL]
All SDP media descriptions for data UDP media streams must also include a key field ("k="), with the value being a base64-encoded representation of 16 cryptographically random bytes determined on a per-ICE-Agent basis. [SDP]
PeerConnection
ICE Agents must attempt to
establish a connection for their PeerConnection
data UDP media stream with the initial offer/answer exchange,
and must maintain that UDP media stream for the ICE Agents' whole
lifetime.
Each PeerConnection data UDP media stream has a sending sequence number, which must
initially be set to one (1), and a most recently received
sequence number, which must initially be zero (0).
A data UDP media stream is an active data UDP
media stream if the PeerConnection
ICE
Agent has selected a destination for it. A data UDP
media stream can change active status many times during the
lifetime of its PeerConnection
object (e.g. any time
the network topology changes and the ICE Agent performs an ICE
Restart). [ICE]
Bytes transmitted on a data UDP media stream are masked so as to prevent cross-protocol attacks (data UDP media streams always appear to contain random noise to other protocols). For the purposes of masking, the data UDP media stream masking salt is defined to be the following 16 bytes, described here as hexadecimal numbers: DB 68 B5 FD 17 0E 15 77 56 AF 7A 3A 1A 57 75 02
Bytes transmitted on a data UDP media stream are also hashed so as to prevent forgery attacks (an attacker cannot change the data without knowing the key negotiated via the signaling channel). For the purposes of this hashing, the data UDP media stream hashing salt is defined to be the following 16 bytes, described here as hexadecimal numbers: 4E 2F 96 AB 0A 39 92 A2 56 94 91 F5 7E 58 2E FA
When the user agent is to transmit a data packet to a peer using a data UDP media stream and with a byte string payload raw message, the user agent must run the following steps:
Let nonce be 16 cryptographically random bytes.
Let ice-key be the 16 bytes given as the encryption key for the data UDP media stream in its media description, as defined above.
Let sending sequence number be the current sending sequence number.
Increment the sending sequence number by one (1).
Let mask-key be the first 16 bytes of the HMAC-SHA1 of the 16 data UDP media stream masking salt bytes keyed with the 16 ice-key bytes. [HMAC] [SHA1]
Let typed raw message be the concatenation of the sequence number as a big-endian 64 bit integer, three 0x00 bytes, a 0x01 byte, and raw message.
Let masked message be the result of encrypting typed raw message using AES-128-CTR keyed with mask-key and using the 16 nonce bytes as the initial counter value. [AES]
Let masked message with nonce be the concatenation of nonce and masked message.
Let hash-key be the first 16 bytes of the HMAC-SHA1 of the 16 data UDP media stream hashing salt bytes keyed with the 16 ice-key bytes. [HMAC] [SHA1]
Let hash be the first 16 bytes of the HMAC-SHA1 of masked message with nonce keyed with the 16 hash-key bytes. [HMAC] [SHA1]
Let hashed masked message with nonce be the concatenation of hash and masked message with nonce.
Send hashed masked message with nonce in
a UDP packet to the destination that the relevant
PeerConnection
ICE Agent has selected
for the data UDP media stream.
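The packet construction steps above can be sketched with the Node.js `crypto` module. This is an illustration under stated assumptions, not a reference implementation: `buildPacket` and `hmac16` are names invented here, the caller supplies the current sending sequence number and increments it afterwards, and the optional `nonce` parameter exists only so the output can be reproduced in tests.

```javascript
const crypto = require("node:crypto");

// The two salts defined above.
const MASKING_SALT = Buffer.from("DB68B5FD170E157756AF7A3A1A577502", "hex");
const HASHING_SALT = Buffer.from("4E2F96AB0A3992A2569491F57E582EFA", "hex");

// First 16 bytes of the HMAC-SHA1 of data, keyed with key.
function hmac16(key, data) {
  return crypto.createHmac("sha1", key).update(data).digest().subarray(0, 16);
}

// iceKey: 16-byte Buffer from the media description's key field.
// sequenceNumber: the current sending sequence number.
// rawMessage: Buffer holding the UTF-8 payload.
function buildPacket(iceKey, sequenceNumber, rawMessage, nonce = crypto.randomBytes(16)) {
  const maskKey = hmac16(iceKey, MASKING_SALT);

  // 8-byte big-endian sequence number, then 0x00 0x00 0x00 0x01.
  const header = Buffer.alloc(12);
  header.writeBigUInt64BE(BigInt(sequenceNumber), 0);
  header[11] = 0x01;
  const typedRawMessage = Buffer.concat([header, rawMessage]);

  // AES-128-CTR with the nonce as the initial counter value.
  const cipher = crypto.createCipheriv("aes-128-ctr", maskKey, nonce);
  const maskedMessage = Buffer.concat([cipher.update(typedRawMessage), cipher.final()]);
  const maskedMessageWithNonce = Buffer.concat([nonce, maskedMessage]);

  const hashKey = hmac16(iceKey, HASHING_SALT);
  const hash = hmac16(hashKey, maskedMessageWithNonce);
  return Buffer.concat([hash, maskedMessageWithNonce]);
}
```

The returned buffer is what would be placed in the UDP packet; it is 44 bytes longer than `rawMessage`.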
When a packet that is part of a data UDP media stream is received, the user agent must run the following steps:
Let hashed masked message with nonce be the UDP packet's data.
If hashed masked message with nonce is shorter than 32 bytes, then abort these steps.
Let ice-key be the 16 bytes given as the encryption key for the data UDP media stream in the media description for this media stream. [SDP]
Let hash-key be the first 16 bytes of the HMAC-SHA1 of the 16 data UDP media stream hashing salt bytes keyed with the 16 ice-key bytes. [HMAC] [SHA1]
Let hash be the first 16 bytes of the hashed masked message with nonce.
Let masked message with nonce be all but the first 16 bytes of hashed masked message with nonce.
If hash does not equal the first 16 bytes of the HMAC-SHA1 of masked message with nonce keyed with the 16 hash-key bytes, abort these steps. [HMAC] [SHA1]
Let nonce be the first 16 bytes of the masked message with nonce.
Let masked message be all but the first 16 bytes of masked message with nonce.
Let mask-key be the first 16 bytes of the HMAC-SHA1 of the 16 data UDP media stream masking salt bytes keyed with the 16 ice-key bytes. [HMAC] [SHA1]
Let typed raw message be the result of decrypting masked message using AES-128-CTR keyed with mask-key and using the 16 nonce bytes as the initial counter value. [AES]
Let sequence number be the result of interpreting the first eight bytes of typed raw message as a 64 bit big-endian integer.
If sequence number is less than the most recently received sequence number then abort these steps.
Let the most recently received sequence number be sequence number.
If the ninth, tenth, eleventh, and twelfth bytes of typed raw message are not 0x00, 0x00, 0x00, and 0x01 respectively, then abort these steps.
Let raw message be the byte string consisting of all but the first twelve bytes of typed raw message.
Let message be raw message decoded as UTF-8, with error handling.
Create an event that uses the MessageEvent
interface, with the name message
, which does not bubble, is not
cancelable, has no default action, and has a data
attribute whose value is
message, and queue a task to
dispatch the event at the PeerConnection
object
responsible for this side of the data UDP media
stream.
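The receiving steps can be sketched in the same way with the Node.js `crypto` module. The names `parsePacket`, `truncatedHmac` and `SALTS` are invented for this example. Two choices here are assumptions rather than requirements of the text: the hash comparison uses a constant-time comparison, and a decrypted message shorter than the 12-byte header is dropped. The function returns `null` for a dropped packet; otherwise it returns the sequence number to store as the most recently received sequence number, together with the decoded message, or `null` in place of the message when the frame type is not recognized.

```javascript
const { createHmac, createDecipheriv, timingSafeEqual } = require("node:crypto");

const SALTS = {
  masking: Buffer.from("DB68B5FD170E157756AF7A3A1A577502", "hex"),
  hashing: Buffer.from("4E2F96AB0A3992A2569491F57E582EFA", "hex"),
};

// First 16 bytes of the HMAC-SHA1 of data, keyed with key.
function truncatedHmac(key, data) {
  return createHmac("sha1", key).update(data).digest().subarray(0, 16);
}

// iceKey: 16-byte Buffer; packet: Buffer with the UDP payload;
// mostRecentSeq: BigInt, the most recently received sequence number.
function parsePacket(iceKey, packet, mostRecentSeq) {
  if (packet.length < 32) return null;

  const hashKey = truncatedHmac(iceKey, SALTS.hashing);
  const hash = packet.subarray(0, 16);
  const maskedWithNonce = packet.subarray(16);
  if (!timingSafeEqual(hash, truncatedHmac(hashKey, maskedWithNonce))) return null;

  const nonce = maskedWithNonce.subarray(0, 16);
  const masked = maskedWithNonce.subarray(16);
  const maskKey = truncatedHmac(iceKey, SALTS.masking);
  const decipher = createDecipheriv("aes-128-ctr", maskKey, nonce);
  const typed = Buffer.concat([decipher.update(masked), decipher.final()]);
  if (typed.length < 12) return null; // assumption: too short to hold the header

  const sequenceNumber = typed.readBigUInt64BE(0);
  if (sequenceNumber < mostRecentSeq) return null;

  // The sequence number is recorded before the frame type is checked.
  if (typed.readUInt32BE(8) !== 1) return { sequenceNumber, message: null };

  // Non-fatal decoding replaces malformed sequences with U+FFFD.
  const message = new TextDecoder("utf-8").decode(typed.subarray(12));
  return { sequenceNumber, message };
}
```

A caller would dispatch a message event only when the returned `message` is not `null`.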
Though described above as being computed for each
packet, the ice-key, hash-key, and mask-key values can
be precomputed as soon as the PeerConnection
ICE
Agent is started.
The format of a packet sent over a data UDP media stream, as generated and parsed by the algorithms above, is as follows. The total overhead per packet is thus 44 bytes, of which four are intended for future extensions.
                /'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''.
+--------------+ +---------------+ +-ENCRYPTED------------------------------------------------------------+ :
| 16 byte hash | | 16 byte nonce | | [ 8 bytes of sequence number ] [ 4 bytes of frame type ] [ data... ] | :
+--------------+ +---------------+ +----------------------------------------------------------------------+ :
                \...........................................................................................'
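The overhead figure follows directly from the field sizes. The sketch below derives the byte offset of each field and the largest packet that send() can produce, given its 504-byte payload limit; `PACKET_FIELDS` and `fieldOffsets` are names chosen for this example. The offsets of the sequence number and frame type refer to positions in the packet, where those fields are still encrypted.

```javascript
// Field sizes in bytes, in wire order.
const PACKET_FIELDS = [
  ["hash", 16],
  ["nonce", 16],
  ["sequenceNumber", 8],
  ["frameType", 4],
];

// Byte offset of each field, plus the offset at which the data begins.
function fieldOffsets() {
  const offsets = {};
  let position = 0;
  for (const [name, size] of PACKET_FIELDS) {
    offsets[name] = position;
    position += size;
  }
  offsets.data = position;
  return offsets;
}

const overheadBytes = fieldOffsets().data;     // 44
const largestPacketBytes = overheadBytes + 504; // 548
```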
A remotely-initiated data UDP media stream is the
first "sendrecv" media stream in the initial offer whose media is
"application", whose transport protocol is
"udp", whose media format description is
"application/html-peer-connection-data", and whose label
attribute ("a=label:") has the value "data".
The task source for this task is the networking task source.
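The criteria given above for a remotely-initiated data UDP media stream can be sketched as a search over the media sections of an offer. This is a simplified illustration: `mediaSections` and `findRemoteDataStream` are hypothetical helpers, only media-level direction attributes are examined, a section without a direction attribute is treated as "sendrecv" (the SDP default), and the 16-byte key check described for processSignalingMessage() is not covered.

```javascript
const DATA_FORMAT = "application/html-peer-connection-data";

// Splits an SDP description into media sections, each starting at an "m=" line.
function mediaSections(sdp) {
  const sections = [];
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      sections.push([line]);
    } else if (sections.length > 0 && line !== "") {
      sections[sections.length - 1].push(line);
    }
  }
  return sections;
}

// Returns the lines of the first matching media section, or null.
function findRemoteDataStream(sdp) {
  const match = mediaSections(sdp).find((lines) => {
    // m=<media> <port> <proto> <fmt> ...
    const [media, , proto, ...formats] = lines[0].slice(2).split(" ");
    const direction = lines.find((l) =>
      /^a=(sendrecv|sendonly|recvonly|inactive)$/.test(l));
    return media === "application"
      && proto === "udp"
      && formats.includes(DATA_FORMAT)
      && lines.includes("a=label:data")
      && (direction === undefined || direction === "a=sendrecv");
  });
  return match ?? null;
}
```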
The data UDP media stream packet format is designed to protect against several obvious attacks. The data is made to appear pseudo-random, so that it cannot be used in a cross-protocol attack, even if somehow the stream were to be directed at an unsuspecting remote host. The data is hashed in such a way that it cannot be modified in transit. The data is encrypted so that it cannot be read in transit.
These security mechanisms rely in part on a key that is negotiated over the signaling channel; as such, the security is only as strong as the security of the signaling channel. Authors are encouraged to use TLS to protect the signaling channel and the page(s) hosting the application, and are encouraged to secure the host used to relay the signaling channel.
To avoid network traffic congestion and other denial of service attacks based on traffic volume, user agents should apply rate-limiting to data UDP media streams.
A Window
object has a strong reference to
any PeerConnection
objects created from the constructor
whose global object is that Window
object.
The addstream
and
removestream
events
use the MediaStreamEvent
interface:
Firing a stream event named e with a MediaStream
stream means that an event with the name e, which does not bubble (except where otherwise
stated) and is not cancelable (except where otherwise stated), and
which uses the MediaStreamEvent
interface with the stream
attribute set to stream, must be created and dispatched at the given
target.
interface MediaStreamEvent : Event {
readonly attribute MediaStream
? stream;
void initMediaStreamEvent (in DOMString typeArg, in boolean canBubbleArg, in boolean cancelableArg, in MediaStream
? streamArg);
};
stream
of type MediaStream
, readonly, nullableThe stream
attribute represents the MediaStream
object associated with
the event.
initMediaStreamEvent
The initMediaStreamEvent()
method must initialize the event in a manner analogous to the
similarly-named method in the DOM Events interfaces. [DOM-LEVEL-3-EVENTS]
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
typeArg | DOMString | ✘ | ✘ | |
canBubbleArg | boolean | ✘ | ✘ | |
cancelableArg | boolean | ✘ | ✘ | |
streamArg | MediaStream | ✔ | ✘ | |
void
This section is non-normative.
The following event fires on MediaStream
objects:
Event name | Interface | Fired when... |
---|---|---|
ended | Event | The MediaStream object will no longer stream any data, either because the user revoked the permissions, or because the source device has been ejected, or because the remote peer stopped sending data, or because the stop() method was invoked. |
The following events fire on PeerConnection
objects:
Event name | Interface | Fired when... |
---|---|---|
connecting | Event | The ICE Agent has begun negotiating with the peer. This can happen multiple times during the lifetime of the PeerConnection object. |
open | Event | The ICE Agent has finished negotiating with the peer. |
message | MessageEvent | A data UDP media stream message was received. |
addstream | MediaStreamEvent | A new stream has been added to the remoteStreams array. |
removestream | MediaStreamEvent | A stream has been removed from the remoteStreams array. |
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
This format is used for encoding UDP packets transmitted by potentially hostile Web page content via a trusted user agent to a destination selected by a potentially hostile remote server. To prevent this mechanism from being abused for cross-protocol attacks, all the data in these packets is masked so as to appear to be random noise. The intent of this masking is to reduce the potential attack scenarios to those already possible previously.
However, this feature still allows random data to be sent to destinations that might not normally have been able to receive them, such as to hosts within the victim's intranet. If a service within such an intranet cannot handle receiving UDP packets containing random noise, it might be vulnerable to attack from this feature.
Fragment identifiers cannot be used with
application/html-peer-connection-data
as URLs cannot be
used to identify streams that use this format.
The editors wish to thank the Working Group chairs, Harald Alvestrand and Stefan Håkansson, for their support.
No informative references.