This document describes how to use SVG [[!SVG2]] content with standard streaming technologies such as file formats, streaming protocols or related JavaScript APIs and HTML 5 [[!HTML5]] elements. It defines the missing terms that enable the mapping of SVG concepts onto streaming concepts for use in browsers or in standalone multimedia players, and provides guidelines for creating SVG content compatible with delivery using streaming technologies.

Introduction

With the specification of the [[!HTML5]] media elements (video, audio, source, track), general streaming technologies are now implemented natively within browsers. Streaming covers a wide range of technologies: from simple progressive download, where a single file is downloaded over HTTP and played back at the same time; to real-time streaming techniques using packet-based delivery, for instance with the RTP protocol ([[RFC3550]]); to more recent adaptive streaming technologies such as HTML 5 Media Source Extensions ([[MSE]]), MPEG Dynamic Adaptive Streaming over HTTP ([[MPEGDASH]]), or HTTP Live Streaming ([[HLS]]).

Animated graphics content, in particular when authored using SVG, describes changes applied over time to an initial graphics representation. Typical use cases for animated graphics on the Web are: short animations when loading a larger resource; animations within a graphical user interface; animations in diagrams as demonstrated in many d3.js examples; long-running cartoons, potentially synchronized with an audio track, for instance converted from Adobe Flash content by Swiffy, PixelPlant or MP4Box; graphical annotations to be displayed synchronously on top of a video as demonstrated by Popcorn.js.

From a user perspective, some animations, especially non-interactive, long-running, media-synchronized ones, can be viewed as a particular type of video that relies not on the traditional pixel-based video representations but on vector graphics. Hence, for such animations, especially when synchronized with audio and video streams, authors may want to deliver them in a streaming manner using the same delivery formats and protocols as for pixel-based video content. However, because SVG animations are structured using XML and do not use the same coding techniques as pixel-based video animations, some care should be taken to author "streamable" SVG content. This document provides guidelines on how to author SVG graphics animations such that they can be streamed and treated like any other stream, i.e. randomly accessed or seeked; delivered in live or on-demand scenarios; adapted to bandwidth conditions, etc. A transposition of streaming terms for SVG content is provided and the use of some streaming formats or protocols for delivering SVG is described.

Use Cases

Possible use cases which could benefit from SVG being used with streaming technologies are given in this section. This list is by no means exhaustive but highlights some of the benefits of using SVG content with streaming technologies:

TODO: add an example of traces of pen capture (inkml)

Definitions and processing of SVG streams

This section describes how streaming concepts can be applied to SVG. It also defines some terms enabling the mapping of SVG content onto streaming protocols and formats, and finally describes the processing of SVG streams by user agents.

Progressive download and rendering of graphics animations

SVG documents, like HTML pages, can be progressively loaded and rendered. A browser can start rendering an SVG document before the entire document has been loaded, before the end </svg> tag has been parsed or even before some other element end tag is parsed. A browser may decide to refresh the rendering of the SVG document upon parsing of a new chunk of data. As a consequence, an HTTP server may control the delivery of a single SVG file over HTTP and therefore may partially control the rendering of the SVG content by a browser. In these circumstances and to some extent, SVG content can be viewed as being delivered in a streaming manner, with some limitations.

Document references

For progressive rendering of SVG documents to correspond to the viewer's or author's expectations, the SVG document SHOULD be carefully processed with respect to document references. The following example illustrates a possible problem if forward-references are used during progressive download.

<svg ...>
  ...
  <use xlink:href="#MyObject" .../>
  ...
  <g id="MyObject">
    ...
  </g>
</svg>
	  

If the download of the SVG file is blocked (e.g. because it is delayed by the server) right after the use element, this use element will not render anything because the referenced graphics identified with the "MyObject" id has not yet been loaded. In an even worse case, as seen on badly authored web pages, if a script element is loaded but requires objects which are not yet loaded, the JavaScript evaluation will fail and will not be called again in the future when the required objects are loaded.

In order for progressive rendering to produce the expected rendering, authors SHOULD follow the following first guideline:

SVG documents SHOULD NOT use forward references.
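As an illustration only, the guideline above could be checked with a small tool. The following is a minimal sketch in Python, not part of this specification: the function name find_forward_references is hypothetical, and the check only looks at href, xlink:href and url(#...) attribute values. It reports every referenced id whose element has not yet started at the point of reference, so it also reports ids that never appear in the document.

```python
import io
import re
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
# Matches url(#id), url('#id') and url("#id") inside attribute values.
URL_REF = re.compile(r"url\(\s*['\"]?#([^)'\"\s]+)")


def find_forward_references(svg_text):
    """Return referenced ids that are not yet defined when they are used.

    Hypothetical helper: elements are visited in document order, which is
    the order in which a progressive parser would see them.
    """
    seen_ids = set()
    forward = []
    source = io.BytesIO(svg_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("start",)):
        elem_id = elem.get("id")
        if elem_id is not None:
            seen_ids.add(elem_id)
        targets = []
        # Local references through href / xlink:href.
        for name in ("href", XLINK_HREF):
            value = elem.get(name, "")
            if value.startswith("#"):
                targets.append(value[1:])
        # Local references through url(#...) in any attribute (fill, etc.).
        for value in elem.attrib.values():
            targets.extend(URL_REF.findall(value))
        for target in targets:
            if target not in seen_ids:
                forward.append(target)
    return forward
```

An empty result means that no reference points to an element located later in the document, under the simplifications stated above.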

External references

Add a note describing how to handle references to resources not in the SVG document.

Controlled progressive rendering

The SVG specification does not recommend any algorithm or chunk size to be used by browsers to trigger a rendering refresh. Each browser MAY have a different progressive rendering refresh rate. For instance, this refresh rate may be configurable in some browsers. So, if a server is controlling the delivery of an SVG document and is holding off some SVG data because it is not yet needed, it cannot know whether the SVG data already sent is displayed at all by the browser.

Nor does the SVG specification indicate when a refresh should not happen. For instance, an author cannot indicate that it is not meaningful to display the first child of a g element without the other ones. The externalResourcesRequired attribute, initially meant for that purpose, is not implemented in browsers and is now deprecated in SVG 2.0.

Compared with traditional progressive download of pixel-based video animations or streams, SVG rendering behaves quite differently. With traditional videos, the browser knows when it has fully received a frame and knows when it has to trigger a refresh using the presentation timestamp associated with that frame. Similar mechanisms can be used for SVG. This document describes how.

Temporal order and document order

For SVG animated graphics, a document can be viewed, without introducing any authoring restriction, as composed of elements Ei which have to be displayed at some time Ti and for a certain duration Di. In typical non-animated cases, all elements have to be displayed from the beginning to the end of the document. In animated cases, this is not true. Elements may be displayed one after another. There might also be some overlap in the presentation of elements: two elements may start at different times and end at different times, but be displayed together during some time interval. This is similar to how subtitle cues, for instance as defined by [[WEBVTT]], might overlap.

In order to keep a consistent rendering during the download (i.e. have the objects that should be displayed together really displayed together), the next authoring guideline, which is similar to the restriction applied to the layout of WebVTT cues in a WebVTT file, is:

The order of the elements in an SVG document SHOULD follow the time ordering, i.e. elements SHOULD be serialized according to their first rendering time.
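The following Python sketch illustrates this guideline under a strong assumption: the first rendering time of each element is already known and is given as a list of (element id, time in seconds) pairs in document order. How these times are extracted from the animation elements is not covered here, and the function name out_of_order is hypothetical.

```python
def out_of_order(elements):
    """Return ids of elements serialized after an element that renders later.

    elements: list of (element_id, first_render_time) in document order.
    Elements sharing the same first rendering time are accepted in any order.
    """
    violations = []
    latest = float("-inf")  # latest first rendering time seen so far
    for element_id, first_render in elements:
        if first_render < latest:
            violations.append(element_id)
        else:
            latest = first_render
    return violations
```

An empty list indicates that the document order follows the time order for the given elements.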

Non-rendered elements (e.g. defs elements) MAY be placed anywhere in the document provided that Guideline 1 is respected. Ideally, they SHOULD be placed just before the elements referencing them.

		<defs>
		  <rect id="r" .../>
		</defs>
		<use xlink:href="#r"/>
		<defs>
		  <radialGradient id="rg" .../>
		</defs>
		<circle fill="url(#rg)"/>
	  

One potential problem with this guideline arises when the rendering order implied by the document order differs from the time order. This is for instance the case if object A needs to be rendered in front of B but is first displayed before B, and therefore precedes B in document order. Authors MAY use the z-index property; a combination of use elements and animation elements to change the visual order between A and B; or scripting to change the document order between A and B at a given time.

		<defs>
		  <rect id="r" .../>
		  <circle id="c" .../>
		</defs>
		<g>
		  <set attributeName="display" to="none" begin="10"/>
		  <use xlink:href="#r"/>
		  <use xlink:href="#c"/>
		</g>
		<g display="none">
		  <set attributeName="display" to="inline" begin="10"/>
		  <use xlink:href="#c"/>
		  <use xlink:href="#r"/>
		</g>
	  
		  <rect z-index="1"...>
		    <set attributeName="z-index" to="2" begin="10"/>
		  </rect>
		  <circle z-index="2">
		    <set attributeName="z-index" to="1" begin="10"/>
		  </circle>
	  

Definitions

SVG Stream and SVG Access Unit

Following guidelines 1 and 2, the progressive rendering of an SVG document will be consistent during the download. Additionally, with the second guideline, an SVG document can be viewed as an SVG stream defined as follows.

An SVG stream is an SVG document physically divided into SVG Access Units.

An SVG Access Unit is a group of consecutive bytes representing SVG content and containing only entire XML tags, to which a timestamp can be assigned. This timestamp is greater than the timestamp of the previous Access Unit and corresponds to the first rendering time of the first element entirely contained in these bytes.

Delivering an SVG stream and rendering it as expected faces typical real-time streaming constraints: each SVG Access Unit SHALL be delivered and parsed by the browser before the rendering deadline given by its timestamp. It is up to the browser to request them on time or to the server to send them on time, depending on the streaming protocol. An SVG element MAY be parsed by the browser before its start time (Ti), if the SVG Access Unit it belongs to is delivered to the browser too early. This only has impact on buffer and memory occupancy, not directly on rendering. However, if it is delivered and parsed too late, the rendering result will not be correct.
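To make the deadline constraint concrete, the following Python sketch models Access Units and estimates which of them would arrive after their rendering deadline. It is an illustration under simplifying assumptions that are not part of this document: Access Units are delivered back to back at a constant rate starting at time zero, parsing time is ignored, and playback starts after a hypothetical start_delay. The names AccessUnit and late_access_units are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class AccessUnit:
    timestamp: float      # first rendering time, in seconds
    data: bytes           # consecutive bytes of SVG content
    is_rap: bool = False  # True if this is a Random Access Point


def late_access_units(units, bytes_per_second, start_delay=0.0):
    """Return timestamps of Access Units fully received after their deadline.

    Assumes sequential delivery at a constant rate; the deadline of an
    Access Unit is its timestamp shifted by the playback start delay.
    """
    late = []
    elapsed = 0.0  # time at which the current Access Unit is fully received
    for au in units:
        elapsed += len(au.data) / bytes_per_second
        if elapsed > au.timestamp + start_delay:
            late.append(au.timestamp)
    return late
```

Under this model, an Access Unit that arrives early only increases buffer occupancy, whereas each timestamp returned identifies an Access Unit whose rendering would be incorrect.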

The delivery of SVG streams using standard HTTP servers SHOULD respect these real-time constraints. However, when SVG streams are stored as SVG files, there is no easy way for the browser to know when to request a particular byte range of the SVG file, or for a simple HTTP server to know when to send it. An intelligent HTTP server could parse the SVG file to understand the delivery scheduling, but this is not an easy task. It is therefore desirable to inform browsers and servers about the mapping between byte ranges in the SVG file and time ranges. In other words, browsers or servers need to know how the SVG stream is made: where the byte boundaries of the Access Units are and what their timestamps are. As for video streams, they also need to know some additional information about each Access Unit: whether it can be skipped without affecting subsequent Access Units, and whether it can be processed without having processed the previous one (i.e. whether it is a Random Access Point). In traditional video streaming solutions, this signaling information is provided by delivery formats or protocols. This section describes how the same delivery formats and protocols can be used to provide signaling information and to stream SVG content.
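The mapping between time ranges and byte ranges can be pictured as a small index. The Python sketch below is illustrative only and does not correspond to any particular delivery format: it assumes the Access Units are stored contiguously in a file, and the function names build_index and byte_range_for are hypothetical.

```python
from bisect import bisect_right


def build_index(units):
    """Build a list of (timestamp, byte_offset, size) entries.

    units: list of (timestamp, payload_bytes) sorted by timestamp and
    assumed to be stored back to back in the file.
    """
    index = []
    offset = 0
    for timestamp, payload in units:
        index.append((timestamp, offset, len(payload)))
        offset += len(payload)
    return index


def byte_range_for(index, start_time, end_time):
    """Return the inclusive byte range covering [start_time, end_time].

    The range starts at the last Access Unit whose timestamp is less than
    or equal to start_time, and ends with the last Access Unit whose
    timestamp is less than or equal to end_time.
    """
    times = [timestamp for timestamp, _offset, _size in index]
    first = max(bisect_right(times, start_time) - 1, 0)
    last = max(bisect_right(times, end_time) - 1, first)
    first_byte = index[first][1]
    last_byte = index[last][1] + index[last][2] - 1
    return first_byte, last_byte
```

A client could, for instance, turn the returned pair into an HTTP byte-range request. The skippable and Random Access Point flags mentioned above would be additional fields in each index entry.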

SVG Stream Header

To enable efficient streaming, video content is typically stored in structured video file formats, such as [[WEBM]] or [[!ISOBMFF]]; or stored together with additional helper files, such as adaptive streaming manifests. Based on these files, the browser downloads first only a small amount of indexing information, typically the header of the structured file, to have sufficient knowledge to perform operations such as skipping frames, seeking, or seamless switching between streams of different qualities. In live streaming, the indexing information is refreshed regularly. Such indexing information in particular contains where the Random Access Points (RAP) are located in time and as byte offsets in the stream. To enable efficient streaming of SVG content and in particular live streaming, the same mechanisms are used. Storing SVG streams in structured files facilitates the indexing of SVG content and the identification in the structure of SVG documents of the Access Units and the Random Access Points.

A RAP may be defined as a point in a stream from which the processing can start without processing previous data but still leading to the same result as the one produced when the stream is read from the beginning, or from a previous RAP. In other words, RAPs are points where seek operations can be easily performed. Accessing a stream at a point P not identified as a RAP may require parsing the stream from the previous RAP (possibly the beginning of the SVG document) and fast forwarding to the desired point P. Additionally, RAPs should be transparent to the readers which started processing the stream from a position before the RAP. In other words, after reading a RAP, all readers should be in the same state.

In streaming formats, the notion of RAP is tightly coupled with the notion of stream header. Typically, RAP data at time Ti and at time Tj will contain different data. However, for some media formats, some static data (i.e. parser initialization data) is always required to start processing a stream. Such static data may be repeated in the stream at the beginning of each RAP. However, in some cases, because repeating this static data at every RAP position is costly or because the presence of this static data in the middle of a stream is not transparent for readers already initialized, such static data is sometimes placed into a stream header and delivered out-of-band, i.e. not in the same channel as the Access Unit data.

The following sections define the terms SVG stream header and SVG Random Access Point.

Creating a Random Access Point at an arbitrary point of an SVG stream means that the loading of the document from the beginning of the document or from the RAP SHOULD be "equivalent". Conforming SVG documents always start with an opening <svg> tag which, if repeated in the middle of an SVG document, may have a different meaning and may not be transparent to SVG parsers which have already parsed the first one. Therefore, to be able to seek in SVG streams, the definition of an SVG stream header is needed.

The SVG stream header is defined as the static data, i.e. valid for the whole stream, required to initialize an SVG parser and required for seek operations.

The SVG stream header SHALL contain at least the svg document start tag and possibly the XML declaration specifying the encoding, or the DOCTYPE. It MAY contain more. For instance, if graphical elements or styles are used throughout the stream, these elements MAY be placed into the stream header. Additionally, if scripting is used, script elements that define global functions or variables valid for the whole lifetime of the stream MAY also be placed into the stream header. Typically, elements in the SVG stream header SHOULD NOT have a rendering time different from their load time and SHOULD NOT contain animations. Indeed, SVG parsers reading SVG streams with a header will, upon seeking to a RAP, parse the header before parsing the data in the RAP. So, if animated elements are placed in the header, every seek operation will restart these animations.

Authors SHOULD place as many elements as possible in the SVG stream header provided that they do not break seek operations.

SVG Random Access Points

An SVG Random Access Point is an SVG Access Unit which contains sufficient information for a browser after reading it, and without reading previous SVG Access Units, to be in the same state as if the SVG stream had been read from the beginning or from the previous SVG Random Access Point.

Obviously, the first Access Unit in an SVG stream is an SVG Random Access Point.

In SVG, the state of a browser when reading a document includes the DOM tree, the styles and the different JavaScript contexts. These contexts can be the global context or inner closure contexts. In many cases, even if the DOM tree or some script context is different, if the rendered result on screen after reading a RAP is the same as when reading from the previous RAP, and if it remains the same from then on, the point in the stream MAY still be considered a RAP by the author. But, if the state of the browser needs to be strictly the same, some active state management is REQUIRED. In particular, some form of garbage collection MAY be performed. Elements read before a RAP MAY need to be deleted from the DOM, and JavaScript objects too. The SVG discard element MAY be used for that. Scoped style sheets MAY also be used to scope the styles to an SVG Access Unit.

Authors SHOULD be aware that if some elements needed by all Random Access Points are placed into the stream header, these elements MAY also need to be discarded before Random Access Points that do not need them to achieve strict state management.

If strict browser state needs to be maintained at Random Access Points, authors SHOULD remove unnecessary objects as soon as possible, and in particular before the next Random Access Point, using the discard element, JavaScript delete operations or any other similar mechanism.

One way to follow Guideline 4 is to use self-discarding elements as follows:

<svg ...>
  ...
  <g display="none">
    ...
    <set attributeName="display" to="inline" begin="10s" dur="5s"/>
    <discard begin="15s"/>
  </g>
</svg>
	  

In this example, the g element contains objects which will be displayed between time 10 and time 15. Assuming that after time 15 they will not be necessary anymore, the DOM tree can be reduced by removing them (and the g element) with the discard element.

SVG Stream Processing

The processing of an SVG stream relies on the standard processing of SVG documents for most situations. Only some pre-parsing operations MAY be needed when a seek operation requires loading new data.

If the seek operation does not require loading new data (i.e. when setting the currentTime to a time in the past from which no data was discarded), the standard SVG seek procedure is applied.

When seeking to a position in time for which data is missing (typically seeking to a future time or to a time in the past for which data was discarded), the processing of an SVG stream is as follows:

  1. Fetch from the delivery format or protocol the latest SVG Access Unit that is a Random Access Point and whose time is less than or equal to the seek time;
  2. Reset the SVG parser and renderer;
  3. Initialize the parser by providing it with the SVG stream header;
  4. Provide the SVG RAP data to the parser and then, as fast as possible, the data from subsequent Access Units whose times are less than or equal to the seek time, until no more Access Units match this test;
  5. Set the document time to the seek time.
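The data selection performed by these steps can be sketched as follows in Python. This is a minimal illustration, not a normative algorithm: the stream is modelled as an in-memory list of (timestamp, data, is_rap) tuples sorted by timestamp, the function name data_for_seek is hypothetical, and resetting the parser and setting the document time (steps 2 and 5) are left to the caller.

```python
def data_for_seek(header, units, seek_time):
    """Return the chunks to feed, in order, to a freshly reset SVG parser.

    header: the SVG stream header.
    units: list of (timestamp, data, is_rap) sorted by timestamp.
    """
    # Step 1: find the latest RAP whose time is <= the seek time.
    rap_index = None
    for i, (timestamp, _data, is_rap) in enumerate(units):
        if timestamp > seek_time:
            break
        if is_rap:
            rap_index = i
    if rap_index is None:
        raise ValueError("no Random Access Point at or before the seek time")

    # Step 3: the header comes first.
    chunks = [header]
    # Step 4: the RAP, then every following Access Unit up to the seek time.
    for timestamp, data, _is_rap in units[rap_index:]:
        if timestamp > seek_time:
            break
        chunks.append(data)
    return chunks
```

In this model, looping simply corresponds to calling data_for_seek with a seek time of zero.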

Looping the playback of an SVG stream MAY be considered as an operation of seeking to the beginning of the document (i.e. the seek time is zero).

Authors need to provide an explicit duration for the SVG stream as part of the delivery format or protocol to enable looping. If this information is not provided, the SVG stream, and in particular its last Access Unit, MAY be considered to have infinite duration.

This part will need to be updated to align with Web Animations, regarding the duration of the SVG stream and regarding when the timeline starts and the potential use of the timelineBegin attribute.

SVG Streams in HTML 5

Whether they are delivered as documents or streams, SVG animations MAY be controlled as typical video streams: played, paused, seeked, looped. Such controls MAY be applied using the [[!HTML5]] media elements. This section provides some recommendations on how to use SVG content with the [[!HTML5]] media elements. In such cases and when needed, in particular for long-running animations, servers MAY store or deliver SVG documents as SVG streams using the technologies described in the following sections, in order to facilitate operations such as seeking.

SVG in HTML video elements

Animated SVG content can be considered as being "ostensibly video data" and given the above definitions and the above processing description, an SVG stream can be processed by a browser similarly to a video stream and therefore MAY be referenced by the HTML 5 video or source elements.

<video ...>
  <source src="file.svg" type="image/svg+xml" .../>
  <source src="file.mp4" type="video/mp4" .../>
</video>
	  

The same restrictions are applied when referencing an SVG stream from a video or source element as for referencing an SVG document from an HTML img element, i.e. scripts and external references are not processed.

Should reference the SVG integration spec for precise restrictions. Cross-origin restrictions?
Different MIME types will probably be used when SVG streams are delivered as plain SVG or SVG embedded in some multimedia format (WebM, MP4). For instance: "video/mp4; codecs='svg '" ? To be clarified.

SVG in HTML 5 track elements

If an SVG stream is delivered in its own streaming channel in a streaming session comprising video or audio streams, if the SVG stream is stored as a track in a file that also contains video, audio or subtitle tracks, or if an HTML 5 track element is used to reference SVG content, the SVG stream SHALL be exposed as an HTML 5 TextTrack as follows:

	<video ...>
	  <source src="file.mp4" type="video/mp4" .../>
	  <source src="file.webm" type="video/webm" .../>
	  <track src="file.svg" kind="captions" .../>
	</video>
		  
Needs update based on the latest changes in the TextTrack interface (mode ...).

Rendering SVG tracks

When rendering SVG content (stored or delivered as an SVG stream or as a document) using the HTML track element, the viewport used to render the SVG content SHALL be the viewport used to render the video.

The dimensions of the video stream itself are not used, enabling multiple video streams with multiple resolutions to be used with a single resolution-independent SVG stream.

Multiple SVG streams MAY be specified as tracks. Selection of an SVG track by a web application can be done based on the label or any other mechanisms [possibly responsive images techniques or [[MPEGDASH]] manifest information or equivalent].

Storage and delivery formats and protocols for SVG streams

This section defines the mapping of SVG streams onto existing streaming delivery formats and protocols.

Storage in multimedia file formats

ISO Base Media File Format

The normative statements in this section apply only if the [[!ISOBMFF]] format is supported.

Introduction

ISO/IEC 14496-12 [[!ISOBMFF]] provides means to store timed text streams, such as [[WEBVTT]] or TTML ([[TTAF1-DFXP]]) streams. With the above definitions of SVG streams, SVG stream header, SVG Access Units and SVG Random Access Points, the ISO/IEC 14496-12 standard can be used to store SVG streams. This section specifies the details of this storage.

Track format

The sample data SHALL consist of an SVG Access Unit.

If the SVG content renders only text, the SimpleTextSampleEntry SHALL be used. The SVG stream header SHALL be stored in the configuration box.

If the SVG content renders text and graphics, the sample entry SHALL be set as follows:

  • If all SVG Access Units in the SVG stream are RAPs, the XMLSubtitleSampleEntry SHALL be used. The SVG stream header SHALL be empty and each Access Unit SHALL contain an entire SVG Document.
  • Otherwise, the TextSubtitleSampleEntry SHALL be used. In this case, Access Units MAY contain only SVG document fragments, processable by a progressive parser.

Timing

Times provided in the SVG Access Units SHOULD be compatible with times provided in the track sample, as mapped by an edit list if any; otherwise Access Unit data MAY be delivered to the SVG parser too early or too late. Sample composition time SHALL NOT be used for these tracks.

Layout

As indicated in [[ISOBMFF]], if the SVG track is rendered outside of a web context (for instance when rendered by a standalone viewer), the layout information provided in the track SHALL be used to set the SVG viewport.

WebM

TBD

SVG in WebVTT files

Need to be clarified (using or not the 'metadata' type, need for the escape mechanism, use of RAP only)
What about SVG streams in XHR2 Stream mode ?

Adaptive SVG streaming

Adaptive streaming technologies provide the ability for streaming clients to select between alternative streams based on bandwidth availability, screen size, language, etc., and in some circumstances to seamlessly switch between those streams using the notion of segments.

The same concepts can be applied to SVG streams with the following definitions. Following the definition of the Initialization Segment in [[MSE]], an SVG Initialization Segment is defined as being the SVG stream header. An SVG media segment is defined as being a sequence of SVG Access Units starting with a RAP.
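As an illustration of these definitions, the Python sketch below groups a list of Access Units into media segments, each starting with a RAP. It is a simplified model and not a normative procedure: Access Units are represented as (timestamp, data, is_rap) tuples, the function name split_into_media_segments is hypothetical, and every RAP is assumed to start a new segment, whereas an actual packager may choose to start segments only at some RAPs.

```python
def split_into_media_segments(units):
    """Group Access Units into segments that each start with a RAP.

    units: list of (timestamp, data, is_rap) sorted by timestamp.
    The SVG stream header (the Initialization Segment) is kept separately
    and is not part of the returned segments.
    """
    segments = []
    for unit in units:
        _timestamp, _data, is_rap = unit
        if is_rap:
            segments.append([])  # every RAP opens a new media segment
        elif not segments:
            raise ValueError("the first Access Unit must be a RAP")
        segments[-1].append(unit)
    return segments
```

A client switching between alternative SVG streams would then provide the Initialization Segment of the new stream followed by one of its media segments.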

Delivery using RTP

All other delivery methods assume reliable delivery. Should we specify anything for RTP (over unreliable UDP)? What about SVG as a WebRTC channel for sharing a vector graphics whiteboard?

Acknowledgements