The Media Object Specification

W3C Editor's Draft 10 May 2007

This version:
Latest Editor's draft:
Editor:
Doug Schepers (Vectoreal) <doug.schepers@vectoreal.com>


This specification defines the Media Object interface, which provides information about various types of media. The Media Object interface does not provide any information about the DOM properties associated with a file, but rather exposes information encoded in the native file format and provides information about the source and state of the file.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is produced by the Web API WG (part of the Rich Web Clients Activity). This document has no formal standing within the W3C. Please send comments to public-webapi@w3.org, the public email list for issues related to Web APIs.

The patent policy for this document is the 5 February 2004 W3C Patent Policy. Patent disclosures relevant to this specification may be found on the Web API Working Group's patent disclosure page. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.

Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

1. Introduction

An increasing amount of content on the Web is based on raster images, video, and audio. Concurrently, there is a strong common interest in supplying metadata for media content. Each piece of media has some kind of metadata, such as the height and width of a picture, the length of a video, or the artist and title of an audio file, but there is no standardized manner to access this information.

MediaObject is an interface that provides access to this information via script. Certain generalized information is exposed as discrete properties, and other data can be accessed as an undifferentiated string that the author can parse.

The MediaObject interface is intended for use with non-textual media, but it may be suitable for use with certain mixed-media document formats such as PDF.

1.1. Definitions

This section is normative.

extrinsic metadata
Metadata that do not affect the rendering of the media (e.g. author, date created, etc.)
intrinsic metadata
Metadata that do affect the rendering of the media (e.g. width, height, duration, etc.)

1.2. Not in This Specification

This section is informative.

This specification does not mandate or proscribe the type of metadata that may be used in a file or format. The interfaces described in this specification do not provide information about the elements associated with a media file; that information is available through interfaces described in other specifications, such as DOM. This specification does not dictate which elements implement these interfaces for any given specification or technology, nor does it place any limitations on which media types are available for any given element.

1.3. Conformance

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. For purposes of readability, these terms are not necessarily used in a case-sensitive manner in this document.

Sometimes, for readability, conformance requirements are phrased as requirements on elements, attributes, methods, interfaces, properties or functions. In all cases these are conformance requirements on implementations. A conforming implementation of this specification meets all requirements identified by the use of these terms, within the scope of its language bindings.

2. MediaObject Interface

This section is normative.

@@ISSUE: should we allow some metadata to be read-write, or only read? write permission would be useful for tagging, which is getting very popular

@@ISSUE: are there any considerations for <canvas> that should be taken into account? can this help inform that spec?

@@ISSUE: SVG should specify some way of extracting height and width on the root element for this interface... for percentages, absolute values could be derived by the referencing element's dimensions and/or the innerWidth/Height

@@ISSUE: should we allow someone to create a new instance of a MediaObject (similar to "var myImage = new Image()")? could this be useful for preloading? or should we force them to create an element and access it from there? should this be up to the host language?

The MediaObject interface is a set of properties and methods available on Element nodes which contain a reference to external or encoded media. This includes, but is not limited to, raster images and video and audio files or streams.

This interface should be implemented on all elements which provide referenced or embedded non-textual media. It may be implemented on elements which provide text-based media where appropriate (for example, PDF or SVG). The manner in which data and metadata are associated with a particular file can vary, from a wrapper that encloses the format, to native format information, to externally-defined data embedded in the file. How this information is extracted from the media file must be decided by each implementation for each MIME Type, and should be informed by mandated methods for that format.

Not all properties of this interface are applicable to all media types (e.g. audio media does not have a width or height). Additionally, not all properties that are applicable to a given media type may be available on a particular file of that type (e.g. while the 'author' property is available on most media types, the device or application that created the file may not have assigned a value to this property). If a media file does not provide information about a property, whether it is applicable to that media type or not, the User Agent must return null when that property is accessed via this interface, and must not throw any exceptions. Finally, a media file that may not be expected to have a particular property may nonetheless have a value for it (for example, while GIF files are generally considered to be a static raster format, they can have a duration, and less intuitively, an audio file could possess a width and height for unknown reasons); since the availability for a given datum on a media file may not match User Agent expectations, a User Agent must present all data available for a given media file.
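
Because unavailable properties are reported as null rather than by throwing, a script can read any property uniformly and supply its own default. The following is a minimal sketch: readProperty is a hypothetical helper name, and a plain object stands in for an element implementing MediaObject; neither is defined by this specification, and the sample values are invented.

```javascript
// Hypothetical helper: read a MediaObject property, substituting a
// fallback when the User Agent reports null (no value available).
function readProperty(media, name, fallback) {
  var value = media[name];
  return value === null || value === undefined ? fallback : value;
}

// Stand-in for an audio element implementing MediaObject (assumption:
// a real element would be obtained from the document).
var audioStandIn = { type: "audio", duration: 215, width: null, author: null };

var knownDuration = readProperty(audioStandIn, "duration", 0);
var fallbackWidth = readProperty(audioStandIn, "width", 0);
var fallbackAuthor = readProperty(audioStandIn, "author", "unknown");
```

Here knownDuration keeps the reported value, while fallbackWidth and fallbackAuthor receive the supplied defaults.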

Depending on the file format, the metadata may be available even if the file is not fully loaded (for instance, the metadata information may be in the header of the file). A User Agent must expose metadata as soon as it is available, regardless of the file's state of completeness.

Certain metadata have intrinsic values related to the format of the media, such as height or duration; these data must be read only. A User Agent must not allow an author to modify intrinsic values for the media file. Other metadata reflect extrinsic properties for the media file, such as the title, the author, or any keywords. A User Agent may allow an author to modify extrinsic (mutable) values for the media file.
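
One way to picture this distinction is a mock object whose intrinsic properties are non-writable while an extrinsic property stays writable. This is an illustrative sketch only: createMockMedia and tryWrite are hypothetical names, the values are invented, and whether a real User Agent accepts the extrinsic write is left open by this specification.

```javascript
// Hypothetical mock: intrinsic values are read-only, extrinsic ones mutable.
function createMockMedia() {
  var media = { title: "Untitled" }; // extrinsic: this mock allows changes
  Object.defineProperty(media, "width", { value: 640, writable: false, enumerable: true });
  Object.defineProperty(media, "height", { value: 480, writable: false, enumerable: true });
  return media;
}

// Attempt a write and report whether the new value was stored.
function tryWrite(media, name, value) {
  "use strict"; // strict mode makes writes to read-only properties throw
  try {
    media[name] = value;
  } catch (e) {
    return false;
  }
  return media[name] === value;
}

var mockMedia = createMockMedia();
var widthChanged = tryWrite(mockMedia, "width", 1024);     // rejected by the mock
var titleChanged = tryWrite(mockMedia, "title", "Sunset"); // accepted by the mock
```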

@@ISSUE: how discrete/specific should we get with exposing properties?

2.1. type

('audio' | 'video' | 'raster' | 'vector')
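
A script might branch on the type value to decide which other properties to read first. The sketch below assumes the four values listed above are the complete set, which this draft does not state; relevantProperties is a hypothetical name, and the groupings are only a suggestion, since a file may carry values for properties not expected for its type.

```javascript
// Hypothetical helper: suggest which properties to inspect for a type.
function relevantProperties(type) {
  switch (type) {
    case "audio":
      return ["duration", "audioBitRate", "audioSampleRate"];
    case "video":
      return ["width", "height", "duration", "videoFrameRate"];
    case "raster":
      return ["width", "height", "hRes", "vRes", "bitDepth"];
    case "vector":
      return ["width", "height"];
    default:
      return []; // null or an unrecognized type
  }
}
```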

2.2. mimeType

2.3. width

natural width in pixels

2.4. height

natural height in pixels

2.5. duration

(for video/audio)

2.6. fileSize

Size of the media in bytes

2.7. sourceUri

2.8. metadata

The metadata property must provide access to any metadata available for a media file, regardless of format. Accessing this property must return a string which contains the available metadata. Depending on this format, this metadata string may be a flat text file, XML, RDF, or even a binary encoding; how this metadata string is parsed and used is left to the author.
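
Since the metadata string may arrive in different serializations, an author would typically inspect it before parsing. The following sketch guesses the serialization from the leading characters. The category names, the "key: value" flat-text layout, and both function names are assumptions made for illustration; they are not formats or interfaces defined by this specification.

```javascript
// Guess how a metadata string is serialized (a heuristic, not a rule).
function sniffMetadataFormat(metadata) {
  if (metadata === null || metadata === undefined || metadata === "") {
    return "none";
  }
  var text = metadata.replace(/^\s+/, "");
  if (text.charAt(0) === "<") {
    return text.indexOf("rdf:RDF") !== -1 ? "rdf" : "xml";
  }
  return "text";
}

// Parse flat text made of "key: value" lines into an object (assumed layout).
function parseFlatMetadata(metadata) {
  var result = {};
  var lines = metadata.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var colon = lines[i].indexOf(":");
    if (colon > 0) {
      var key = lines[i].slice(0, colon).trim();
      result[key] = lines[i].slice(colon + 1).trim();
    }
  }
  return result;
}

var flatSample = "title: Sunset\nauthor: A. Photographer";
var parsedSample = parseFlatMetadata(flatSample);
```

Binary encodings are not covered by this heuristic and would be reported as "text".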

Metadata may be natively supported by the format, or may be carried in a separate format attached to the file. When mixed metadata is available, the sources may not agree [how to resolve?]. A User Agent may let the author of a document modify the contents of the metadata via script, unless the protected property of the file has a value of true, in which case any attempt to change the metadata content should throw a NO_MODIFICATION_ALLOWED_ERR exception.
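
A script that attempts to write metadata should be prepared for this exception. The sketch below uses a mock object (createProtectedMock, a hypothetical name) whose setter throws an error carrying code 7, the value DOM Core assigns to NO_MODIFICATION_ALLOWED_ERR. How a real User Agent surfaces the exception is an assumption here, as is the helper trySetMetadata.

```javascript
var NO_MODIFICATION_ALLOWED_ERR = 7; // DOMException code from DOM Core

// Hypothetical mock of a media element whose metadata may be protected.
function createProtectedMock(isProtected, initial) {
  var stored = initial;
  var media = { protected: isProtected };
  Object.defineProperty(media, "metadata", {
    get: function () { return stored; },
    set: function (value) {
      if (isProtected) {
        var err = new Error("NO_MODIFICATION_ALLOWED_ERR");
        err.code = NO_MODIFICATION_ALLOWED_ERR;
        throw err;
      }
      stored = value;
    }
  });
  return media;
}

// Try to replace the metadata; report whether the write was accepted.
function trySetMetadata(media, value) {
  try {
    media.metadata = value;
    return true;
  } catch (e) {
    if (e.code === NO_MODIFICATION_ALLOWED_ERR) {
      return false;
    }
    throw e; // unrelated failure: let it propagate
  }
}
```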

2.8.1. Authoring guidelines

The responsibility for verifying the metadata format and content rests with the author of the script, as does confirming the consistency of the content. As a best practice, the author must not modify metadata in a way that violates the rights of the original creator of the file; supplemental metadata is preferred to removal or alteration of original metadata. Note that not all User Agents will permit modification of extrinsic metadata, so the author should not rely on this facility.
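
In line with the preference for supplementing rather than replacing metadata, a script can append to the existing string and then confirm that the change took effect. This is a sketch under assumptions: supplementMetadata is a hypothetical helper, the newline separator suits only flat-text metadata, and a plain object stands in for a media element.

```javascript
// Append to the existing metadata instead of overwriting it.
function supplementMetadata(media, addition) {
  var original = media.metadata === null ? "" : media.metadata;
  var combined = original === "" ? addition : original + "\n" + addition;
  try {
    media.metadata = combined;
  } catch (e) {
    return false; // the User Agent refused the modification
  }
  // A User Agent may silently ignore the write, so verify the result.
  return media.metadata === combined;
}

var taggable = { metadata: "title: Sunset" };
var supplemented = supplementMetadata(taggable, "keywords: beach, evening");
```

Checking the stored value afterwards covers User Agents that do not permit modification at all.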

2.8.2. Intrinsic Metadata

The following list is being considered for individual property names. We need to find the proper tradeoff between simplicity and ease of implementation, and ease and convenience of authoring.

2.9. name

2.10. title

2.11. author

2.12. createDate

date media was created in ISO-8601 format

2.13. modifyDate

date media was last modified in ISO-8601 format
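
Because createDate and modifyDate are ISO-8601 strings, they can be handed to the ECMAScript Date constructor, provided the value is present and well formed. The sketch assumes a full date-time with a time zone designator; parseMediaDate is a hypothetical helper and the sample date is invented.

```javascript
// Convert an ISO-8601 string to a Date, or null if absent or unparseable.
function parseMediaDate(isoString) {
  if (isoString === null || isoString === undefined) {
    return null;
  }
  var date = new Date(isoString);
  return isNaN(date.getTime()) ? null : date;
}

var createdDate = parseMediaDate("2007-05-10T09:30:00Z");
var missingDate = parseMediaDate(null);
var garbledDate = parseMediaDate("not a date");
```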

2.14. vRes

vertical resolution in dpi (dots per inch)

2.15. hRes

horizontal resolution in dpi (dots per inch)
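
Taken together with width and height, the resolution values allow a script to estimate the physical size of a raster image: pixels divided by dots per inch gives inches. physicalSizeInches is a hypothetical helper, and the sample values are invented.

```javascript
// Estimate the physical size of an image from pixel size and resolution.
function physicalSizeInches(media) {
  if (!media.width || !media.height || !media.hRes || !media.vRes) {
    return null; // a required value is missing (null) or zero
  }
  return {
    width: media.width / media.hRes,
    height: media.height / media.vRes
  };
}

var printSize = physicalSizeInches({ width: 1800, height: 1200, hRes: 300, vRes: 300 });
// printSize.width is 6 and printSize.height is 4 (inches)
```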

2.16. bitDepth

2.17. channels

2.18. frameCount

2.19. audioBitRate

2.20. audioSampleRate

2.21. audioSampleSize

2.22. videoFrameRate

(frames/second) [avi]
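
Where a file reports a frame count and a frame rate, the playing time can be estimated as frames divided by frames per second. This is only a sketch: estimateSeconds is a hypothetical helper, the sample values are invented, and a constant frame rate is assumed.

```javascript
// Estimate playing time in seconds from frameCount and videoFrameRate.
function estimateSeconds(media) {
  if (!media.frameCount || !media.videoFrameRate) {
    return null; // a required value is missing (null) or zero
  }
  return media.frameCount / media.videoFrameRate;
}

var estimatedSeconds = estimateSeconds({ frameCount: 3000, videoFrameRate: 25 });
// estimatedSeconds is 120
```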

2.23. videoDataRate

videoDataRate? [wmf]

2.24. videoSampleSize

2.25. completed

@@ISSUE: supply streaming info?

2.26. compression

2.27. protected

boolean value indicating whether or not media has rights or license restrictions

2.28. rights

2.29. license

3. Examples of usage

This section is informative.

This section illustrates several ECMAScript [ECMA262] examples using MediaObject.

3.1. Example of Accessing Metadata

The following code takes an element as a parameter, and returns the metadata associated with that element:

// Return the metadata string for the element (null if none is available).
function getMetadata( el ) { return el.metadata; }


4. Relationship to other specifications

This section is informative.

4.1. DOM Level 3 Core

This is a supplementary specification to DOM3-Core. It may be implemented natively in other languages, such as HTML or SVG.

4.2. DCO

This is meant to coordinate with the Delivery Context Ontology [DCO].

5. Security Considerations

This section is informative.

There are no known security considerations involved in the implementation or use of the MediaObject interface. This section shall be revised if future security considerations are discovered.

A. IDL Definitions

IDL Definition
interface MediaObject {
   readonly attribute DOMString      type;
   readonly attribute DOMString      mimeType;
   readonly attribute unsigned long  width;
   readonly attribute unsigned long  height;
   readonly attribute unsigned long  duration;
   readonly attribute unsigned long  fileSize;
   readonly attribute DOMString      sourceUri;

            attribute DOMString      metadata;

   readonly attribute DOMString      name;
   readonly attribute DOMString      title;
   readonly attribute DOMString      author;
   readonly attribute DOMString      createDate;
   readonly attribute DOMString      modifyDate;

   readonly attribute unsigned long  vRes;
   readonly attribute unsigned long  hRes;
   readonly attribute unsigned long  bitDepth;
   readonly attribute unsigned short channels;
   readonly attribute unsigned long  frameCount;
   readonly attribute unsigned long  audioBitRate;
   readonly attribute unsigned long  audioSampleRate;
   readonly attribute unsigned long  audioSampleSize;
   readonly attribute unsigned long  videoFrameRate;
   readonly attribute unsigned long  videoDataRate;
   readonly attribute unsigned long  videoSampleSize;

   readonly attribute boolean        completed;

   readonly attribute boolean        compression;
   readonly attribute boolean        protected;
   readonly attribute DOMString      rights;
   readonly attribute DOMString      license;
};
No defined constants
Returns the type property of the object as a string, or null if no type value is available.
Returns the mimeType property of the object as a string, or null if no mimeType value is available.
Returns the width property of the object as an unsigned long, or null if no width value is available.
Returns the height property of the object as an unsigned long, or null if no height value is available.
Returns the duration property of the object as an unsigned long, or null if no duration value is available.
Returns the fileSize property of the object as an unsigned long, or null if no fileSize value is available.
Returns the sourceUri property of the object as a string, or null if no sourceUri value is available.
Returns the metadata property of the object as a string, or null if no metadata value is available.
Returns the name property of the object as a string, or null if no name value is available.
Returns the title property of the object as a string, or null if no title value is available.
Returns the author property of the object as a string, or null if no author value is available.
Returns the createDate property of the object as a string, or null if no createDate value is available.
Returns the modifyDate property of the object as a string, or null if no modifyDate value is available.
Returns the vRes property of the object as an unsigned long, or null if no vRes value is available.
Returns the hRes property of the object as an unsigned long, or null if no hRes value is available.
Returns the bitDepth property of the object as an unsigned long, or null if no bitDepth value is available.
Returns the channels property of the object as an unsigned short, or null if no channels value is available.
Returns the frameCount property of the object as an unsigned long, or null if no frameCount value is available.
Returns the audioBitRate property of the object as an unsigned long, or null if no audioBitRate value is available.
Returns the audioSampleRate property of the object as an unsigned long, or null if no audioSampleRate value is available.
Returns the audioSampleSize property of the object as an unsigned long, or null if no audioSampleSize value is available.
Returns the videoFrameRate property of the object as an unsigned long, or null if no videoFrameRate value is available.
Returns the videoDataRate property of the object as an unsigned long, or null if no videoDataRate value is available.
Returns the videoSampleSize property of the object as an unsigned long, or null if no videoSampleSize value is available.
Returns the completed property of the object as a boolean, or null if no completed value is available.
Returns the compression property of the object as a boolean, or null if no compression value is available.
Returns the protected property of the object as a boolean, or null if no protected value is available.
Returns the rights property of the object as a string, or null if no rights value is available.
Returns the license property of the object as a string, or null if no license value is available.
No defined methods

B. ECMAScript Language Binding

C. Java Language Binding



D. Change History


First draft.

Various editorial changes and corrections and modifications to the examples are made from draft to draft.

E. References

Normative references

[RFC2119] Key words for use in RFCs to indicate Requirement Levels
S. Bradner, 1997. The specification for how to use English to specify normativity, as if it were a technical language. Available at http://rfc.net/rfc2119.html

Informative references

[DIG35] Digital Imaging Group Metadata Specification
Metadata for digital images in XML format. Available at http://www.i3a.org/i_dig35.html
[Exif] Exchangeable image file format
Metadata for digital images. Available at http://it.jeita.or.jp/document/publica/standard/exif/english/jeida49e.htm and http://www.digicamsoft.com/exif22/exif22/html/exif22_1.htm (unofficial)
[ID3] "IDentify an MP3" Audio Metadata
Metadata for MP3 audio files. Available at http://www.id3.org/
[IPTC] International Press Telecommunications Council metadata format
Metadata for digital images. Available at http://www.iptc.org/IPTC4XMP/
[MPEG-7] Multimedia Content Description Interface
Metadata for video and multimedia. Available at http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=34232&ICS1=35&ICS2=40&ICS3=
[SMIL-M] SMIL Metadata Module
Generic media metadata in RDF format. Available at http://www.w3.org/TR/2000/WD-smil-boston-20000622/metadata.html
[Xiph] Xiph Comments
Metadata for Ogg Vorbis compressed audio format. Available at http://xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment
[XMP] Extensible Metadata Platform
Metadata for digital images and PDF, in XML format. Available at http://partners.adobe.com/public/developer/en/xmp/sdk/XMPspecification.pdf
[DCO] Delivery Context Ontology
Device metadata. Available at http://www.w3.org/2007/uwa/editors-drafts/DeliveryContextOntology/2007-05-31/DCOntology.html

F. Acknowledgments

The editor would like to thank the SVG WG for vital feedback on this specification.