This specification is available in the following formats: single page HTML, multipage HTML, full specification. This is revision 1.5592.
Copyright © 2011 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
The bulk of the text of this specification is also available in the WHATWG Web Applications 1.0 specification, under a license that permits reuse of the specification text.
This document is a strict subset of the full HTML5 specification that omits user-agent (UA) implementation details. It is targeted toward Web authors and others who are not UA implementors and who want a view of the HTML specification that focuses more precisely on details relevant to using the HTML language to create Web documents and Web applications. Because this document does not provide implementation conformance criteria, UA implementors should not rely on it, but should instead refer to the full HTML5 specification.
This document is an automated redaction of the full HTML5 specification. As such, the two documents are supposed to agree on normative matters concerning Web authors. However, if the documents disagree, this is a bug in the redaction process and the unredacted full HTML specification takes precedence. Readers are encouraged to report such discrepancies as bugs in the bug tracking system of the HTML Working Group.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the most recently formally published revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
If you wish to make comments regarding this document in a manner that is tracked by the W3C, please submit them via using our public bug database. If you do not have an account then you can enter feedback using this form:
If you cannot do this then you can also e-mail feedback to public-html-comments@w3.org (subscribe, archives), and arrangements will be made to transpose the comments to our public bug database. Alternatively, you can e-mail feedback to whatwg@whatwg.org (subscribe, archives). The editor guarantees that all substantive feedback sent to this list will receive a reply. However, such feedback is not considered formal feedback for the W3C process. All feedback is welcome.
The working groups maintains a list of all bug reports that the editor has not yet tried to address and a list of issues for which the chairs have not yet declared a decision. The editor also maintains a list of all e-mails that he has not yet tried to address. These bugs, issues, and e-mails apply to multiple HTML-related specifications, not just this one.
Implementors should be aware that this specification is not stable. Implementors who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways. Vendors interested in implementing this specification before it eventually reaches the Candidate Recommendation stage should join the aforementioned mailing lists and take part in the discussions.
The publication of this document by the W3C as a W3C Working Draft does not imply that all of the participants in the W3C HTML working group endorse the contents of the specification. Indeed, for any section of the specification, one can usually find many members of the working group or of the W3C as a whole who object strongly to the current text, the existence of the section at all, or the idea that the working group should even spend time discussing the concept of that section.
The latest stable version of the editor's draft of this specification is always available on the W3C CVS server and in the WHATWG Subversion repository. The latest editor's working copy (which may contain unfinished text in the process of being prepared) contains the latest draft text of this specification (amongst others). For more details, please see the WHATWG FAQ.
There are various ways to follow the change history for the HTML specifications:
svn checkout http://svn.whatwg.org/webapps/The W3C HTML Working Group is the W3C working group responsible for this specification's progress along the W3C Recommendation track. This specification is the 16 February 2012 Editor's Draft.
Work on this specification is also done at the WHATWG. The W3C HTML working group actively pursues convergence with the WHATWG, as required by the W3C HTML working group charter.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
id attributetitle attributelang and xml:lang attributestranslate attributexml:base
attribute (XML only)dir attributeclass attributestyle attributedata-* attributesa elementem elementstrong elementsmall elements elementcite elementq elementdfn elementabbr elementtime elementcode elementvar elementsamp elementkbd elementsub and sup elementsi elementb elementu elementmark elementruby elementrt elementrp elementbdi elementbdo elementspan elementbr elementwbr elementimg element
iframe elementembed elementobject elementparam elementvideo elementaudio elementsource elementtrack elementcanvas elementmap elementarea elementform elementfieldset elementlegend elementlabel elementinput element
type attribute
type=text) state and Search state (type=search)type=tel)type=url)type=email)type=password)type=datetime)type=date)type=month)type=week)type=time)type=datetime-local)type=number)type=range)type=color)type=checkbox)type=radio)type=file)type=submit)type=image)type=reset)type=button)input element attributes
autocomplete attributedirname attributelist attributereadonly attributesize attributerequired attributemultiple attributemaxlength attributepattern attributemin and max attributesstep attributeplaceholder attributeinput element APIsbutton elementselect elementdatalist elementoptgroup elementoption elementtextarea elementkeygen elementoutput elementprogress elementmeter elementa and area elementsalternate"author"bookmark"help"icon"license"nofollow"noreferrer"prefetch"search"stylesheet"tag"The World Wide Web's markup language has always been HTML. HTML was primarily designed as a language for semantically describing scientific documents, although its general design and adaptations over the years have enabled it to be used to describe a number of other types of documents.
The main area that has not been adequately addressed by HTML is a vague subject referred to as Web Applications. This specification attempts to rectify this, while at the same time updating the HTML specifications to address issues raised in the past few years.
This specification is intended for authors of documents and scripts that use the features defined in this specification.
This document is probably not suited to readers who do not already have at least a passing familiarity with Web technologies, as in places it sacrifices clarity for precision, and brevity for completeness. More approachable tutorials and authoring guides can provide a gentler introduction to the topic.
In particular, familiarity with the basics of DOM Core and DOM Events is necessary for a complete understanding of some of the more technical parts of this specification. An understanding of Web IDL, HTTP, XML, Unicode, character encodings, JavaScript, and CSS will also be helpful in places but is not essential.
This specification is limited to providing a semantic-level markup language and associated semantic-level scripting APIs for authoring accessible pages on the Web ranging from static documents to dynamic applications.
The scope of this specification does not include providing mechanisms for media-specific customization of presentation (although default rendering rules for Web browsers are included at the end of this specification, and several mechanisms for hooking into CSS are provided as part of the language).
The scope of this specification is not to describe an entire operating system. In particular, hardware configuration software, image manipulation tools, and applications that users would be expected to use with high-end workstations on a daily basis are out of scope. In terms of applications, this specification is targeted specifically at applications that would be expected to be used by users on an occasional basis, or regularly but from disparate locations, with low CPU requirements. For instance online purchasing systems, searching systems, games (especially multiplayer online games), public telephone books or address books, communications software (e-mail clients, instant messaging clients, discussion software), document editing software, etc.
For its first five years (1990-1995), HTML went through a number of revisions and experienced a number of extensions, primarily hosted first at CERN, and then at the IETF.
With the creation of the W3C, HTML's development changed venue again. A first abortive attempt at extending HTML in 1995 known as HTML 3.0 then made way to a more pragmatic approach known as HTML 3.2, which was completed in 1997. HTML4 quickly followed later that same year.
The following year, the W3C membership decided to stop evolving HTML and instead begin work on an XML-based equivalent, called XHTML. This effort started with a reformulation of HTML4 in XML, known as XHTML 1.0, which added no new features except the new serialization, and which was completed in 2000. After XHTML 1.0, the W3C's focus turned to making it easier for other working groups to extend XHTML, under the banner of XHTML Modularization. In parallel with this, the W3C also worked on a new language that was not compatible with the earlier HTML and XHTML languages, calling it XHTML2.
Around the time that HTML's evolution was stopped in 1998, parts of the API for HTML developed by browser vendors were specified and published under the name DOM Level 1 (in 1998) and DOM Level 2 Core and DOM Level 2 HTML (starting in 2000 and culminating in 2003). These efforts then petered out, with some DOM Level 3 specifications published in 2004 but the working group being closed before all the Level 3 drafts were completed.
In 2003, the publication of XForms, a technology which was positioned as the next generation of Web forms, sparked a renewed interest in evolving HTML itself, rather than finding replacements for it. This interest was borne from the realization that XML's deployment as a Web technology was limited to entirely new technologies (like RSS and later Atom), rather than as a replacement for existing deployed technologies (like HTML).
A proof of concept to show that it was possible to extend HTML4's forms to provide many of the features that XForms 1.0 introduced, without requiring browsers to implement rendering engines that were incompatible with existing HTML Web pages, was the first result of this renewed interest. At this early stage, while the draft was already publicly available, and input was already being solicited from all sources, the specification was only under Opera Software's copyright.
The idea that HTML's evolution should be reopened was tested at a W3C workshop in 2004, where some of the principles that underlie the HTML5 work (described below), as well as the aforementioned early draft proposal covering just forms-related features, were presented to the W3C jointly by Mozilla and Opera. The proposal was rejected on the grounds that the proposal conflicted with the previously chosen direction for the Web's evolution; the W3C staff and membership voted to continue developing XML-based replacements instead.
Shortly thereafter, Apple, Mozilla, and Opera jointly announced their intent to continue working on the effort under the umbrella of a new venue called the WHATWG. A public mailing list was created, and the draft was moved to the WHATWG site. The copyright was subsequently amended to be jointly owned by all three vendors, and to allow reuse of the specification.
The WHATWG was based on several core principles, in particular that technologies need to be backwards compatible, that specifications and implementations need to match even if this means changing the specification rather than the implementations, and that specifications need to be detailed enough that implementations can achieve complete interoperability without reverse-engineering each other.
The latter requirement in particular required that the scope of the HTML5 specification include what had previously been specified in three separate documents: HTML4, XHTML1, and DOM2 HTML. It also meant including significantly more detail than had previously been considered the norm.
In 2006, the W3C indicated an interest to participate in the development of HTML5 after all, and in 2007 formed a working group chartered to work with the WHATWG on the development of the HTML5 specification. Apple, Mozilla, and Opera allowed the W3C to publish the specification under the W3C copyright, while keeping a version with the less restrictive license on the WHATWG site.
Since then, both groups have been working together.
The HTML specification published by the WHATWG is not identical to this specification. The main differences are that the WHATWG version includes features not included in this W3C version: some features have been omitted as they are considered part of future revisions of HTML, not HTML5; and other features are omitted because at the W3C they are published as separate specifications. There are also some minor differences. For an exact list of differences, please see the WHATWG specification.
A separate document has been published by the W3C HTML working group to document the differences between the HTML specified in this document and the language described in the HTML4 specification. [HTMLDIFF]
It must be admitted that many aspects of HTML appear at first glance to be nonsensical and inconsistent.
HTML, its supporting DOM APIs, as well as many of its supporting technologies, have been developed over a period of several decades by a wide array of people with different priorities who, in many cases, did not know of each other's existence.
Features have thus arisen from many sources, and have not always been designed in especially consistent ways. Furthermore, because of the unique characteristics of the Web, implementation bugs have often become de-facto, and now de-jure, standards, as content is often unintentionally written in ways that rely on them before they can be fixed.
Despite all this, efforts have been made to adhere to certain design goals. These are described in the next few subsections.
To avoid exposing Web authors to the complexities of multithreading, the HTML and DOM APIs are designed such that no script can ever detect the simultaneous execution of other scripts. Even with workers, the intent is that the behavior of implementations can be thought of as completely serializing the execution of all scripts in all browsing contexts.
The navigator.yieldForStorageUpdates()
method, in this model, is equivalent to allowing other scripts to
run while the calling script is blocked.
This specification interacts with and relies on a wide variety of other specifications. In certain circumstances, unfortunately, conflicting needs have led to this specification violating the requirements of these other specifications. Whenever this has occurred, the transgressions have each been noted as a "willful violation", and the reason for the violation has been noted.
This specification defines an abstract language for describing documents and applications, and some APIs for interacting with in-memory representations of resources that use this language.
The in-memory representation is known as "DOM HTML", or "the DOM" for short.
There are various concrete syntaxes that can be used to transmit resources that use this abstract language, two of which are defined in this specification.
The first such concrete syntax is the HTML syntax. This is the
format suggested for most authors. It is compatible with most legacy
Web browsers. If a document is transmitted with the
text/html MIME type, then it will be
processed as an HTML document by Web browsers.
This specification defines version 5 of the HTML syntax, known as
"HTML5".
The second concrete syntax is the XHTML syntax, which is an
application of XML. When a document is transmitted with an XML
MIME type, such as application/xhtml+xml, then
it is treated as an XML document by Web browsers, to be parsed by an
XML processor. Authors are reminded that the processing for XML and
HTML differs; in particular, even minor syntax errors will prevent a
document labeled as XML from being rendered fully, whereas they
would be ignored in the HTML syntax.
This specification defines version 5 of the XHTML syntax, known as
"XHTML5".
The DOM, the HTML syntax, and the XHTML syntax cannot all
represent the same content. For example, namespaces cannot be
represented using the HTML syntax, but they are supported in the DOM
and in the XHTML syntax. Similarly, documents that use the
noscript feature can be represented using the HTML
syntax, but cannot be represented with the DOM or in the XHTML
syntax. Comments that contain the string "-->" can only be represented in the DOM, not in
the HTML and XHTML syntaxes.
This specification is divided into the following major sections:
There are also some appendices, defining rendering rules for Web browsers and listing obsolete features and IANA considerations.
This specification should be read like all other specifications. First, it should be read cover-to-cover, multiple times. Then, it should be read backwards at least once. Then it should be read by picking random sections from the contents list and following all the cross-references.
As described in the conformance requirements section below, this specification describes conformance criteria for a variety of conformance classes. In particular, there are conformance requirements that apply to producers, for example authors and the documents they create, and there are conformance requirements that apply to consumers, for example Web browsers. They can be distinguished by what they are requiring: a requirement on a producer states what is allowed, while a requirement on a consumer states how software is to act.
For example, "the foo attribute's value
must be a valid integer" is a requirement on
producers, as it lays out the allowed values; in contrast, the
requirement "the foo attribute's value must
be parsed using the rules for parsing integers" is a
requirement on consumers, as it describes how to process the
content.
Requirements on producers have no bearing whatsoever on consumers.
Continuing the above example, a requirement stating that a particular attribute's value is constrained to being a valid integer emphatically does not imply anything about the requirements on consumers. It might be that the consumers are in fact required to treat the attribute as an opaque string, completely unaffected by whether the value conforms to the requirements or not. It might be (as in the previous example) that the consumers are required to parse the value using specific rules that define how invalid (non-numeric in this case) values are to be processed.
This is a definition, requirement, or explanation.
This is a note.
This is an example.
This is an open issue.
This is a warning.
interface Example {
// this is an IDL definition
};method( [ optionalArgument ] )This is a note to authors describing the usage of an interface.
/* this is a CSS fragment */
The defining instance of a term is marked up like this. Uses of that term are marked up like this or like this.
The defining instance of an element, attribute, or API is marked
up like this. References to
that element, attribute, or API are marked up like this.
Other code fragments are marked up like
this.
Variables are marked up like this.
A basic HTML document looks like this:
<!DOCTYPE html> <html> <head> <title>Sample page</title> </head> <body> <h1>Sample page</h1> <p>This is a <a href="demo.html">simple</a> sample.</p> <!-- this is a comment --> </body> </html>
HTML documents consist of a tree of elements and text. Each
element is denoted in the source by a start tag, such as "<body>", and an end
tag, such as "</body>". (Certain
start tags and end tags can in certain cases be omitted and are implied by other
tags.)
Tags have to be nested such that elements are all completely within each other, without overlapping:
<p>This is <em>very <strong>wrong</em>!</strong></p>
<p>This <em>is <strong>correct</strong>.</em></p>
This specification defines a set of elements that can be used in HTML, along with rules about the ways in which the elements can be nested.
Elements can have attributes, which control how the elements
work. In the example below, there is a hyperlink,
formed using the a element and its href attribute:
<a href="demo.html">simple</a>
Attributes are placed
inside the start tag, and consist of a name and a value, separated by an "=" character. The attribute value can remain unquoted if it doesn't contain space characters or any of " ' `
= < or >. Otherwise, it has to be quoted using either
single or double quotes. The value, along with the "=" character, can be omitted altogether if the value
is the empty string.
<!-- empty attributes --> <input name=address disabled> <input name=address disabled=""> <!-- attributes with a value --> <input name=address maxlength=200> <input name=address maxlength='200'> <input name=address maxlength="200">
HTML user agents (e.g. Web browsers) then parse this markup, turning it into a DOM (Document Object Model) tree. A DOM tree is an in-memory representation of a document.
DOM trees contain several kinds of nodes, in particular a
DocumentType node, Element nodes,
Text nodes, Comment nodes, and in some
cases ProcessingInstruction nodes.
The markup snippet at the top of this section would be turned into the following DOM tree:
htmlhtmlThe root element of this tree is the
html element, which is the element always found at the
root of HTML documents. It contains two elements, head
and body, as well as a Text node between
them.
There are many more Text nodes in the DOM tree than
one would initially expect, because the source contains a number of
spaces (represented here by "␣") and line breaks ("⏎")
that all end up as Text nodes in the DOM. However, for
historical reasons not all of the spaces and line breaks in the
original markup appear in the DOM. In particular, all the whitespace
before head start tag ends up being dropped silently,
and all the whitespace after the body end tag ends up
placed at the end of the body.
The head element contains a title
element, which itself contains a Text node with the
text "Sample page". Similarly, the body element
contains an h1 element, a p element, and a
comment.
This DOM tree can be manipulated from scripts in the
page. Scripts (typically in JavaScript) are small programs that can
be embedded using the script element or using
event handler content attributes. For example, here is
a form with a script that sets the value of the form's
output element to say "Hello World":
<form name="main"> Result: <output name="result"></output> <script> document.forms.main.elements.result.value = 'Hello World'; </script> </form>
Each element in the DOM tree is represented by an object, and
these objects have APIs so that they can be manipulated. For
instance, a link (e.g. the a element in the tree above)
can have its "href"
attribute changed in several ways:
var a = document.links[0]; // obtain the first link in the document
a.href = 'sample.html'; // change the destination URL of the link
a.protocol = 'https'; // change just the scheme part of the URL
a.setAttribute('href', 'http://example.com/'); // change the content attribute directlySince DOM trees are used as the way to represent HTML documents when they are processed and presented by implementations (especially interactive implementations like Web browsers), this specification is mostly phrased in terms of DOM trees, instead of the markup described above.
HTML documents represent a media-independent description of interactive content. HTML documents might be rendered to a screen, or through a speech synthesizer, or on a braille display. To influence exactly how such rendering takes place, authors can use a styling language such as CSS.
In the following example, the page has been made yellow-on-blue using CSS.
<!DOCTYPE html>
<html>
<head>
<title>Sample styled page</title>
<style>
body { background: navy; color: yellow; }
</style>
</head>
<body>
<h1>Sample styled page</h1>
<p>This page is just a demo.</p>
</body>
</html>For more details on how to use HTML, authors are encouraged to consult tutorials and guides. Some of the examples included in this specification might also be of use, but the novice author is cautioned that this specification, by necessity, defines the language with a level of detail that might be difficult to understand at first.
When HTML is used to create interactive sites, care needs to be taken to avoid introducing vulnerabilities through which attackers can compromise the integrity of the site itself or of the site's users.
A comprehensive study of this matter is beyond the scope of this document, and authors are strongly encouraged to study the matter in more detail. However, this section attempts to provide a quick introduction to some common pitfalls in HTML application development.
The security model of the Web is based on the concept of "origins", and correspondingly many of the potential attacks on the Web involve cross-origin actions. [ORIGIN]
When accepting untrusted input, e.g. user-generated content such as text comments, values in URL parameters, messages from third-party sites, etc, it is imperative that the data be validated before use, and properly escaped when displayed. Failing to do this can allow a hostile user to perform a variety of attacks, ranging from the potentially benign, such as providing bogus user information like a negative age, to the serious, such as running scripts every time a user looks at a page that includes the information, potentially propagating the attack in the process, to the catastrophic, such as deleting all data in the server.
When writing filters to validate user input, it is imperative that filters always be whitelist-based, allowing known-safe constructs and disallowing all other input. Blacklist-based filters that disallow known-bad inputs and allow everything else are not secure, as not everything that is bad is yet known (for example, because it might be invented in the future).
For example, suppose a page looked at its URL's query string to determine what to display, and the site then redirected the user to that page to display a message, as in:
<ul> <li><a href="message.cgi?say=Hello">Say Hello</a> <li><a href="message.cgi?say=Welcome">Say Welcome</a> <li><a href="message.cgi?say=Kittens">Say Kittens</a> </ul>
If the message was just displayed to the user without escaping, a hostile attacker could then craft a URL that contained a script element:
http://example.com/message.cgi?say=%3Cscript%3Ealert%28%27Oh%20no%21%27%29%3C/script%3E
If the attacker then convinced a victim user to visit this page, a script of the attacker's choosing would run on the page. Such a script could do any number of hostile actions, limited only by what the site offers: if the site is an e-commerce shop, for instance, such a script could cause the user to unknowingly make arbitrarily many unwanted purchases.
This is called a cross-site scripting attack.
There are many constructs that can be used to try to trick a site into executing code. Here are some that authors are encouraged to consider when writing whitelist filters:
img, it is important to whitelist any provided
attributes as well. If one allowed all attributes then an
attacker could, for instance, use the onload attribute to run arbitrary
script.javascript:", but user agents
can implement (and indeed, have historically implemented)
others.base element to be inserted means any
script elements in the page with relative links can
be hijacked, and similarly that any form submissions can get
redirected to a hostile site.If a site allows a user to make form submissions with user-specific side-effects, for example posting messages on a forum under the user's name, making purchases, or applying for a passport, it is important to verify that the request was made by the user intentionally, rather than by another site tricking the user into making the request unknowingly.
This problem exists because HTML forms can be submitted to other origins.
Sites can prevent such attacks by populating forms with
user-specific hidden tokens, or by checking Origin headers on all requests.
A page that provides users with an interface to perform actions that the user might not wish to perform needs to be designed so as to avoid the possibility that users can be tricked into activating the interface.
One way that a user could be so tricked is if a hostile site
places the victim site in a small iframe and then
convinces the user to click, for instance by having the user play
a reaction game. Once the user is playing the game, the hostile
site can quickly position the iframe under the mouse cursor just
as the user is about to click, thus tricking the user into
clicking the victim site's interface.
To avoid this, sites that do not expect to be used in frames
are encouraged to only enable their interface if they detect that
they are not in a frame (e.g. by comparing the window object to the value of the top attribute).
Scripts in HTML have "run-to-completion" semantics, meaning that the browser will generally run the script uninterrupted before doing anything else, such as firing further events or continuing to parse the document.
On the other hand, parsing of HTML files happens asynchronously and incrementally, meaning that the parser can pause at any point to let scripts run. This is generally a good thing, but it does mean that authors need to be careful to avoid hooking event handlers after the events could have possibly fired.
There are two techniques for doing this reliably: use event handler content attributes, or create the element and add the event handlers in the same script. The latter is safe because, as mentioned earlier, scripts are run to completion before further events can fire.
One way this could manifest itself is with img
elements and the load event. The
event could fire as soon as the element has been parsed, especially
if the image has already been cached (which is common).
Here, the author uses the onload handler on an img
element to catch the load event:
<img src="games.png" alt="Games" onload="gamesLogoHasLoaded(event)">
If the element is being added by script, then so long as the event handlers are added in the same script, the event will still not be missed:
<script>
var img = new Image();
img.src = 'games.png';
img.alt = 'Games';
img.onload = gamesLogoHasLoaded;
// img.addEventListener('load', gamesLogoHasLoaded, false); // would work also
</script>
However, if the author first created the img
element and then in a separate script added the event listeners,
there's a chance that the load
event would be fired in between, leading it to be missed:
<!-- Do not use this style, it has a race condition! -->
<img id="games" src="games.png" alt="Games">
<!-- the 'load' event might fire here while the parser is taking a
break, in which case you will not see it! -->
<script>
var img = document.getElementById('games');
img.onload = gamesLogoHasLoaded; // might never fire!
</script>
Unlike previous versions of the HTML specification, this specification defines in some detail the required processing for invalid documents as well as valid documents.
However, even though the processing of invalid content is in most cases well-defined, conformance requirements for documents are still important: in practice, interoperability (the situation in which all implementations process particular content in a reliable and identical or equivalent way) is not the only goal of document conformance requirements. This section details some of the more common reasons for still distinguishing between a conforming document and one with errors.
The majority of presentational features from previous versions of HTML are no longer allowed. Presentational markup in general has been found to have a number of problems:
While it is possible to use presentational markup in a way that provides users of assistive technologies (ATs) with an acceptable experience (e.g. using ARIA), doing so is significantly more difficult than doing so when using semantically-appropriate markup. Furthermore, even using such techniques doesn't help make pages accessible for non-AT non-graphical users, such as users of text-mode browsers.
Using media-independent markup, on the other hand, provides an easy way for documents to be authored in such a way that they work for more users (e.g. text browsers).
It is significantly easier to maintain a site written in such a
way that the markup is style-independent. For example, changing
the color of a site that uses <font color="">
throughout requires changes across the entire site, whereas a
similar change to a site based on CSS can be done by changing a
single file.
Presentational markup tends to be much more redundant, and thus results in larger document sizes.
For those reasons, presentational markup has been removed from HTML in this version. This change should not come as a surprise; HTML4 deprecated presentational markup many years ago and provided a mode (HTML4 Transitional) to help authors move away from presentational markup; later, XHTML 1.1 went further and obsoleted those features altogether.
The only remaining presentational markup features in HTML are the
style attribute and the
style element. Use of the style attribute is somewhat discouraged in
production environments, but it can be useful for rapid prototyping
(where its rules can be directly moved into a separate style sheet
later) and for providing specific styles in unusual cases where a
separate style sheet would be inconvenient. Similarly, the
style element can be useful in syndication or for
page-specific styles, but in general an external style sheet is
likely to be more convenient when the styles apply to multiple
pages.
It is also worth noting that some elements that were previously
presentational have been redefined in this specification to be
media-independent: b, i, hr,
s, small, and u.
The syntax of HTML is constrained to avoid a wide variety of problems.
Certain invalid syntax constructs, when parsed, result in DOM trees that are highly unintuitive.
To allow user agents to be used in controlled environments without having to implement the more bizarre and convoluted error handling rules, user agents are permitted to fail whenever encountering a parse error.
Some error-handling behavior, such as the behavior for the
<table><hr>... example mentioned
above, are incompatible with streaming user agents (user agents
that process HTML files in one pass, without storing state). To
avoid interoperability problems with such user agents, any syntax
resulting in such behavior is considered invalid.
When a user agent based on XML is connected to an HTML parser, it is possible that certain invariants that XML enforces, such as comments never containing two consecutive hyphens, will be violated by an HTML file. Handling this can require that the parser coerce the HTML DOM into an XML-compatible infoset. Most syntax constructs that require such handling are considered invalid.
Certain syntax constructs can result in disproportionally poor performance. To discourage the use of such constructs, they are typically made non-conforming.
For example, the following markup results in poor performance,
since all the unclosed i elements have to be
reconstructed in each paragraph, resulting in progressively more
elements in each paragraph:
<p><i>He dreamt. <p><i>He dreamt that he ate breakfast. <p><i>Then lunch. <p><i>And finally dinner.
The resulting DOM for this fragment would be:
There are syntax constructs that, for historical reasons, are relatively fragile. To help reduce the number of users who accidentally run into such problems, they are made non-conforming.
For example, the parsing of certain named character references in attributes happens even with the closing semicolon being omitted. It is safe to include an ampersand followed by letters that do not form a named character reference, but if the letters are changed to a string that does form a named character reference, they will be interpreted as that character instead.
In this fragment, the attribute's value is "?bill&ted":
<a href="?bill&ted">Bill and Ted</a>
In the following fragment, however, the attribute's value is
actually "?art©", not the
intended "?art©", because even
without the final semicolon, "©" is
handled the same as "©" and thus
gets interpreted as "©":
<a href="?art©">Art and Copy</a>
To avoid this problem, all named character references are required to end with a semicolon, and uses of named character references without a semicolon are flagged as errors.
Thus, the correct way to express the above cases is as follows:
<a href="?bill&ted">Bill and Ted</a> <!-- &ted is ok, since it's not a named character reference -->
<a href="?art&copy">Art and Copy</a> <!-- the & has to be escaped, since © is a named character reference -->
Certain syntax constructs are known to cause especially subtle or serious problems in legacy user agents, and are therefore marked as non-conforming to help authors avoid them.
For example, this is why the U+0060 GRAVE ACCENT character (`) is not allowed in unquoted attributes. In certain legacy user agents, it is sometimes treated as a quote character.
Another example of this is the DOCTYPE, which is required to trigger no-quirks mode, because the behavior of legacy user agents in quirks mode is often largely undocumented.
Certain restrictions exist purely to avoid known security problems.
For example, the restriction on using UTF-7 exists purely to avoid authors falling prey to a known cross-site-scripting attack using UTF-7.
Markup where the author's intent is very unclear is often made non-conforming. Correcting these errors early makes later maintenance easier.
When a user makes a simple typo, it is helpful if the error can be caught early, as this can save the author a lot of debugging time. This specification therefore usually considers it an error to use element names, attribute names, and so forth, that do not match the names defined in this specification.
For example, if the author typed <capton>
instead of <caption>, this would be flagged as an
error and the author could correct the typo immediately.
In order to allow the language syntax to be extended in the future, certain otherwise harmless features are disallowed.
For example, "attributes" in end tags are ignored currently, but they are invalid, in case a future change to the language makes use of that syntax feature without conflicting with already-deployed (and valid!) content.
Some authors find it helpful to be in the practice of always quoting all attributes and always including all optional tags, preferring the consistency derived from such custom over the minor benefits of terseness afforded by making use of the flexibility of the HTML syntax. To aid such authors, conformance checkers can provide modes of operation wherein such conventions are enforced.
Beyond the syntax of the language, this specification also places restrictions on how elements and attributes can be specified. These restrictions are present for similar reasons:
To avoid misuse of elements with defined meanings, content models are defined that restrict how elements can be nested when such nestings would be of dubious value.
For example, this specification disallows
nesting a section element inside a kbd
element, since it is highly unlikely for an author to indicate
that an entire section should be keyed in.
Similarly, to draw the author's attention to mistakes in the use of elements, clear contradictions in the semantics expressed are also considered conformance errors.
In the fragments below, for example, the semantics are nonsensical: a separator cannot simultaneously be a cell, nor can a radio button be a progress bar.
<hr role="cell">
<input type=radio role=progressbar>
Another example is the restrictions on the
content models of the ul element, which only allows
li element children. Lists by definition consist just
of zero or more list items, so if a ul element
contains something other than an li element, it's not
clear what was meant.
Certain elements have default styles or behaviors that make certain combinations likely to lead to confusion. Where these have equivalent alternatives without this problem, the confusing combinations are disallowed.
For example, div elements are
rendered as block boxes, and span elements as inline
boxes. Putting a block box in an inline box is unnecessarily
confusing; since either nesting just div elements, or
nesting just span elements, or nesting
span elements inside div elements all
serve the same purpose as nesting a div element in a
span element, but only the latter involves a block
box in an inline box, the latter combination is disallowed.
Another example would be the way
interactive content cannot be nested. For example, a
button element cannot contain a textarea
element. This is because the default behavior of such nesting
interactive elements would be highly confusing to users. Instead
of nesting these elements, they can be placed side by side.
Sometimes, something is disallowed because allowing it would likely cause author confusion.
For example, setting the disabled attribute to the value
"false" is disallowed, because despite the
appearance of meaning that the element is enabled, it in fact
means that the element is disabled (what matters for
implementations is the presence of the attribute, not its
value).
Some conformance errors simplify the language that authors need to learn.
For example, the area element's
shape attribute, despite
accepting both circ and circle values in
practice as synonyms, disallows the use of the circ value, so as to
simplify tutorials and other learning aids. There would be no
benefit to allowing both, but it would cause extra confusion when
teaching the language.
Certain elements are parsed in somewhat eccentric ways (typically for historical reasons), and their content model restrictions are intended to avoid exposing the author to these issues.
For example, a form element isn't allowed inside
phrasing content, because when parsed as HTML, a
form element's start tag will imply a p
element's end tag. Thus, the following markup results in two
paragraphs, not one:
<p>Welcome. <form><label>Name:</label> <input></form>
It is parsed exactly like the following:
<p>Welcome. </p><form><label>Name:</label> <input></form>
Some errors are intended to help prevent script problems that would be hard to debug.
This is why, for instance, it is non-conforming
to have two id attributes with the
same value. Duplicate IDs lead to the wrong element being
selected, with sometimes disastrous effects whose cause is hard to
determine.
Some constructs are disallowed because historically they have been the cause of a lot of wasted authoring time, and by encouraging authors to avoid making them, authors can save time in future efforts.
For example, a script element's
src attribute causes the
element's contents to be ignored. However, this isn't obvious,
especially if the element's contents appear to be executable
script — which can lead to authors spending a lot of time
trying to debug the inline script without realizing that it is not
executing. To reduce this problem, this specification makes it
non-conforming to have executable script in a script
element when the src
attribute is present. This means that authors who are validating
their documents are less likely to waste time with this kind of
mistake.
Some authors like to write files that can be interpreted as both XML and HTML with similar results. Though this practice is discouraged in general due to the myriad of subtle complications involved (especially when involving scripting, styling, or any kind of automated serialization), this specification has a few restrictions intended to at least somewhat mitigate the difficulties. This makes it easier for authors to use this as a transitionary step when migrating between HTML and XHTML.
For example, there are somewhat complicated
rules surrounding the lang and
xml:lang attributes intended
to keep the two synchronized.
Another example would be the restrictions on
the values of xmlns attributes in the HTML
serialization, which are intended to ensure that elements in
conforming documents end up in the same namespaces whether
processed as HTML or XML.
As with the restrictions on the syntax intended to allow for new syntax in future revisions of the language, some restrictions on the content models of elements and values of attributes are intended to allow for future expansion of the HTML vocabulary.
For example, limiting the values of the target attribute that start
with an U+005F LOW LINE character (_) to only specific predefined
values allows new predefined values to be introduced at a future
time without conflicting with author-defined values.
Certain restrictions are intended to support the restrictions made by other specifications.
For example, requiring that attributes that take media queries use only valid media queries reinforces the importance of following the conformance rules of that specification.
The following documents might be of interest to readers of this specification.
This Architectural Specification provides authors of specifications, software developers, and content developers with a common reference for interoperable text manipulation on the World Wide Web, building on the Universal Character Set, defined jointly by the Unicode Standard and ISO/IEC 10646. Topics addressed include use of the terms 'character', 'encoding' and 'string', a reference processing model, choice and identification of character encodings, character escaping, and string indexing.
Because Unicode contains such a large number of characters and incorporates the varied writing systems of the world, incorrect usage can expose programs or systems to possible security attacks. This is especially important as more and more products are internationalized. This document describes some of the security considerations that programmers, system analysts, standards developers, and users should take into account, and provides specific recommendations to reduce the risk of problems.
Web Content Accessibility Guidelines (WCAG) 2.0 covers a wide range of recommendations for making Web content more accessible. Following these guidelines will make content accessible to a wider range of people with disabilities, including blindness and low vision, deafness and hearing loss, learning disabilities, cognitive limitations, limited movement, speech disabilities, photosensitivity and combinations of these. Following these guidelines will also often make your Web content more usable to users in general.
A document that uses polyglot markup is a document that is a stream of bytes that parses into identical document trees (with the exception of the xmlns attribute on the root element) when processed as HTML and when processed as XML. Polyglot markup that meets a well defined set of constraints is interpreted as compatible, regardless of whether they are processed as HTML or as XHTML, per the HTML5 specification. Polyglot markup uses a specific DOCTYPE, namespace declarations, and a specific case — normally lower case but occasionally camel case — for element and attribute names. Polyglot markup uses lower case for certain attribute values. Further constraints include those on empty elements, named entity references, and the use of scripts and style.
This is draft documentation mapping HTML elements and attributes to accessibility API Roles, States and Properties on a variety of platforms. It provides recommendations on deriving the accessible names and descriptions for HTML elements. It also provides accessible feature implementation examples.
This specification refers to both HTML and XML attributes and IDL attributes, often in the same context. When it is not clear which is being referred to, they are referred to as content attributes for HTML and XML attributes, and IDL attributes for those defined on IDL interfaces. Similarly, the term "properties" is used for both JavaScript object properties and CSS properties. When these are ambiguous they are qualified as object properties and CSS properties respectively.
Generally, when the specification states that a feature applies to the HTML syntax or the XHTML syntax, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XHTML)".
This specification uses the term document to
refer to any use of HTML, ranging from short static documents to
long essays or reports with rich multimedia, as well as to
fully-fledged interactive applications. The term is used to refer
both to Document objects and their descendant DOM
trees, and to serialized byte streams using the HTML syntax or XHTML syntax, depending on context.
In the context of the DOM structures, the terms HTML document and XML
document are used as defined in the DOM Core specification,
and refer specifically to two different modes that
Document objects can find themselves in. [DOMCORE] (Such uses are always hyperlinked
to their definition.)
In the context of byte streams, the term HTML document refers to
resources labeled as text/html, and the term XML
document refers to resources labeled with an XML MIME
type.
The term XHTML document is used to refer to both
Documents in the XML
document mode that contains element nodes in the HTML
namespace, and byte streams labeled with an XML MIME
type that contain elements from the HTML
namespace, depending on context.
For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.
The term "transparent black" refers to the color with red, green, blue, and alpha channels all set to zero.
The specification uses the term supported when referring to whether a user agent has an implementation capable of decoding the semantics of an external resource. A format or type is said to be supported if the implementation can process an external resource of that format or type without critical aspects of the resource being ignored. Whether a specific resource is supported can depend on what features of the resource's format are in use.
For example, a PNG image would be considered to be in a supported format if its pixel data could be decoded and rendered, even if, unbeknownst to the implementation, the image also contained animation data.
A MPEG4 video file would not be considered to be in a supported format if the compression format used was not supported, even if the implementation could determine the dimensions of the movie from the file's metadata.
What some specifications, in particular the HTTP and URI specifications, refer to as a representation is referred to in this specification as a resource. [HTTP] [RFC3986]
The term MIME type is used to refer to what is sometimes called an Internet media type in protocol literature. The term media type in this specification is used to refer to the type of media intended for presentation, as used by the CSS specifications. [RFC2046] [MQ]
A string is a valid MIME type if it matches the media-type rule defined in section 3.7 "Media Types"
of RFC 2616. In particular, a valid MIME type may
include MIME type parameters. [HTTP]
A string is a valid MIME type with no parameters if it
matches the media-type rule defined in section
3.7 "Media Types" of RFC 2616, but does not contain any U+003B
SEMICOLON characters (;). In other words, if it consists only of a
type and subtype, with no MIME Type parameters. [HTTP]
The term HTML MIME type is used to refer to the
MIME type text/html.
A resource's critical subresources are those that the
resource needs to have available to be correctly processed. Which
resources are considered critical or not is defined by the
specification that defines the resource's format. For CSS resources,
only @import rules introduce critical
subresources; other resources, e.g. fonts or backgrounds, are
not.
The term data:
URL refers to URLs that use the data: scheme. [RFC2397]
To ease migration from HTML to XHTML, UAs
conforming to this specification will place elements in HTML in the
http://www.w3.org/1999/xhtml namespace, at least for
the purposes of the DOM and CSS. The term "HTML
elements", when used in this specification, refers to any
element in that namespace, and thus refers to both HTML and XHTML
elements.
Except where otherwise stated, all elements defined or mentioned
in this specification are in the HTML namespace
("http://www.w3.org/1999/xhtml"), and all attributes
defined or mentioned in this specification have no namespace.
The term element type is used to refer to the class of
elements have a given local name and namespace. For example,
button elements are elements with the element type
button, meaning they have the local name "button" and (implicitly as defined above) the
HTML namespace.
Attribute names are said to be XML-compatible if they
match the Name production defined in XML, they contain no
U+003A COLON characters (:), and their first three characters are
not an ASCII case-insensitive match for the string
"xml". [XML]
The term XML MIME type is used to refer to the MIME types text/xml,
application/xml, and any MIME
type whose subtype ends with the four characters "+xml". [RFC3023]
The root element of a Document object is
that Document's first element child, if any. If it does
not have one then the Document has no root element.
The term root element, when not referring to a
Document object's root element, means the furthest
ancestor element node of whatever node is being discussed, or the
node itself if it has no ancestors. When the node is a part of the
document, then the node's root element is indeed the
document's root element; however, if the node is not currently part
of the document tree, the root element will be an orphaned node.
When an element's root element is the root
element of a Document object, it is said to be
in a Document. An element is said to have
been inserted into a
document when its root element changes and is now
the document's root element. Analogously, an element is
said to have been removed from a document when its root
element changes from being the document's root
element to being another element.
A node's home subtree is the subtree rooted at that
node's root element. When a node is in a
Document, its home subtree is that
Document's tree.
The Document of a Node (such as an
element) is the Document that the Node's
ownerDocument IDL
attribute returns. When a Node is in a
Document then that Document is
always the Node's Document, and the
Node's ownerDocument IDL attribute
thus always returns that Document.
The Document of a content attribute is the
Document of the attribute's element.
The term tree order means a pre-order, depth-first
traversal of DOM nodes involved (through the parentNode/childNodes relationship).
When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM.
A content attribute is said to change value only if its new value is different than its previous value; setting an attribute to a value it already has does not change it.
The term empty, when used of an attribute
value, Text node, or string, means that the length of
the text is zero (i.e. not even containing spaces or control
characters).
The construction "a Foo object", where
Foo is actually an interface, is sometimes used instead
of the more accurate "an object implementing the interface
Foo".
An IDL attribute is said to be getting when its value is being retrieved (e.g. by author script), and is said to be setting when a new value is assigned to it.
If a DOM object is said to be live, then the attributes and methods on that object operate on the actual underlying data, not a snapshot of the data.
In the contexts of events, the terms fire and
dispatch are used as
defined in the DOM Core specification: firing an event means to
create and dispatch it, and dispatching an event means to follow the
steps that propagate the event through the tree. The term trusted event is used to refer
to events whose isTrusted
attribute is initialized to true. [DOMCORE]
The term plugin refers to a user-agent defined set of
content handlers used by the user agent that can take part in the
user agent's rendering of a Document object, but that
neither act as child browsing
contexts of the Document nor introduce any
Node objects to the Document's DOM.
Typically such content handlers are provided by third parties, though a user agent can also designate built-in content handlers as plugins.
One example of a plugin would be a PDF viewer that is instantiated in a browsing context when the user navigates to a PDF file. This would count as a plugin regardless of whether the party that implemented the PDF viewer component was the same as that which implemented the user agent itself. However, a PDF viewer application that launches separate from the user agent (as opposed to using the same interface) is not a plugin by this definition.
This specification does not define a mechanism for interacting with plugins, as it is expected to be user-agent- and platform-specific. Some UAs might opt to support a plugin mechanism such as the Netscape Plugin API; others might use remote content converters or have built-in support for certain types. Indeed, this specification doesn't require user agents to support plugins at all. [NPAPI]
A plugin can be secured
if it honors the semantics of the sandbox attribute.
For example, a secured plugin would prevent its
contents from creating pop-up windows when the plugin is
instantiated inside a sandboxed iframe.
The preferred MIME name of a character encoding is the name or alias labeled as "preferred MIME name" in the IANA Character Sets registry, if there is one, or the encoding's name, if none of the aliases are so labeled. [IANACHARSET]
An ASCII-compatible character encoding is a single-byte or variable-length encoding in which the bytes 0x09, 0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27, 0x2C - 0x3F, 0x41 - 0x5A, and 0x61 - 0x7A, ignoring bytes that are the second and later bytes of multibyte sequences, all correspond to single-byte sequences that map to the same Unicode characters as those bytes in ANSI_X3.4-1968 (US-ASCII). [RFC1345]
This includes such encodings as Shift_JIS, HZ-GB-2312, and variants of ISO-2022, even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences that are unrelated to their interpretation as ASCII. It excludes such encodings as UTF-7, UTF-16, GSM03.38, and EBCDIC variants.
The term a UTF-16 encoding refers to any variant of UTF-16: self-describing UTF-16 with a BOM, ambiguous UTF-16 without a BOM, raw UTF-16LE, and raw UTF-16BE. [RFC2781]
The term code unit is used as defined in the Web IDL
specification: a 16 bit unsigned integer, the smallest atomic
component of a DOMString. (This is a narrower
definition than the one used in Unicode.) [WEBIDL]
The term Unicode code point means a Unicode scalar value where possible, and an isolated surrogate code point when not. When a conformance requirement is defined in terms of characters or Unicode code points, a pair of code units consisting of a high surrogate followed by a low surrogate must be treated as the single code point represented by the surrogate pair, but isolated surrogates must each be treated as the single code point with the value of the surrogate. [UNICODE]
In this specification, the term character, when not qualified as Unicode character, is synonymous with the term Unicode code point.
The term Unicode character is used to mean a Unicode scalar value (i.e. any Unicode code point that is not a surrogate code point). [UNICODE]
The code-point length of a string is the number of code units in that string.
This complexity results from the historical decision to define the DOM API in terms of 16 bit (UTF-16) code units, rather than in terms of Unicode characters.
All diagrams, examples, and notes in this specification are non-normative, as are all sections explicitly marked non-normative. Everything else in this specification is normative.
The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative parts of this document are to be interpreted as described in RFC2119. The key word "OPTIONALLY" in the normative parts of this document is to be interpreted with the same normative meaning as "MAY" and "OPTIONAL". For readability, these words do not appear in all uppercase letters in this specification. [RFC2119]
HTML has a wide number of extensibility mechanisms that can be used for adding semantics in a safe manner:
class
attribute to extend elements, effectively creating their own
elements, while using the most applicable existing "real" HTML
element, so that browsers and other tools that don't know of the
extension can still support it somewhat well. This is the tack used
by microformats, for example.data-*="" attributes. These are
guaranteed to never be touched by browsers, and allow scripts to
include data on HTML elements that scripts can then look for and
process.<meta name=""
content=""> mechanism to include page-wide metadata by
registering extensions to the
predefined set of metadata names.rel="" mechanism to annotate
links with specific meanings by registering extensions to the predefined set of
link types. This is also used by microformats.<script type=""> mechanism with a custom
type, for further handling by inline or server-side scripts.embed element. This is how Flash
works.When adding new reflecting IDL
attributes corresponding to content attributes of the form "x-vendor-feature", the IDL attribute should be named
"vendorFeature" (i.e. the "x"
is dropped from the IDL attribute's name).
Comparing two strings in a case-sensitive manner means comparing them exactly, code point for code point.
Comparing two strings in an ASCII case-insensitive manner means comparing them exactly, code point for code point, except that the characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) are considered to also match.
Comparing two strings in a compatibility caseless manner means using the Unicode compatibility caseless match operation to compare the two strings. [UNICODE]
Except where otherwise stated, string comparisons must be performed in a case-sensitive manner.
A string pattern is a prefix match for a string s when pattern is not longer than s and truncating s to pattern's length leaves the two strings as matches of each other.
There are various places in HTML that accept particular data types, such as dates or numbers. This section describes what the conformance criteria for content in those formats is, and how to parse them.
The space characters, for the purposes of this specification, are U+0020 SPACE, U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), and U+000D CARRIAGE RETURN (CR).
The White_Space characters are
those that have the Unicode property "White_Space" in the Unicode
PropList.txt data file. [UNICODE]
This should not be confused with the "White_Space"
value (abbreviated "WS") of the "Bidi_Class" property in the Unicode.txt data file.
A number of attributes are boolean attributes. The presence of a boolean attribute on an element represents the true value, and the absence of the attribute represents the false value.
If the attribute is present, its value must either be the empty string or a value that is an ASCII case-insensitive match for the attribute's canonical name, with no leading or trailing whitespace.
The values "true" and "false" are not allowed on boolean attributes. To represent a false value, the attribute has to be omitted altogether.
Here is an example of a checkbox that is checked and disabled.
The checked and disabled attributes are the
boolean attributes.
<label><input type=checkbox checked name=cheese disabled> Cheese</label>
This could be equivalently written as this:
<label><input type=checkbox checked=checked name=cheese disabled=disabled> Cheese</label>
You can also mix styles; the following is still equivalent:
<label><input type='checkbox' checked name=cheese disabled=""> Cheese</label>
Some attributes are defined as taking one of a finite set of keywords. Such attributes are called enumerated attributes. The keywords are each defined to map to a particular state (several keywords might map to the same state, in which case some of the keywords are synonyms of each other; additionally, some of the keywords can be said to be non-conforming, and are only in the specification for historical reasons). In addition, two default states can be given. The first is the invalid value default, the second is the missing value default.
If an enumerated attribute is specified, the attribute's value must be an ASCII case-insensitive match for one of the given keywords that are not said to be non-conforming, with no leading or trailing whitespace.
When the attribute is specified, if its value is an ASCII case-insensitive match for one of the given keywords then that keyword's state is the state that the attribute represents. If the attribute value matches none of the given keywords, but the attribute has an invalid value default, then the attribute represents that state. Otherwise, if the attribute value matches none of the keywords but there is a missing value default state defined, then that is the state represented by the attribute. Otherwise, there is no default, and invalid values must be ignored.
When the attribute is not specified, if there is a missing value default state defined, then that is the state represented by the (missing) attribute. Otherwise, the absence of the attribute means that there is no state represented.
The empty string can be a valid keyword.
A string is a valid integer if it consists of one or more characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), optionally prefixed with a U+002D HYPHEN-MINUS character (-).
A valid integer without a U+002D HYPHEN-MINUS (-) prefix represents the number that is represented in base ten by that string of digits. A valid integer with a U+002D HYPHEN-MINUS (-) prefix represents the number represented in base ten by the string of digits that follows the U+002D HYPHEN-MINUS, subtracted from zero.
A string is a valid non-negative integer if it consists of one or more characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9).
A valid non-negative integer represents the number that is represented in base ten by that string of digits.
A string is a valid floating point number if it consists of:
A valid floating point number represents the number obtained by multiplying the significand by ten raised to the power of the exponent, where the significand is the first number, interpreted as base ten (including the decimal point and the number after the decimal point, if any, and interpreting the significand as a negative number if the whole string starts with a U+002D HYPHEN-MINUS character (-) and the number is not zero), and where the exponent is the number after the E, if any (interpreted as a negative number if there is a U+002D HYPHEN-MINUS character (-) between the E and the number and the number is not zero, or else ignoring a U+002B PLUS SIGN character (+) between the E and the number if there is one). If there is no E, then the exponent is treated as zero.
The Infinity and Not-a-Number (NaN) values are not valid floating point numbers.
A valid list of integers is a number of valid integers separated by U+002C COMMA characters, with no other characters (e.g. no space characters). In addition, there might be restrictions on the number of integers that can be given, or on the range of values allowed.
In the algorithms below, the number of days in month month of year year is: 31 if month is 1, 3, 5, 7, 8, 10, or 12; 30 if month is 4, 6, 9, or 11; 29 if month is 2 and year is a number divisible by 400, or if year is a number divisible by 4 but not by 100; and 28 otherwise. This takes into account leap years in the Gregorian calendar. [GREGORIAN]
The digits in the date and time syntaxes defined in this section must be characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), used to express numbers in base ten.
A month consists of a specific proleptic Gregorian date with no time-zone information and no date information beyond a year and a month. [GREGORIAN]
A string is a valid month string representing a year year and month month if it consists of the following components in the given order:
A date consists of a specific proleptic Gregorian date with no time-zone information, consisting of a year, a month, and a day. [GREGORIAN]
A string is a valid date string representing a year year, month month, and day day if it consists of the following components in the given order:
A yearless date consists of a month and a day, but with no associated year.
A string is a valid yearless date string representing a month month and a day day if it consists of the following components in the given order:
In other words, if the month is
"02", meaning February, then the day can be
29, as if the year was a leap year.
A time consists of a specific time with no time-zone information, consisting of an hour, a minute, a second, and a fraction of a second.
A string is a valid time string representing an hour hour, a minute minute, and a second second if it consists of the following components in the given order:
The second component cannot be 60 or 61; leap seconds cannot be represented.
A local date and time consists of a specific proleptic Gregorian date, consisting of a year, a month, and a day, and a time, consisting of an hour, a minute, a second, and a fraction of a second, but expressed without a time zone. [GREGORIAN]
A string is a valid local date and time string representing a date and time if it consists of the following components in the given order:
A string is a valid normalized local date and time string representing a date and time if it consists of the following components in the given order:
A time-zone offset consists of a signed number of hours and minutes.
A string is a valid time-zone offset string representing a time-zone offset if it consists of either:
A U+005A LATIN CAPITAL LETTER Z character (Z), allowed only if the time zone is UTC
Or, the following components, in the given order:
This format allows for time-zone offsets from -23:59 to +23:59. In practice, however, the range of offsets of actual time zones is -12:00 to +14:00, and the minutes component of offsets of actual time zones is always either 00, 30, or 45.
See also the usage notes and examples in the global date and time section below for details on using time-zone offsets with historical times that predate the formation of formal time zones.
A global date and time consists of a specific proleptic Gregorian date, consisting of a year, a month, and a day, and a time, consisting of an hour, a minute, a second, and a fraction of a second, expressed with a time-zone offset, consisting of a signed number of hours and minutes. [GREGORIAN]
A string is a valid global date and time string representing a date, time, and a time-zone offset if it consists of the following components in the given order:
Times in dates before the formation of UTC in the mid twentieth century must be expressed and interpreted in terms of UT1 (contemporary Earth solar time at the 0° longitude), not UTC (the approximation of UT1 that ticks in SI seconds). Time before the formation of time zones must be expressed and interpeted as UT1 times with explicit time zones that approximate the contemporary difference between the appropriate local time and the time observed at the location of Greenwich, London.
The following are some examples of dates written as valid global date and time strings.
0037-12-13 00:00Z"1979-10-14T12:00:00.001-04:00"8592-01-01T02:09+02:09"Several things are notable about these dates:
T" is replaced by a space, it
must be a single space character. The string "2001-12-21 12:00Z" (with two spaces
between the components) would not be parsed successfully.A string is a valid normalized forced-UTC global date and time string representing a date, time, and a time-zone offset if it consists of the following components in the given order:
A week consists of a week-year number and a week number representing a seven-day period starting on a Monday. Each week-year in this calendaring system has either 52 or 53 such seven-day periods, as defined below. The seven-day period starting on the Gregorian date Monday December 29th 1969 (1969-12-29) is defined as week number 1 in week-year 1970. Consecutive weeks are numbered sequentially. The week before the number 1 week in a week-year is the last week in the previous week-year, and vice versa. [GREGORIAN]
A week-year with a number year has 53 weeks if it corresponds to either a year year in the proleptic Gregorian calendar that has a Thursday as its first day (January 1st), or a year year in the proleptic Gregorian calendar that has a Wednesday as its first day (January 1st) and where year is a number divisible by 400, or a number divisible by 4 but not by 100. All other week-years have 52 weeks.
The week number of the last day of a week-year with 53 weeks is 53; the week number of the last day of a week-year with 52 weeks is 52.
The week-year number of a particular day can be different than the number of the year that contains that day in the proleptic Gregorian calendar. The first week in a week-year y is the week that contains the first Thursday of the Gregorian year y.
A string is a valid week string representing a week-year year and week week if it consists of the following components in the given order:
A duration consists of a number of seconds.
Since months and seconds are not comparable (a month is not a precise number of seconds, but is instead a period whose exact length depends on the precise day from which it is measured) a duration as defined in this specification cannot include months (or years, which are equivalent to twelve months). Only durations that describe a specific number of seconds can be described.
A string is a valid duration string representing a duration t if it consists of either of the following:
A literal U+0050 LATIN CAPITAL LETTER P character followed by one or more of the following subcomponents, in the order given, where the number of days, hours, minutes, and seconds corresponds to the same number of seconds as in t:
One or more digits followed by a U+0044 LATIN CAPITAL LETTER D character, representing a number of days.
A U+0054 LATIN CAPITAL LETTER T character followed by one or more of the following subcomponents, in the order given:
This, as with a number of other date- and time-related microsyntaxes defined in this specification, is based on one of the formats defined in ISO 8601. [ISO8601]
One or more duration time components, each with a different duration time component scale, in any order; the sum of the represented seconds being equal to the number of seconds in t.
A duration time component is a string consisting of the following components:
Zero or more space characters.
One or more digits, representing a number of time units, scaled by the duration time component scale specified (see below) to represent a number of seconds.
If the duration time component scale specified is 1 (i.e. the units are seconds), then, optionally, a U+002E FULL STOP character (.) followed by one, two, or three digits, representing a fraction of a second.
Zero or more space characters.
One of the following characters, representing the duration time component scale of the time unit used in the numeric part of the duration time component:
Zero or more space characters.
This is not based on any of the formats in ISO 8601. It is intended to be a more human-readable alternative to the ISO 8601 duration format.
A string is a valid date string with optional time if it is also one of the following:
A simple color consists of three 8-bit numbers in the range 0..255, representing the red, green, and blue components of the color respectively, in the sRGB color space. [SRGB]
A string is a valid simple color if it is exactly seven characters long, and the first character is a U+0023 NUMBER SIGN character (#), and the remaining six characters are all in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), U+0041 LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F, U+0061 LATIN SMALL LETTER A to U+0066 LATIN SMALL LETTER F, with the first two digits representing the red component, the middle two digits representing the green component, and the last two digits representing the blue component, in hexadecimal.
A string is a valid lowercase simple color if it is a valid simple color and doesn't use any characters in the range U+0041 LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F.
A set of space-separated tokens is a string containing zero or more words (known as tokens) separated by one or more space characters, where words consist of any string of one or more characters, none of which are space characters.
A string containing a set of space-separated tokens may have leading or trailing space characters.
An unordered set of unique space-separated tokens is a set of space-separated tokens where none of the tokens are duplicated.
An ordered set of unique space-separated tokens is a set of space-separated tokens where none of the tokens are duplicated but where the order of the tokens is meaningful.
Sets of space-separated tokens sometimes have a defined set of allowed values. When a set of allowed values is defined, the tokens must all be from that list of allowed values; other values are non-conforming. If no such set of allowed values is provided, then all values are conforming.
How tokens in a set of space-separated tokens are to be compared (e.g. case-sensitively or not) is defined on a per-set basis.
A set of comma-separated tokens is a string containing zero or more tokens each separated from the next by a single U+002C COMMA character (,), where tokens consist of any string of zero or more characters, neither beginning nor ending with space characters, nor containing any U+002C COMMA characters (,), and optionally surrounded by space characters.
For instance, the string " a ,b,,d d " consists of four
tokens: "a", "b", the empty string, and "d d". Leading and
trailing whitespace around each token doesn't count as part of the
token, and the empty string can be a token.
Sets of comma-separated tokens sometimes have further restrictions on what consists a valid token. When such restrictions are defined, the tokens must all fit within those restrictions; other values are non-conforming. If no such restrictions are specified, then all values are conforming.
A valid hash-name reference to an element of type type is a string consisting of a U+0023 NUMBER SIGN
character (#) followed by a string which exactly matches the value
of the name attribute of an element with type
type in the document.
A string is a valid media query if it matches the
media_query_list production of the Media
Queries specification. [MQ]
A string matches the environment of the user if it is the empty string, a string consisting of only space characters, or is a media query that matches the user's environment according to the definitions given in the Media Queries specification. [MQ]
This specification defines the term URL, and defines various algorithms for dealing with URLs, because for historical reasons the rules defined by the URI and IRI specifications are not a complete description of what HTML user agents need to implement to be compatible with Web content.
The term "URL" in this specification is used in a manner distinct from the precise technical meaning it is given in RFC 3986. Readers familiar with that RFC will find it easier to read this specification if they pretend the term "URL" as used herein is really called something else altogether. This is a willful violation of RFC 3986. [RFC3986]
A URL is a string used to identify a resource.
A URL is a valid URL if at least one of the following conditions holds:
The URL is a valid IRI reference and it has no query component. [RFC3987]
The URL is a valid IRI reference and its query component contains no unescaped non-ASCII characters. [RFC3987]
The URL is a valid IRI reference and the character encoding of
the URL's Document is UTF-8 or a UTF-16
encoding. [RFC3987]
A string is a valid non-empty URL if it is a valid URL but it is not the empty string.
A string is a valid URL potentially surrounded by spaces if, after stripping leading and trailing whitespace from it, it is a valid URL.
A string is a valid non-empty URL potentially surrounded by spaces if, after stripping leading and trailing whitespace from it, it is a valid non-empty URL.
This specification defines the URL
about:legacy-compat as a reserved, though
unresolvable, about: URI, for use in DOCTYPEs in HTML
documents when needed for compatibility with XML tools. [ABOUT]
This specification defines the URL
about:srcdoc as a reserved, though
unresolvable, about: URI, that is used as
the document's address of iframe srcdoc documents. [ABOUT]
Resolving a URL is the process of taking a relative URL and obtaining the absolute URL that it implies.
A URL is an absolute URL if resolving it results in the same output regardless of what it is resolved relative to, and that output is not a failure.
An absolute URL is a hierarchical URL if, when resolved and then parsed, there is a character immediately after the <scheme> component and it is a U+002F SOLIDUS character (/).
An absolute URL is an authority-based URL if, when resolved and then parsed, there are two characters immediately after the <scheme> component and they are both U+002F SOLIDUS characters (//).
An interface that has a complement of URL decomposition IDL attributes has seven attributes with the following definitions:
attribute DOMString protocol;
attribute DOMString host;
attribute DOMString hostname;
attribute DOMString port;
attribute DOMString pathname;
attribute DOMString search;
attribute DOMString hash;protocol [ = value ]Returns the current scheme of the underlying URL.
Can be set, to change the underlying URL's scheme.
host [ = value ]Returns the current host and port (if it's not the default port) in the underlying URL.
Can be set, to change the underlying URL's host and port.
The host and the port are separated by a colon. The port part, if omitted, will be assumed to be the current scheme's default port.
hostname [ = value ]Returns the current host in the underlying URL.
Can be set, to change the underlying URL's host.
port [ = value ]Returns the current port in the underlying URL.
Can be set, to change the underlying URL's port.
pathname [ = value ]Returns the current path in the underlying URL.
Can be set, to change the underlying URL's path.
search [ = value ]Returns the current query component in the underlying URL.
Can be set, to change the underlying URL's query component.
hash [ = value ]Returns the current fragment identifier in the underlying URL.
Can be set, to change the underlying URL's fragment identifier.
The table below demonstrates how the getter for search results in different results
depending on the exact original syntax of the URL:
| Input URL | search value
| Explanation |
|---|---|---|
http://example.com/
| empty string | No <query> component in input URL. |
http://example.com/?
| ?
| There is a <query> component, but it is empty. |
http://example.com/?test
| ?test
| The <query> component has the value "test".
|
http://example.com/?test#
| ?test
| The (empty) <fragment> component is not part of the <query> component. |
The following table is similar; it provides a list of what each of the URL decomposition IDL attributes returns for a given input URL.
| Input | protocol
| host
| hostname
| port
| pathname
| search
| hash
|
|---|---|---|---|---|---|---|---|
http://example.com/carrot#question%3f
| http:
| example.com
| example.com
| (empty string) | /carrot
| (empty string) | #question%3f
|
https://www.example.com:4443?
| https:
| www.example.com:4443
| www.example.com
| 4443
| /
| ?
| (empty string) |
A CORS settings attribute is an enumerated attribute. The following table lists the keywords and states for the attribute — the keywords in the left column map to the states in the cell in the second column on the same row as the keyword.
| Keyword | State | Brief description |
|---|---|---|
anonymous
| Anonymous | Cross-origin CORS requests for the element will not have the credentials flag set. |
use-credentials
| Use Credentials | Cross-origin CORS requests for the element will have the credentials flag set. |
The empty string is also a valid keyword, and maps to the Anonymous state. The attribute's invalid value default is the Anonymous state. The missing value default, used when the attribute is omitted, is the No CORS state.
Some IDL attributes are defined to reflect a particular content attribute. This means that on getting, the IDL attribute returns the current value of the content attribute, and on setting, the IDL attribute changes the value of the content attribute to the given value.
The HTMLAllCollection,
HTMLFormControlsCollection,
HTMLOptionsCollection,
interfaces are collections derived from the
HTMLCollection interface.
The HTMLAllCollection interface represents a generic
collection of elements just like
HTMLCollection, with the exception that its namedItem() method
returns an HTMLAllCollection object when there are
multiple matching elements, and that its item() method can be used
as a synonym for its namedItem()
method.
interface HTMLAllCollection : HTMLCollection {
// inherits length and item(unsigned long index)
object? item(DOMString name);
legacycaller getter object? namedItem(DOMString name); // overrides inherited namedItem()
HTMLAllCollection tags(DOMString tagName);
};lengthReturns the number of elements in the collection.
item(index)Returns the item with index index from the collection. The items are sorted in tree order.
item(name)item(name)namedItem(name)namedItem(name)Returns the item with ID or name name from the collection.
If there are multiple matching items, then an HTMLAllCollection object containing all those elements is returned.
Only a, applet, area,
embed, form, frame,
frameset, iframe, img, and
object elements can have a name for the purpose of
this method; their name is given by the value of their name attribute.
tags(tagName)Returns a collection that is a filtered view of the current collection, containing only elements with the given tag name.
The HTMLFormControlsCollection interface represents
a collection of listed elements in form
and fieldset elements.
interface HTMLFormControlsCollection : HTMLCollection {
// inherits length and item()
legacycaller getter object? namedItem(DOMString name); // overrides inherited namedItem()
};
interface RadioNodeList : NodeList {
attribute DOMString value;
};lengthReturns the number of elements in the collection.
item(index)Returns the item with index index from the collection. The items are sorted in tree order.
namedItem(name)namedItem(name)Returns the item with ID or name name from the collection.
If there are multiple matching items, then a RadioNodeList object containing all those elements is returned.
Returns the value of the first checked radio button represented by the object.
Can be set, to check the first radio button with the given value represented by the object.
The HTMLOptionsCollection interface represents a
list of option elements. It is always rooted on a
select element and has attributes and methods that
manipulate that element's descendants.
interface HTMLOptionsCollection : HTMLCollection {
// inherits item()
attribute unsigned long length; // overrides inherited length
legacycaller getter object? namedItem(DOMString name); // overrides inherited namedItem()
setter creator void (unsigned long index, HTMLOptionElement option);
void add(HTMLOptionElement element, optional HTMLElement? before);
void add(HTMLOptGroupElement element, optional HTMLElement? before);
void add(HTMLOptionElement element, long before);
void add(HTMLOptGroupElement element, long before);
void remove(long index);
attribute long selectedIndex;
};length [ = value ]Returns the number of elements in the collection.
When set to a smaller number, truncates the number of option elements in the corresponding container.
When set to a greater number, adds new blank option elements to that container.
item(index)Returns the item with index index from the collection. The items are sorted in tree order.
namedItem(name)namedItem(name)Returns the item with ID or name name from the collection.
If there are multiple matching items, then a NodeList object containing all those elements is returned.
add(element [, before ] )Inserts element before the node given by before.
The before argument can be a number, in which case element is inserted before the item with that number, or an element from the collection, in which case element is inserted before that element.
If before is omitted, null, or a number out of range, then element will be added at the end of the list.
This method will throw a HierarchyRequestError
exception if element is an ancestor of the
element into which it is to be inserted.
selectedIndex [ = value ]Returns the index of the first selected item, if any, or −1 if there is no selected item.
Can be set, to change the selection.
The DOMStringMap interface represents a set of
name-value pairs. It exposes these using the scripting language's
native mechanisms for property access.
The dataset attribute on
elements exposes the data-*
attributes on the element.
Given the following fragment and elements with similar constructions:
<img class="tower" id="tower5" data-x="12" data-y="5"
data-ai="robotarget" data-hp="46" data-ability="flames"
src="towers/rocket.png alt="Rocket Tower">
...one could imagine a function splashDamage() that takes some arguments, the first
of which is the element to process:
function splashDamage(node, x, y, damage) {
if (node.classList.contains('tower') && // checking the 'class' attribute
node.dataset.x == x && // reading the 'data-x' attribute
node.dataset.y == y) { // reading the 'data-y' attribute
var hp = parseInt(node.dataset.hp); // reading the 'data-hp' attribute
hp = hp - damage;
if (hp < 0) {
hp = 0;
node.dataset.ai = 'dead'; // setting the 'data-ai' attribute
delete node.dataset.ability; // removing the 'data-ability' attribute
}
node.dataset.hp = hp; // setting the 'data-hp' attribute
}
}
Some objects support being copied and closed in one operation. This is called transferring the object, and is used in particular to transfer ownership of unsharable or expensive resources across worker boundaries.
[NoInterfaceObject]
interface Transferable { };The following Transferable types exist:
MessagePort
DOM3 Core defines mechanisms for checking for interface support, and for obtaining implementations of interfaces, using feature strings. [DOMCORE]
Authors are strongly discouraged from using these, as they are notoriously unreliable and imprecise. Authors are encouraged to rely on explicit feature testing or the graceful degradation behavior intrinsic to some of the features in this specification.
The HTML namespace is: http://www.w3.org/1999/xhtml
The MathML namespace is: http://www.w3.org/1998/Math/MathML
The SVG namespace is: http://www.w3.org/2000/svg
The XLink namespace is: http://www.w3.org/1999/xlink
The XML namespace is: http://www.w3.org/XML/1998/namespace
The XMLNS namespace is: http://www.w3.org/2000/xmlns/
Data mining tools and other user agents that perform operations on content without running scripts, evaluating CSS or XPath expressions, or otherwise exposing the resulting DOM to arbitrary content, may "support namespaces" by just asserting that their DOM node analogues are in certain namespaces, without actually exposing the above strings.
In the HTML syntax, namespace prefixes and namespace declarations do not have the same effect as in XML. For instance, the colon has no special meaning in HTML element names.
Every XML and HTML document in an HTML UA is represented by a
Document object. [DOMCORE]
The document's address is an absolute URL
that is set when the Document is created. The
document's current address is an absolute URL
that can change during the lifetime of the Document,
for example when the user navigates to
a fragment identifier on the
page or when the pushState() method is called
with a new URL.
Interactive user agents typically expose the document's current address in their user interface.
When a Document is created by a script using the createDocument()
or createHTMLDocument()
APIs, the document's address is the same as the
document's address of the script's document, and
the Document is both ready for post-load
tasks and completely loaded immediately.
Each Document object has a reload override
flag that is originally unset. The flag is set by the document.open() and document.write() methods in certain
situations. When the flag is set, the Document also has
a reload override buffer which is a Unicode string that
is used as the source of the document when it is reloaded.
When the user agent is to perform an overridden reload, it must act as follows:
Let source be the value of the browsing context's active document's reload override buffer.
Navigate the
browsing context to a resource whose source is source, with replacement enabled. When
the navigate algorithm creates a Document
object for this purpose, set that Document's
reload override flag and set its reload override
buffer to source.
The DOM Core specification defines a Document interface, which this specification
extends significantly:
[OverrideBuiltins]
partial interface Document {
// resource metadata management
[PutForwards=href] readonly attribute Location? location;
readonly attribute DOMString URL;
attribute DOMString domain;
readonly attribute DOMString referrer;
attribute DOMString cookie;
readonly attribute DOMString lastModified;
readonly attribute DOMString readyState;
// DOM tree accessors
getter object (DOMString name);
attribute DOMString title;
attribute DOMString dir;
attribute HTMLElement? body;
readonly attribute HTMLHeadElement? head;
readonly attribute HTMLCollection images;
readonly attribute HTMLCollection embeds;
readonly attribute HTMLCollection plugins;
readonly attribute HTMLCollection links;
readonly attribute HTMLCollection forms;
readonly attribute HTMLCollection scripts;
NodeList getElementsByName(DOMString elementName);
// dynamic markup insertion
Document open(optional DOMString type, optional DOMString replace);
WindowProxy open(DOMString url, DOMString name, DOMString features, optional boolean replace);
void close();
void write(DOMString... text);
void writeln(DOMString... text);
// user interaction
readonly attribute WindowProxy? defaultView;
readonly attribute Element? activeElement;
boolean hasFocus();
attribute DOMString designMode;
boolean execCommand(DOMString commandId);
boolean execCommand(DOMString commandId, boolean showUI);
boolean execCommand(DOMString commandId, boolean showUI, DOMString value);
boolean queryCommandEnabled(DOMString commandId);
boolean queryCommandIndeterm(DOMString commandId);
boolean queryCommandState(DOMString commandId);
boolean queryCommandSupported(DOMString commandId);
DOMString queryCommandValue(DOMString commandId);
readonly attribute HTMLCollection commands;
// event handler IDL attributes
[TreatNonCallableAsNull] attribute Function? onabort;
[TreatNonCallableAsNull] attribute Function? onblur;
[TreatNonCallableAsNull] attribute Function? oncanplay;
[TreatNonCallableAsNull] attribute Function? oncanplaythrough;
[TreatNonCallableAsNull] attribute Function? onchange;
[TreatNonCallableAsNull] attribute Function? onclick;
[TreatNonCallableAsNull] attribute Function? oncontextmenu;
[TreatNonCallableAsNull] attribute Function? oncuechange;
[TreatNonCallableAsNull] attribute Function? ondblclick;
[TreatNonCallableAsNull] attribute Function? ondrag;
[TreatNonCallableAsNull] attribute Function? ondragend;
[TreatNonCallableAsNull] attribute Function? ondragenter;
[TreatNonCallableAsNull] attribute Function? ondragleave;
[TreatNonCallableAsNull] attribute Function? ondragover;
[TreatNonCallableAsNull] attribute Function? ondragstart;
[TreatNonCallableAsNull] attribute Function? ondrop;
[TreatNonCallableAsNull] attribute Function? ondurationchange;
[TreatNonCallableAsNull] attribute Function? onemptied;
[TreatNonCallableAsNull] attribute Function? onended;
[TreatNonCallableAsNull] attribute Function? onerror;
[TreatNonCallableAsNull] attribute Function? onfocus;
[TreatNonCallableAsNull] attribute Function? oninput;
[TreatNonCallableAsNull] attribute Function? oninvalid;
[TreatNonCallableAsNull] attribute Function? onkeydown;
[TreatNonCallableAsNull] attribute Function? onkeypress;
[TreatNonCallableAsNull] attribute Function? onkeyup;
[TreatNonCallableAsNull] attribute Function? onload;
[TreatNonCallableAsNull] attribute Function? onloadeddata;
[TreatNonCallableAsNull] attribute Function? onloadedmetadata;
[TreatNonCallableAsNull] attribute Function? onloadstart;
[TreatNonCallableAsNull] attribute Function? onmousedown;
[TreatNonCallableAsNull] attribute Function? onmousemove;
[TreatNonCallableAsNull] attribute Function? onmouseout;
[TreatNonCallableAsNull] attribute Function? onmouseover;
[TreatNonCallableAsNull] attribute Function? onmouseup;
[TreatNonCallableAsNull] attribute Function? onmousewheel;
[TreatNonCallableAsNull] attribute Function? onpause;
[TreatNonCallableAsNull] attribute Function? onplay;
[TreatNonCallableAsNull] attribute Function? onplaying;
[TreatNonCallableAsNull] attribute Function? onprogress;
[TreatNonCallableAsNull] attribute Function? onratechange;
[TreatNonCallableAsNull] attribute Function? onreset;
[TreatNonCallableAsNull] attribute Function? onscroll;
[TreatNonCallableAsNull] attribute Function? onseeked;
[TreatNonCallableAsNull] attribute Function? onseeking;
[TreatNonCallableAsNull] attribute Function? onselect;
[TreatNonCallableAsNull] attribute Function? onshow;
[TreatNonCallableAsNull] attribute Function? onstalled;
[TreatNonCallableAsNull] attribute Function? onsubmit;
[TreatNonCallableAsNull] attribute Function? onsuspend;
[TreatNonCallableAsNull] attribute Function? ontimeupdate;
[TreatNonCallableAsNull] attribute Function? onvolumechange;
[TreatNonCallableAsNull] attribute Function? onwaiting;
// special event handler IDL attributes that only apply to Document objects
[TreatNonCallableAsNull,LenientThis] attribute Function? onreadystatechange;
};User agents throw a
SecurityError exception whenever any properties of a
Document object are accessed by scripts whose
effective script origin is not the same as the Document's effective
script origin.
URLReturns the document's address.
referrerReturns the
address of the Document from which the user
navigated to this one, unless it was blocked or there was no such
document, in which case it returns the empty string.
The noreferrer link
type can be used to block the referrer.
In the case of HTTP, the referrer IDL attribute will
match the Referer (sic) header
that was sent when fetching the current
page.
Typically user agents are configured to not report
referrers in the case where the referrer uses an encrypted protocol
and the current page does not (e.g. when navigating from an https: page to an http:
page).
cookie [ = value ]Returns the HTTP cookies that apply to the
Document. If there are no cookies or cookies can't be
applied to this resource, the empty string will be returned.
Can be set, to add a new cookie to the element's set of HTTP cookies.
If the contents are sandboxed into a unique origin (in an
iframe with the sandbox attribute), a
SecurityError exception will be thrown on getting and
setting.
lastModifiedReturns the date of the last modification to the document, as
reported by the server, in the form "MM/DD/YYYY hh:mm:ss", in the user's local
time zone.
If the last modification date is not known, the current time is returned instead.
readyStateReturns "loading" while the Document is loading, "interactive" once it is finished parsing but still loading sub-resources, and "complete" once it has loaded.
The readystatechange event fires on the Document object when this value changes.
The html element of a document is the
document's root element, if there is one and it's an
html element, or null otherwise.
headReturns the head element.
The head element of a document is the
first head element that is a child of the
html element, if there is one, or null
otherwise.
title [ = value ]Returns the document's title, as given by the
title element.
Can be set, to update the document's title. If there is no
head element,
the new value is ignored.
In SVG documents, the SVGDocument interface's
title attribute takes
precedence.
The title element of a document is the
first title element in the document (in tree order), if
there is one, or null otherwise.
body [ = value ]Returns the body element.
Can be set, to replace the body element.
If the new value is not a body or frameset element, this will throw a HierarchyRequestError exception.
The body element of a document is the first child of
the html element that is either a
body element or a frameset element. If
there is no such element, it is null.
imagesReturns an HTMLCollection of the img elements in the Document.
embedspluginsReturn an HTMLCollection of the embed elements in the Document.
linksReturns an HTMLCollection of the a and area elements in the Document that have href attributes.
formsReturn an HTMLCollection of the form elements in the Document.
scriptsReturn an HTMLCollection of the script elements in the Document.
getElementsByName(name)Returns a NodeList of elements in the
Document that have a name
attribute with the value name.
The dir
attribute on the Document interface is defined
along with the dir content
attribute.
Elements, attributes, and attribute values in HTML are defined
(by this specification) to have certain meanings (semantics). For
example, the ol element represents an ordered list, and
the lang attribute represents the
language of the content.
These definitions allow HTML processors, such as Web browsers or search engines, to present and use documents and applications in a wide variety of contexts that the author might not have considered.
As a simple example, consider a Web page written by an author who only considered desktop computer Web browsers. Because HTML conveys meaning, rather than presentation, the same page can also be used by a small browser on a mobile phone, without any change to the page. Instead of headings being in large letters as on the desktop, for example, the browser on the mobile phone might use the same size text for the whole the page, but with the headings in bold.
But it goes further than just differences in screen size: the same page could equally be used by a blind user using a browser based around speech synthesis, which instead of displaying the page on a screen, reads the page to the user, e.g. using headphones. Instead of large text for the headings, the speech browser might use a different volume or a slower voice.
That's not all, either. Since the browsers know which parts of the page are the headings, they can create a document outline that the user can use to quickly navigate around the document, using keys for "jump to next heading" or "jump to previous heading". Such features are especially common with speech browsers, where users would otherwise find quickly navigating a page quite difficult.
Even beyond browsers, software can make use of this information. Search engines can use the headings to more effectively index a page, or to provide quick links to subsections of the page from their results. Tools can use the headings to create a table of contents (that is in fact how this very specification's table of contents is generated).
This example has focused on headings, but the same principle applies to all of the semantics in HTML.
Authors must not use elements, attributes, or attribute values for purposes other than their appropriate intended semantic purpose, as doing so prevents software from correctly processing the page.
For example, the following document is non-conforming, despite being syntactically correct:
<!DOCTYPE HTML>
<html lang="en-GB">
<head> <title> Demonstration </title> </head>
<body>
<table>
<tr> <td> My favourite animal is the cat. </td> </tr>
<tr>
<td>
—<a href="http://example.org/~ernest/"><cite>Ernest</cite></a>,
in an essay from 1992
</td>
</tr>
</table>
</body>
</html>
...because the data placed in the cells is clearly not tabular
data (and the cite element mis-used). This would make
software that relies on these semantics fail: for example, a speech
browser that allowed a blind user to navigate tables in the
document would report the quote above as a table, confusing the
user; similarly, a tool that extracted titles of works from pages
would extract "Ernest" as the title of a work, even though it's
actually a person's name, not a title.
A corrected version of this document might be:
<!DOCTYPE HTML> <html lang="en-GB"> <head> <title> Demonstration </title> </head> <body> <blockquote> <p> My favourite animal is the cat. </p> </blockquote> <p> —<a href="http://example.org/~ernest/">Ernest</a>, in an essay from 1992 </p> </body> </html>
This next document fragment, intended to represent the heading of a corporate site, is similarly non-conforming because the second line is not intended to be a heading of a subsection, but merely a subheading or subtitle (a subordinate heading for the same section).
<body> <h1>ABC Company</h1> <h2>Leading the way in widget design since 1432</h2> ...
The hgroup element is intended for these kinds of
situations:
<body> <hgroup> <h1>ABC Company</h1> <h2>Leading the way in widget design since 1432</h2> </hgroup> ...
Authors must not use elements, attributes, or attribute values that are not permitted by this specification or other applicable specifications, as doing so makes it significantly harder for the language to be extended in the future.
In the next example, there is a non-conforming attribute value ("carpet") and a non-conforming attribute ("texture"), which is not permitted by this specification:
<label>Carpet: <input type="carpet" name="c" texture="deep pile"></label>
Here would be an alternative and correct way to mark this up:
<label>Carpet: <input type="text" class="carpet" name="c" data-texture="deep pile"></label>
Through scripting and using other mechanisms, the values of attributes, text, and indeed the entire structure of the document may change dynamically while a user agent is processing it. The semantics of a document at an instant in time are those represented by the state of the document at that instant in time, and the semantics of a document can therefore change over time. User agents update their presentation of the document as this occurs.
HTML has a progress element that
describes a progress bar. If its "value" attribute is dynamically
updated by a script, the UA would update the rendering to show the
progress changing.
The nodes representing HTML elements in the DOM implement, and expose to scripts, the interfaces listed for them in the relevant sections of this specification. This includes HTML elements in XML documents, even when those documents are in another context (e.g. inside an XSLT transform).
Elements in the DOM represent things; that is, they have intrinsic meaning, also known as semantics.
For example, an ol element
represents an ordered list.
The basic interface, from which all the HTML
elements' interfaces inherit, is the HTMLElement interface.
interface HTMLElement : Element {
// metadata attributes
attribute DOMString title;
attribute DOMString lang;
attribute boolean translate;
attribute DOMString dir;
attribute DOMString className;
readonly attribute DOMTokenList classList;
readonly attribute DOMStringMap dataset;
// user interaction
attribute boolean hidden;
void click();
attribute long tabIndex;
void focus();
void blur();
attribute DOMString accessKey;
readonly attribute DOMString accessKeyLabel;
attribute boolean draggable;
[PutForwards=value] readonly attribute DOMSettableTokenList dropzone;
attribute DOMString contentEditable;
readonly attribute boolean isContentEditable;
attribute HTMLMenuElement? contextMenu;
attribute boolean spellcheck;
// command API
readonly attribute DOMString? commandType;
readonly attribute DOMString? commandLabel;
readonly attribute DOMString? commandIcon;
readonly attribute boolean? commandHidden;
readonly attribute boolean? commandDisabled;
readonly attribute boolean? commandChecked;
// styling
readonly attribute CSSStyleDeclaration style;
// event handler IDL attributes
[TreatNonCallableAsNull] attribute Function? onabort;
[TreatNonCallableAsNull] attribute Function? onblur;
[TreatNonCallableAsNull] attribute Function? oncanplay;
[TreatNonCallableAsNull] attribute Function? oncanplaythrough;
[TreatNonCallableAsNull] attribute Function? onchange;
[TreatNonCallableAsNull] attribute Function? onclick;
[TreatNonCallableAsNull] attribute Function? oncontextmenu;
[TreatNonCallableAsNull] attribute Function? oncuechange;
[TreatNonCallableAsNull] attribute Function? ondblclick;
[TreatNonCallableAsNull] attribute Function? ondrag;
[TreatNonCallableAsNull] attribute Function? ondragend;
[TreatNonCallableAsNull] attribute Function? ondragenter;
[TreatNonCallableAsNull] attribute Function? ondragleave;
[TreatNonCallableAsNull] attribute Function? ondragover;
[TreatNonCallableAsNull] attribute Function? ondragstart;
[TreatNonCallableAsNull] attribute Function? ondrop;
[TreatNonCallableAsNull] attribute Function? ondurationchange;
[TreatNonCallableAsNull] attribute Function? onemptied;
[TreatNonCallableAsNull] attribute Function? onended;
[TreatNonCallableAsNull] attribute Function? onerror;
[TreatNonCallableAsNull] attribute Function? onfocus;
[TreatNonCallableAsNull] attribute Function? oninput;
[TreatNonCallableAsNull] attribute Function? oninvalid;
[TreatNonCallableAsNull] attribute Function? onkeydown;
[TreatNonCallableAsNull] attribute Function? onkeypress;
[TreatNonCallableAsNull] attribute Function? onkeyup;
[TreatNonCallableAsNull] attribute Function? onload;
[TreatNonCallableAsNull] attribute Function? onloadeddata;
[TreatNonCallableAsNull] attribute Function? onloadedmetadata;
[TreatNonCallableAsNull] attribute Function? onloadstart;
[TreatNonCallableAsNull] attribute Function? onmousedown;
[TreatNonCallableAsNull] attribute Function? onmousemove;
[TreatNonCallableAsNull] attribute Function? onmouseout;
[TreatNonCallableAsNull] attribute Function? onmouseover;
[TreatNonCallableAsNull] attribute Function? onmouseup;
[TreatNonCallableAsNull] attribute Function? onmousewheel;
[TreatNonCallableAsNull] attribute Function? onpause;
[TreatNonCallableAsNull] attribute Function? onplay;
[TreatNonCallableAsNull] attribute Function? onplaying;
[TreatNonCallableAsNull] attribute Function? onprogress;
[TreatNonCallableAsNull] attribute Function? onratechange;
[TreatNonCallableAsNull] attribute Function? onreset;
[TreatNonCallableAsNull] attribute Function? onscroll;
[TreatNonCallableAsNull] attribute Function? onseeked;
[TreatNonCallableAsNull] attribute Function? onseeking;
[TreatNonCallableAsNull] attribute Function? onselect;
[TreatNonCallableAsNull] attribute Function? onshow;
[TreatNonCallableAsNull] attribute Function? onstalled;
[TreatNonCallableAsNull] attribute Function? onsubmit;
[TreatNonCallableAsNull] attribute Function? onsuspend;
[TreatNonCallableAsNull] attribute Function? ontimeupdate;
[TreatNonCallableAsNull] attribute Function? onvolumechange;
[TreatNonCallableAsNull] attribute Function? onwaiting;
};
interface HTMLUnknownElement : HTMLElement { };The HTMLElement interface holds methods and
attributes related to a number of disparate features, and the
members of this interface are therefore described in various
different sections of this specification.
The following attributes are common to and may be specified on all HTML elements:
accesskeyclasscontenteditablecontextmenudirdraggabledropzoneidlangspellcheckstyletabindextitletranslateThe following event handler content attributes may be specified on any HTML element:
onabortonblur*oncanplayoncanplaythroughonchangeonclickoncontextmenuoncuechangeondblclickondragondragendondragenterondragleaveondragoverondragstartondropondurationchangeonemptiedonendedonerror*onfocus*oninputoninvalidonkeydownonkeypressonkeyuponload*onloadeddataonloadedmetadataonloadstartonmousedownonmousemoveonmouseoutonmouseoveronmouseuponmousewheelonpauseonplayonplayingonprogressonratechangeonresetonscroll*onseekedonseekingonselectonshowonstalledonsubmitonsuspendontimeupdateonvolumechangeonwaitingThe attributes marked with an asterisk have a
different meaning when specified on body elements as
those elements expose event handlers of the
Window object with the same names.
While these attributes apply to all elements, they
are not useful on all elements. For example, only media elements will ever receive a volumechange event fired by
the user agent.
Custom data attributes
(e.g. data-foldername or data-msgid) can be specified on any HTML element, to store custom data
specific to the page.
In HTML documents, elements in the HTML
namespace may have an xmlns attribute
specified, if, and only if, it has the exact value
"http://www.w3.org/1999/xhtml". This does not apply to
XML documents.
In HTML, the xmlns attribute
has absolutely no effect. It is basically a talisman. It is allowed
merely to make migration to and from XHTML mildly easier. When
parsed by an HTML parser, the attribute ends up in no
namespace, not the "http://www.w3.org/2000/xmlns/"
namespace like namespace declaration attributes in XML do.
In XML, an xmlns attribute is
part of the namespace declaration mechanism, and an element cannot
actually have an xmlns attribute in no
namespace specified.
The XML specification also allows the use of the xml:space attribute in the XML
namespace on any element in an XML document. This attribute has no effect on
HTML elements, as the default behavior in HTML is to
preserve whitespace. [XML]
There is no way to serialize the xml:space attribute on HTML
elements in the text/html syntax.
To enable assistive technology products to expose a more
fine-grained interface than is otherwise possible with HTML elements
and attributes, a set of annotations for
assistive technology products can be specified (the ARIA
role and aria-* attributes).
id attributeThe id attribute specifies its
element's unique identifier (ID). [DOMCORE]
The value must be unique amongst all the IDs in the element's home subtree and must contain at least one character. The value must not contain any space characters.
An element's unique identifier can be used for a variety of purposes, most notably as a way to link to specific parts of a document using fragment identifiers, as a way to target an element when scripting, and as a way to style a specific element from CSS.
title attributeThe title attribute
represents advisory information for the element, such
as would be appropriate for a tooltip. On a link, this could be the
title or a description of the target resource; on an image, it could
be the image credit or a description of the image; on a paragraph,
it could be a footnote or commentary on the text; on a citation, it
could be further information about the source; and so forth. The
value is text.
If this attribute is omitted from an element, then it implies
that the title attribute of the
nearest ancestor HTML element
with a title attribute set is also
relevant to this element. Setting the attribute overrides this,
explicitly stating that the advisory information of any ancestors is
not relevant to this element. Setting the attribute to the empty
string indicates that the element has no advisory information.
If the title attribute's value
contains U+000A LINE FEED (LF) characters, the content is split into
multiple lines. Each U+000A LINE FEED (LF) character represents a
line break.
Caution is advised with respect to the use of newlines in title attributes.
For instance, the following snippet actually defines an abbreviation's expansion with a line break in it:
<p>My logs show that there was some interest in <abbr title="Hypertext Transport Protocol">HTTP</abbr> today.</p>
Some elements, such as link, abbr, and
input, define additional semantics for the title attribute beyond the semantics
described above.
The title IDL attribute
must reflect the title
content attribute.
lang and xml:lang attributesThe lang attribute (in
no namespace) specifies the primary language for the element's
contents and for any of the element's attributes that contain
text. Its value must be a valid BCP 47 language tag, or the empty
string. Setting the attribute to the empty string indicates that the
primary language is unknown. [BCP47]
The lang
attribute in the XML namespace is defined in XML. [XML]
If these attributes are omitted from an element, then the language of this element is the same as the language of its parent element, if any.
The lang attribute in no namespace
may be used on any HTML
element.
The lang
attribute in the XML namespace may be used on
HTML elements in XML documents, as well as
elements in other namespaces if the relevant specifications allow it
(in particular, MathML and SVG allow lang attributes in the
XML namespace to be specified on their
elements). If both the lang attribute
in no namespace and the lang attribute in the XML
namespace are specified on the same element, they must
have exactly the same value when compared in an ASCII
case-insensitive manner.
Authors must not use the lang attribute in the XML
namespace on HTML elements in HTML
documents. To ease migration to and from XHTML, authors may
specify an attribute in no namespace with no prefix and with the
literal localname "xml:lang" on HTML
elements in HTML documents, but such attributes
must only be specified if a lang
attribute in no namespace is also specified, and both attributes
must have the same value when compared in an ASCII
case-insensitive manner.
The attribute in no namespace with no prefix and
with the literal localname "xml:lang" has no
effect on language processing.
The lang IDL attribute
must reflect the lang
content attribute in no namespace.
translate attributeThe translate
attribute is an enumerated attribute that is used to
specify whether an element's attribute values and the values of its
Text node children are to be translated when the page
is localized, or whether to leave them unchanged.
The attribute's keywords are the empty string, yes, and no. The empty string
and the yes keyword map to the yes
state. The no keyword maps to the no
state. In addition, there is a third state, the inherit
state, which is the missing value default (and the invalid
value default).
Each element has a translation mode, which is in
either the translate-enabled state or the
no-translate state. If the element's translate attribute is in the
yes state, then the element's translation mode
is in the translate-enabled state. Otherwise, if the
element's translate attribute is
in the no state, then the element's translation
mode is in the no-translate state. Otherwise,
the element's translate
attribute is in the inherit state; in that case, the
element's translation mode is in the same state as its
parent element, if any, or in the translate-enabled
state, if the element is a root element.
When an element is in the translate-enabled state, the
element's attribute values and the values of its Text
node children are to be translated when the page is localized.
When an element is in the no-translate state, the
element's attribute values and the values of its Text
node children are to be left as-is when the page is localized, e.g.
because the element contains a person's name or a the name of a
computer program.
In this example, everything in the document is to be translated when the page is localised, except the sample keyboard input and sample program output:
<!DOCTYPE HTML> <html> <!-- default on the root element is translate=yes --> <head> <title>The Bee Game</title> <!-- implied translate=yes inherited from ancestors --> </head> <body> <p>The Bee Game is a text adventure game in English.</p> <p>When the game launches, the first thing you should do is type <kbd translate=no>eat honey</kbd>. The game will respond with:</p> <pre><samp translate=no>Yum yum! That was some good honey!</samp></pre> </body> </html>
xml:base
attribute (XML only)The xml:base attribute is
defined in XML Base. [XMLBASE]
The xml:base attribute may be
used on elements of XML documents. Authors must not
use the xml:base attribute in
HTML documents.
dir attributeThe dir attribute specifies the
element's text directionality. The attribute is an enumerated
attribute with the following keywords and states:
ltr keyword, which maps to the ltr stateIndicates that the contents of the element are explicitly directionally embedded left-to-right text.
rtl keyword, which maps to the rtl stateIndicates that the contents of the element are explicitly directionally embedded right-to-left text.
auto keyword, which maps to the auto stateIndicates that the contents of the element are explicitly embedded text, but that the direction is to be determined programmatically using the contents of the element (as described below).
The heuristic used by this state is very crude (it just looks at the first character with a strong directionality, in a manner analogous to the Paragraph Level determination in the bidirectional algorithm). Authors are urged to only use this value as a last resort when the direction of the text is truly unknown and no better server-side heuristic can be applied.
For textarea and pre
elements, the heuristic is applied on a per-paragraph level.
The attribute has no invalid value default and no missing value default.
The directionality of an element is either 'ltr' or 'rtl', and is determined as per the first appropriate set of steps from the following list:
dir attribute is
in the ltr stateThe directionality of the element is 'ltr'.
dir attribute is
in the rtl stateThe directionality of the element is 'rtl'.
input element whose type attribute is in the Text, Search, Telephone, URL, or E-mail state, and the dir attribute is in the auto statetextarea element and the dir attribute is in the auto stateIf the element's value contains a character of bidirectional character type AL or R, and there is no character of bidirectional character type L anywhere before it in the element's value, then the directionality of the element is 'rtl'. Otherwise, the directionality of the element is 'ltr'.
dir attribute is
in the auto statebdi element and the dir attribute is not in a defined state
(i.e. it is not present or has an invalid value)Find the first character in tree order that matches the following criteria:
The character is from a Text node that is a
descendant of the element whose directionality is being
determined.
The character is of bidirectional character type L, AL, or R. [BIDI]
The character is not in a Text node that has an
ancestor element that is a descendant of the element whose directionality is being
determined and that is either:
If such a character is found and it is of bidirectional character type AL or R, the directionality of the element is 'rtl'.
Otherwise, the directionality of the element is 'ltr'.
dir attribute is not in a defined state
(i.e. it is not present or has an invalid value)The directionality of the element is 'ltr'.
dir attribute is not in a defined state
(i.e. it is not present or has an invalid value)The directionality of the element is the same as the element's parent element's directionality.
The effect of this attribute is primarily on the presentation layer. For example, the rendering section in this specification defines a mapping from this attribute to the CSS 'direction' and 'unicode-bidi' properties, and CSS defines rendering in terms of those properties.
dir [ = value ]Returns the html element's dir attribute's value, if any.
Can be set, to either "ltr", "rtl", or "auto" to replace the html element's dir attribute's value.
If there is no html element, returns the empty string and ignores new values.
The dir IDL attribute on
an element must reflect the dir content attribute of that element,
limited to only known values.
The dir IDL
attribute on Document objects must
reflect the dir content
attribute of the html element, if any,
limited to only known values. If there is no such
element, then the attribute must return the empty string and do
nothing on setting.
Authors are strongly encouraged to use the dir attribute to indicate text direction
rather than using CSS, since that way their documents will continue
to render correctly even in the absence of CSS (e.g. as interpreted
by search engines).
This markup fragment is of an IM conversation.
<p dir=auto class="u1"><b><bdi>Student</bdi>:</b> How do you write "What's your name?" in Arabic?</p> <p dir=auto class="u2"><b><bdi>Teacher</bdi>:</b> ما اسمك؟</p> <p dir=auto class="u1"><b><bdi>Student</bdi>:</b> Thanks.</p> <p dir=auto class="u2"><b><bdi>Teacher</bdi>:</b> That's written "شكرًا".</p> <p dir=auto class="u2"><b><bdi>Teacher</bdi>:</b> Do you know how to write "Please"?</p> <p dir=auto class="u1"><b><bdi>Student</bdi>:</b> "من فضلك", right?</p>
Given a suitable style sheet and the default alignment styles
for the p element, namely to align the text to the
start edge of the paragraph, the resulting rendering could
be as follows:

As noted earlier, the auto
value is not a panacea. The final paragraph in this example is
misinterpreted as being right-to-left text, since it begins with an
Arabic character, which causes the "right?" to be to the left of
the Arabic text.
class attributeEvery HTML element may have a
class attribute specified.
The attribute, if specified, must have a value that is a set of space-separated tokens representing the various classes that the element belongs to.
Assigning classes to an element affects class
matching in selectors in CSS, the getElementsByClassName()
method in the DOM, and other such features.
There are no additional restrictions on the tokens authors can
use in the class attribute, but
authors are encouraged to use values that describe the nature of the
content, rather than values that describe the desired presentation
of the content.
The className and
classList IDL
attributes must both reflect the class content attribute.
style attributeAll HTML elements may have the style content attribute set. This is a
CSS styling attribute as defined by the CSS Styling
Attribute Syntax specification. [CSSATTR]
Documents that use style
attributes on any of their elements must still be comprehensible and
usable if those attributes were removed.
In particular, using the style attribute to hide and show content,
or to convey meaning that is otherwise not included in the document,
is non-conforming. (To hide and show content, use the attribute.)
styleReturns a CSSStyleDeclaration object for the element's style attribute.
In the following example, the words that refer to colors are
marked up using the span element and the style attribute to make those words show
up in the relevant colors in visual media.
<p>My sweat suit is <span style="color: green; background: transparent">green</span> and my eyes are <span style="color: blue; background: transparent">blue</span>.</p>
data-* attributesA custom data attribute is an attribute in no
namespace whose name starts with the string "data-", has at least one
character after the hyphen, is XML-compatible, and
contains no characters in the range U+0041 to U+005A (LATIN CAPITAL
LETTER A to LATIN CAPITAL LETTER Z).
All attributes on HTML elements in HTML documents get ASCII-lowercased automatically, so the restriction on ASCII uppercase letters doesn't affect such documents.
Custom data attributes are intended to store custom data private to the page or application, for which there are no more appropriate attributes or elements.
These attributes are not intended for use by software that is independent of the site that uses the attributes.
For instance, a site about music could annotate list items representing tracks in an album with custom data attributes containing the length of each track. This information could then be used by the site itself to allow the user to sort the list by track length, or to filter the list for tracks of certain lengths.
<ol> <li data-length="2m11s">Beyond The Sea</li> ... </ol>
It would be inappropriate, however, for the user to use generic software not associated with that music site to search for tracks of a certain length by looking at this data.
This is because these attributes are intended for use by the site's own scripts, and are not a generic extension mechanism for publicly-usable metadata.
Every HTML element may have any number of custom data attributes specified, with any value.
datasetReturns a DOMStringMap object for the element's data-* attributes.
Hyphenated names become camel-cased. For example, data-foo-bar="" becomes element.dataset.fooBar.
If a Web page wanted an element to represent a space ship,
e.g. as part of a game, it would have to use the class attribute along with data-* attributes:
<div class="spaceship" data-ship-id="92432"
data-weapons="laser 2" data-shields="50%"
data-x="30" data-y="10" data-z="90">
<button class="fire"
onclick="spaceships[this.parentNode.dataset.shipId].fire()">
Fire
</button>
</div>
Notice how the hyphenated attribute name becomes camel-cased in the API.
Authors should carefully design such extensions so that when the attributes are ignored and any associated CSS dropped, the page is still usable.
JavaScript libraries may use the custom data attributes, as they are considered to be part of the page on which they are used. Authors of libraries that are reused by many authors are encouraged to include their name in the attribute names, to reduce the risk of clashes. Where it makes sense, library authors are also encouraged to make the exact name used in the attribute names customizable, so that libraries whose authors unknowingly picked the same name can be used on the same page, and so that multiple versions of a particular library can be used on the same page even when those versions are not mutually compatible.
For example, a library called "DoQuery" could use attribute
names like data-doquery-range, and a library
called "jJo" could use attributes names like data-jjo-range. The jJo library could also provide
an API to set which prefix to use (e.g. J.setDataPrefix('j2'), making the attributes have
names like data-j2-range).
Each element in this specification has a definition that includes the following information:
A list of categories to which the element belongs. These are used when defining the content models for each element.
A non-normative description of where the element can be used. This information is redundant with the content models of elements that allow this one as a child, and is provided only as a convenience.
For simplicity, only the most specific expectations are listed. For example, an element that is both flow content and phrasing content can be used anywhere that either flow content or phrasing content is expected, but since anywhere that flow content is expected, phrasing content is also expected (since all phrasing content is flow content), only "where phrasing content is expected" will be listed.
A normative description of what content must be included as children and descendants of the element.
A normative list of attributes that may be specified on the element (except where otherwise disallowed).
A normative definition of a DOM interface that such elements must implement.
This is then followed by a description of what the element represents, along with any additional normative conformance criteria that may apply to authors. Examples are sometimes also included.
Except where otherwise specified, attributes on HTML elements may have any string value, including the empty string. Except where explicitly stated, there is no restriction on what text can be specified in such attributes.
Each element defined in this specification has a content model: a description of the element's expected contents. An HTML element must have contents that match the requirements described in the element's content model.
The space characters are
always allowed between elements. User agents represent these
characters between elements in the source markup as
Text nodes in the DOM. Empty Text nodes and
Text nodes consisting of just sequences of those
characters are considered inter-element whitespace.
Inter-element whitespace, comment nodes, and processing instruction nodes must be ignored when establishing whether an element's contents match the element's content model or not, and must be ignored when following algorithms that define document and element semantics.
Thus, an element A is said to be
preceded or followed by a second element B if A and B
have the same parent node and there are no other element nodes or
Text nodes (other than inter-element
whitespace) between them. Similarly, a node is the only
child of an element if that element contains no other nodes
other than inter-element whitespace, comment nodes, and
processing instruction nodes.
Authors must not use HTML elements anywhere except where they are explicitly allowed, as defined for each element, or as explicitly required by other specifications. For XML compound documents, these contexts could be inside elements from other namespaces, if those elements are defined as providing the relevant contexts.
For example, the Atom specification defines a content element. When its type attribute has the value xhtml, the Atom specification requires that it
contain a single HTML div element. Thus, a
div element is allowed in that context, even though
this is not explicitly normatively stated by this specification. [ATOM]
In addition, HTML elements may be orphan nodes (i.e. without a parent node).
For example, creating a td element and storing it
in a global variable in a script is conforming, even though
td elements are otherwise only supposed to be used
inside tr elements.
var data = {
name: "Banana",
cell: document.createElement('td'),
};
Each element in HTML falls into zero or more categories that group elements with similar characteristics together. The following broad categories are used in this specification:
Some elements also fall into other categories, which are defined in other parts of this specification.
These categories are related as follows:
Sectioning content, heading content, phrasing content, embedded content, and interactive content are all types of flow content. Metadata is sometimes flow content. Metadata and interactive content are sometimes phrasing content. Embedded content is also a type of phrasing content, and sometimes is interactive content.
Other categories are also used for specific purposes, e.g. form controls are specified using a number of categories to define common requirements. Some elements have unique requirements and do not fit into any particular category.
Metadata content is content that sets up the presentation or behavior of the rest of the content, or that sets up the relationship of the document with other documents, or that conveys other "out of band" information.
Elements from other namespaces whose semantics are primarily metadata-related (e.g. RDF) are also metadata content.
Thus, in the XML serialization, one can use RDF, like this:
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<head>
<title>Hedral's Home Page</title>
<r:RDF>
<Person xmlns="http://www.w3.org/2000/10/swap/pim/contact#"
r:about="http://hedral.example.com/#">
<fullName>Cat Hedral</fullName>
<mailbox r:resource="mailto:hedral@damowmow.com"/>
<personalTitle>Sir</personalTitle>
</Person>
</r:RDF>
</head>
<body>
<h1>My home page</h1>
<p>I like playing with string, I guess. Sister says squirrels are fun
too so sometimes I follow her to play with them.</p>
</body>
</html>
This isn't possible in the HTML serialization, however.
Most elements that are used in the body of documents and applications are categorized as flow content.
aabbraddressarea (if it is a descendant of a map element)articleasideaudiobbdibdoblockquotebrbuttoncanvascitecodecommanddatalistdeldetailsdfndivdlemembedfieldsetfigurefooterformh1h2h3h4h5h6headerhgrouphriiframeimginputinskbdkeygenlabelmapmarkmathmenumeternavnoscriptobjectoloutputppreprogressqrubyssampscriptsectionselectsmallspanstrongstyle (if the scoped attribute is present)subsupsvgtabletextareatimeuulvarvideowbrSectioning content is content that defines the scope of headings and footers.
Each sectioning content element potentially has a heading and an outline. See the section on headings and sections for further details.
There are also certain elements that are sectioning roots. These are distinct from sectioning content, but they can also have an outline.
Heading content defines the header of a section (whether explicitly marked up using sectioning content elements, or implied by the heading content itself).
Phrasing content is the text of the document, as well as elements that mark up that text at the intra-paragraph level. Runs of phrasing content form paragraphs.
a (if it contains only phrasing content)abbrarea (if it is a descendant of a map element)audiobbdibdobrbuttoncanvascitecodecommanddatalistdel (if it contains only phrasing content)dfnemembediiframeimginputins (if it contains only phrasing content)kbdkeygenlabelmap (if it contains only phrasing content)markmathmeternoscriptobjectoutputprogressqrubyssampscriptselectsmallspanstrongsubsupsvgtextareatimeuvarvideowbrAs a general rule, elements whose content model allows any
phrasing content should have either at least one
descendant Text node that is not inter-element
whitespace, or at least one descendant element node that is
embedded content. For the purposes of this requirement,
nodes that are descendants of del elements must not be
counted as contributing to the ancestors of the del
element.
Most elements that are categorized as phrasing content can only contain elements that are themselves categorized as phrasing content, not any flow content.
Text, in the context of content
models, means Text nodes. Text is sometimes used as a content model on its
own, but is also phrasing content, and can be
inter-element whitespace (if the Text
nodes are empty or contain just space
characters).
Text nodes and attribute values must consist of
Unicode characters, must not
contain U+0000 characters, must not contain permanently undefined
Unicode characters (noncharacters), and must not contain control
characters other than space
characters.
This specification includes extra constraints on the exact value of
Text nodes and attribute values depending on their
precise context.
Embedded content is content that imports another resource into the document, or content from another vocabulary that is inserted into the document.
Elements that are from namespaces other than the HTML namespace and that convey content but not metadata, are embedded content for the purposes of the content models defined in this specification. (For example, MathML, or SVG.)
Some embedded content elements can have fallback content: content that is to be used when the external resource cannot be used (e.g. because it is of an unsupported format). The element definitions state what the fallback is, if any.
Interactive content is content that is specifically intended for user interaction.
aaudio (if the controls attribute is present)buttondetailsembediframeimg (if the usemap attribute is present)input (if the type attribute is not in the state)keygenlabelmenu (if the type attribute is in the toolbar state)object (if the usemap attribute is present)selecttextareavideo (if the controls attribute is present)Certain elements in HTML have an activation
behavior, which means that the user can activate them. This
triggers a sequence of events dependent on the activation mechanism,
and normally culminating in a click
event.
As a general rule, elements whose content model allows any
flow content or phrasing content should
have at least one child node that is palpable content
and that does not have the
attribute specified.
This requirement is not a hard requirement, however, as there are many cases where an element can be empty legitimately, for example when it is used as a placeholder which will later be filled in by a script, or when the element is part of a template and would on most pages be filled in but on some pages is not relevant.
Conformance checkers are encouraged to provide a mechanism for authors to find elements that fail to fulfill this requirement, as an authoring aid.
The following elements are palpable content:
aabbraddressarticleasideaudio (if the controls attribute is present)bbdibdoblockquotebuttoncanvascitecodedetailsdfndivdl (if the element's children include at least one name-value group)emembedfieldsetfigurefooterformh1h2h3h4h5h6headerhgroupiiframeimginput (if the type attribute is not in the state)inskbdkeygenlabelmapmarkmathmenu (if the type attribute is in the toolbar state or the list state)meternavobjectol (if the element's children include at least one li element)outputppreprogressqrubyssampsectionselectsmallspanstrongsubsupsvgtabletextareatimeuul (if the element's children include at least one li element)varvideoSome elements are described as transparent; they have "transparent" in the description of their content model. The content model of a transparent element is derived from the content model of its parent element: the elements required in the part of the content model that is "transparent" are the same elements as required in the part of the content model of the parent of the transparent element in which the transparent element finds itself.
For instance, an ins element inside a
ruby element cannot contain an rt
element, because the part of the ruby element's
content model that allows ins elements is the part
that allows phrasing content, and the rt
element is not phrasing content.
In some cases, where transparent elements are nested in each other, the process has to be applied iteratively.
Consider the following markup fragment:
<p><object><param><ins><map><a href="/">Apples</a></map></ins></object></p>
To check whether "Apples" is allowed inside the a
element, the content models are examined. The a
element's content model is transparent, as is the map
element's, as is the ins element's, as is the part of
the object element's in which the ins
element is found. The object element is found in the
p element, whose content model is phrasing
content. Thus, "Apples" is allowed, as text is phrasing
content.
When a transparent element has no parent, then the part of its content model that is "transparent" must instead be treated as accepting any flow content.
The term paragraph as defined in this
section is used for more than just the definition of the
p element. The paragraph concept defined
here is used to describe how to interpret documents. The
p element is merely one of several ways of marking up a
paragraph.
A paragraph is typically a run of phrasing content that forms a block of text with one or more sentences that discuss a particular topic, as in typography, but can also be used for more general thematic grouping. For instance, an address is also a paragraph, as is a part of a form, a byline, or a stanza in a poem.
In the following example, there are two paragraphs in a section. There is also a heading, which contains phrasing content that is not a paragraph. Note how the comments and inter-element whitespace do not form paragraphs.
<section> <h1>Example of paragraphs</h1> This is the <em>first</em> paragraph in this example. <p>This is the second.</p> <!-- This is not a paragraph. --> </section>
Paragraphs in flow content are defined relative to
what the document looks like without the a,
ins, del, and map elements
complicating matters, since those elements, with their hybrid
content models, can straddle paragraph boundaries, as shown in the
first two examples below.
Generally, having elements straddle paragraph boundaries is best avoided. Maintaining such markup can be difficult.
The following example takes the markup from the earlier example
and puts ins and del elements around some
of the markup to show that the text was changed (though in this
case, the changes admittedly don't make much sense). Notice how
this example has exactly the same paragraphs as the previous one,
despite the ins and del elements —
the ins element straddles the heading and the first
paragraph, and the del element straddles the boundary
between the two paragraphs.
<section> <ins><h1>Example of paragraphs</h1> This is the <em>first</em> paragraph in</ins> this example<del>. <p>This is the second.</p></del> <!-- This is not a paragraph. --> </section>
A paragraph is also formed explicitly by
p elements.
The p element can be used to wrap
individual paragraphs when there would otherwise not be any content
other than phrasing content to separate the paragraphs from each
other.
In the following example, the link spans half of the first paragraph, all of the heading separating the two paragraphs, and half of the second paragraph. It straddles the paragraphs and the heading.
<header> Welcome! <a href="about.html"> This is home of... <h1>The Falcons!</h1> The Lockheed Martin multirole jet fighter aircraft! </a> This page discusses the F-16 Fighting Falcon's innermost secrets. </header>
Here is another way of marking this up, this time showing the paragraphs explicitly, and splitting the one link element into three:
<header> <p>Welcome! <a href="about.html">This is home of...</a></p> <h1><a href="about.html">The Falcons!</a></h1> <p><a href="about.html">The Lockheed Martin multirole jet fighter aircraft!</a> This page discusses the F-16 Fighting Falcon's innermost secrets.</p> </header>
It is possible for paragraphs to overlap when using certain elements that define fallback content. For example, in the following section:
<section> <h1>My Cats</h1> You can play with my cat simulator. <object data="cats.sim"> To see the cat simulator, use one of the following links: <ul> <li><a href="cats.sim">Download simulator file</a> <li><a href="http://sims.example.com/watch?v=LYds5xY4INU">Use online simulator</a> </ul> Alternatively, upgrade to the Mellblom Browser. </object> I'm quite proud of it. </section>
There are five paragraphs:
object element.The first paragraph is overlapped by the other four. A user agent that supports the "cats.sim" resource will only show the first one, but a user agent that shows the fallback will confusingly show the first sentence of the first paragraph as if it was in the same paragraph as the second one, and will show the last paragraph as if it was at the start of the second sentence of the first paragraph.
To avoid this confusion, explicit p elements can be
used.
Text content in HTML elements with
child Text nodes, and text in attributes of HTML
elements that allow free-form text, may contain characters in
the range U+202A to U+202E (the bidirectional-algorithm formatting
characters). However, the use of these characters is restricted so
that any embedding or overrides generated by these characters do not
start and end with different parent elements, and so that all such
embeddings and overrides are explicitly terminated by a U+202C POP
DIRECTIONAL FORMATTING character. This helps reduce incidences of
text being reused in a manner that has unforeseen effects on the
bidirectional algorithm.
The aforementioned restrictions are defined by specifying that certain parts of documents form bidirectional-algorithm formatting character ranges, and then imposing a requirement on such ranges.
The strings resulting from applying the following algorithm to an HTML element element are bidirectional-algorithm formatting character ranges:
Let output be an empty list of strings.
Let string be an empty string.
Let node be the first child node of element, if any, or null otherwise.
Loop: If node is null, jump to the step labeled end.
Process node according to the first matching step from the following list:
Text nodeAppend the text data of node to string.
br elementIf string is not the empty string, push string onto output, and let string be empty string.
Let node be node's next sibling, if any, or null otherwise.
Jump to the step labeled loop.
End: If string is not the empty string, push string onto output.
Return output as the bidirectional-algorithm formatting character ranges.
The value of a namespace-less attribute of an HTML element is a bidirectional-algorithm formatting character range.
Any strings that, as described above, are
bidirectional-algorithm formatting character ranges must
match the string production in the following
ABNF, the character set for which is Unicode. [ABNF]
string = *( plaintext ( embedding / override ) ) plaintext
embedding = ( lre / rle ) string pdf
override = ( lro / rlo ) string pdf
lre = %x202A ; U+202A LEFT-TO-RIGHT EMBEDDING
rle = %x202B ; U+202B RIGHT-TO-LEFT EMBEDDING
lro = %x202D ; U+202D LEFT-TO-RIGHT OVERRIDE
rlo = %x202E ; U+202E RIGHT-TO-LEFT OVERRIDE
pdf = %x202C ; U+202C POP DIRECTIONAL FORMATTING
plaintext = *( %x0000-2029 / %x202F-10FFFF )
; any string with no bidirectional-algorithm formatting charactersAuthors are encouraged to use the dir attribute, the bdo element,
and the bdi element, rather than maintaining the
bidirectional-algorithm formatting characters manually. The
bidirectional-algorithm formatting characters interact poorly with
CSS.
Authors may use the ARIA role
and aria-* attributes on HTML
elements, in accordance with the requirements described in
the ARIA specifications, except where these conflict with the
strong native semantics
described below. These exceptions are intended to prevent authors
from making assistive technology products report nonsensical states
that do not represent the actual state of the document. [ARIA]
The following table defines the strong native semantics and corresponding default implicit ARIA semantics that apply to HTML elements. Each language feature (element or attribute) in a cell in the first column implies the ARIA semantics (role, states, and/or properties) given in the cell in the second column of the same row.
| Language feature | Strong native semantics and default implied ARIA semantics |
|---|---|
area element that creates a hyperlink
| link role
|
base element
| No role |
datalist element
| listbox role, with the aria-multiselectable property set to "false"
|
details element
| aria-expanded state set to "true" if the element's open attribute is present, and set to "false" otherwise
|
head element
| No role |
hgroup element
| heading role, with the aria-level property set to the element's outline depth
|
hr element
| separator role
|
html element
| No role |
img element whose alt attribute's value is empty
| presentation role
|
input element with a type attribute in the Checkbox state
| aria-checked state set to "mixed" if the element's indeterminate IDL attribute is true, or "true" if the element's checkedness is true, or "false" otherwise
|
input element with a type attribute in the Color state
| No role |
input element with a type attribute in the Date state
| No role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the Date and Time state
| No role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the Local Date and Time state
| No role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the E-mail state with no suggestions source element
| textbox role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the File Upload state
| No role |
input element with a type attribute in the state
| No role |
input element with a type attribute in the Month state
| No role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the Number state
| spinbutton role, with the aria-readonly property set to "true" if the element has a readonly attribute, the aria-valuemax property set to the element's maximum, the aria-valuemin property set to the element's minimum, and, if the result of applying the rules for parsing floating point number values to the element's value is a number, with the aria-valuenow property set to that number
|
input element with a type attribute in the Password state
| textbox role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the Radio Button state
| aria-checked state set to "true" if the element's checkedness is true, or "false" otherwise
|
input element with a type attribute in the Range state
| slider role, with the aria-valuemax property set to the element's maximum, the aria-valuemin property set to the element's minimum, and the aria-valuenow property set to the result of applying the rules for parsing floating point number values to the element's value, if that results in a number, or the default value otherwise
|
input element with a type attribute in the Reset Button state
| button role
|
input element with a type attribute in the Search state with no suggestions source element
| textbox role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the Submit Button state
| button role
|
input element with a type attribute in the Telephone state with no suggestions source element
| textbox role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the Text state with no suggestions source element
| textbox role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the Text, Search, Telephone, URL, or E-mail states with a suggestions source element
| combobox role, with the aria-owns property set to the same value as the list attribute, and the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the Time state
| No role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the URL state with no suggestions source element
| textbox role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element with a type attribute in the Week state
| No role, with the aria-readonly property set to "true" if the element has a readonly attribute
|
input element that is required
| The aria-required state set to "true"
|
keygen element
| No role |
label element
| No role |
link element that creates a hyperlink
| link role
|
menu element with a type attribute in the context menu state
| No role |
menu element with a type attribute in the list state
| menu role
|
menu element with a type attribute in the toolbar state
| toolbar role
|
meta element
| No role |
meter element
| No role |
nav element
| navigation role
|
noscript element
| No role |
optgroup element
| No role |
option element that is in a list of options or that represents a suggestion in a datalist element
| option role, with the aria-selected state set to "true" if the element's selectedness is true, or "false" otherwise.
|
param element
| No role |
progress element
| progressbar role, with, if the progress bar is determinate, the aria-valuemax property set to the maximum value of the progress bar, the aria-valuemin property set to zero, and the aria-valuenow property set to the current value of the progress bar
|
script element
| No role |
select element with a multiple attribute
| listbox role, with the aria-multiselectable property set to "true"
|
select element with no multiple attribute
| listbox role, with the aria-multiselectable property set to "false"
|
select element with a required attribute
| The aria-required state set to "true"
|
source element
| No role |
style element
| No role |
summary element
| No role |
textarea element
| textbox role, with the aria-multiline property set to "true", and the aria-readonly property set to "true" if the element has a readonly attribute
|
textarea element with a required attribute
| The aria-required state set to "true"
|
title element
| No role |
An element that defines a command, whose Type facet is "checkbox", and that is a descendant of a menu element whose type attribute in the list state
| menuitemcheckbox role, with the aria-checked state set to "true" if the command's Checked State facet is true, and "false" otherwise
|
An element that defines a command, whose Type facet is "command", and that is a descendant of a menu element whose type attribute in the list state
| menuitem role
|
An element that defines a command, whose Type facet is "radio", and that is a descendant of a menu element whose type attribute in the list state
| menuitemradio role, with the aria-checked state set to "true" if the command's Checked State facet is true, and "false" otherwise
|
| Element that is disabled | The aria-disabled state set to "true"
|
Element with a attribute
| The aria-hidden state set to "true"
|
| Element that is a candidate for constraint validation but that does not satisfy its constraints | The aria-invalid state set to "true"
|
Some HTML elements have native semantics that can be
overridden. The following table lists these elements and their
default implicit ARIA semantics, along with the
restrictions that apply to those elements. Each language feature
(element or attribute) in a cell in the first column implies, unless
otherwise overridden, the ARIA semantic (role, state, or property)
given in the cell in the second column of the same row, but this
semantic may be overridden under the conditions listed in the cell
in the third column of that row. In addition, any element may be
given the presentation role,
regardless of the restrictions below.
| Language feature | Default implied ARIA semantic | Restrictions |
|---|---|---|
a element that creates a hyperlink
| link role
| Role must be either link, button, checkbox, menuitem, menuitemcheckbox, menuitemradio, tab, or treeitem
|
address element
| No role | If specified, role must be contentinfo
|
article element
| article role
| Role must be either article, document, application, or main
|
aside element
| note role
| Role must be either note, complementary, or search
|
audio element
| No role | If specified, role must be application
|
button element
| button role
| Role must be either button, link, menuitem, menuitemcheckbox, menuitemradio, radio
|
details element
| group role
| Role must be a role that supports aria-expanded
|
embed element
| No role | If specified, role must be either application, document, or img
|
footer element
| No role | If specified, role must be contentinfo
|
h1 element that does not have an hgroup ancestor
| heading role, with the aria-level property set to the element's outline depth
| Role must be either heading or tab
|
h2 element that does not have an hgroup ancestor
| heading role, with the aria-level property set to the element's outline depth
| Role must be either heading or tab
|
h3 element that does not have an hgroup ancestor
| heading role, with the aria-level property set to the element's outline depth
| Role must be either heading or tab
|
h4 element that does not have an hgroup ancestor
| heading role, with the aria-level property set to the element's outline depth
| Role must be either heading or tab
|
h5 element that does not have an hgroup ancestor
| heading role, with the aria-level property set to the element's outline depth
| Role must be either heading or tab
|
h6 element that does not have an hgroup ancestor
| heading role, with the aria-level property set to the element's outline depth
| Role must be either heading or tab
|
header element
| No role | If specified, role must be banner
|
iframe element
| No role | If specified, role must be either application, document, or img
|
img element whose alt attribute's value is absent
| img role
| No restrictions |
img element whose alt attribute's value is present and not empty
| img role
| No restrictions |
input element with a type attribute in the Button state
| button role
| Role must be either button, link, menuitem, menuitemcheckbox, menuitemradio, radio
|
input element with a type attribute in the Checkbox state
| checkbox role
| Role must be either checkbox or menuitemcheckbox
|
input element with a type attribute in the Image Button state
| button role
| Role must be either button, link, menuitem, menuitemcheckbox, menuitemradio, radio
|
input element with a type attribute in the Radio Button state
| radio role
| Role must be either radio or menuitemradio
|
li element whose parent is an ol or ul element
| listitem role
| Role must be either listitem, menuitemcheckbox, menuitemradio, option, tab, or treeitem
|
object element
| No role | If specified, role must be either application, document, or img
|
ol element
| list role
| Role must be either directory, list, listbox, menu, menubar, tablist, toolbar, tree
|
output element
| status role
| No restrictions |
section element
| region role
| Role must be either
alert,
alertdialog,
application,
contentinfo,
dialog,
document,
log,
main,
marquee,
region,
search, or
status
|
ul element
| list role
| Role must be either directory, list, listbox, menu, menubar, tablist, toolbar, tree
|
video element
| No role | If specified, role must be application
|
| The body element | document role
| Role must be either document or application
|
The entry "no role", when
used as a strong native
semantic, means that no role other than presentation can be used.
When used as a default
implied ARIA semantic, it means the user agent has no default
mapping to ARIA roles. (However, it probably will have its own
mappings to the accessibility layer.)
These features can be used to make accessibility tools render content to their users in more useful ways. For example, ASCII art, which is really an image, appears to be text, and in the absence of appropriate annotations would end up being rendered by screen readers as a very painful reading of lots of punctuation. Using the features described in this section, one can instead make the ATs skip the ASCII art and just read the caption:
<figure role="img" aria-labelledby="fish-caption">
<pre>
o .'`/
' / (
O .-'` ` `'-._ .')
_/ (o) '. .' /
) ))) >< <
`\ |_\ _.' '. \
'-._ _ .-' '.)
jgs `\__\
</pre>
<figcaption id="fish-caption">
Joan G. Stark, "<cite>fish</cite>".
October 1997. ASCII on electrons. 28×8.
</figcaption>
</figure>
APIs for dynamically inserting markup into the document interact with the parser, and thus their behavior varies depending on whether they are used with HTML documents (and the HTML parser) or XHTML in XML documents (and the XML parser).
The open()
method comes in several variants with different numbers of
arguments.
open( [ type [, replace ] ] )Causes the Document to be replaced in-place, as if
it was a new Document object, but reusing the
previous object, which is then returned.
If the type argument is omitted or has the
value "text/html", then the resulting
Document has an HTML parser associated with it, which
can be given data to parse using document.write(). Otherwise, all
content passed to document.write() will be parsed
as plain text.
If the replace argument is present and has
the value "replace", the existing entries in
the session history for the Document object are
removed.
The method has no effect if the Document is still
being parsed.
Throws an InvalidStateError exception if the
Document is an XML
document.
open( url, name, features [, replace ] )Works like the window.open()
method.
close()Closes the input stream that was opened by the document.open() method.
Throws an InvalidStateError exception if the
Document is an XML
document.
document.write()write(text...)In general, adds the given string(s) to the
Document's input stream.
This method has very idiosyncratic behavior. In
some cases, this method can affect the state of the HTML
parser while the parser is running, resulting in a DOM that
does not correspond to the source of the document (e.g. if the
string written is the string "<plaintext>" or "<!--"). In other cases, the call can clear the
current page first, as if document.open() had been called.
In yet more cases, the method is simply ignored, or throws an
exception. To make matters worse, the exact behavior of this
method can in some cases be dependent on network latency, which can lead to failures that are very hard to debug.
For all these reasons, use of this method is strongly
discouraged.
This method throws an InvalidStateError exception
when invoked on XML documents.
document.writeln()writeln(text...)Adds the given string(s) to the Document's input
stream, followed by a newline character. If necessary, calls the
open() method implicitly
first.
This method throws an InvalidStateError exception
when invoked on XML documents.
The html element.
html elementhead element followed by a body element.manifestinterface HTMLHtmlElement : HTMLElement {};
The html element represents the root of
an HTML document.
The manifest
attribute gives the address of the document's application
cache manifest, if there is
one. If the attribute is present, the attribute's value must be a
valid non-empty URL potentially surrounded by
spaces.
The manifest attribute
only has an effect during
the early stages of document load. Changing the attribute
dynamically thus has no effect (and thus, no DOM API is provided for
this attribute).
For the purposes of application cache selection,
later base elements cannot affect the resolving of relative URLs in manifest attributes, as the
attributes are processed before those elements are seen.
The window.applicationCache IDL
attribute provides scripted access to the offline application
cache mechanism.
The html element in the following example declares
that the document's language is English.
<!DOCTYPE html> <html lang="en"> <head> <title>Swapping Songs</title> </head> <body> <h1>Swapping Songs</h1> <p>Tonight I swapped some of the songs I wrote with some friends, who gave me some of the songs they wrote. I love sharing my music.</p> </body> </html>
head elementhtml element.iframe srcdoc document or if title information is available from a higher-level protocol: Zero or more elements of metadata content.title element.interface HTMLHeadElement : HTMLElement {};
The head element represents a
collection of metadata for the Document.
The collection of metadata in a head element can be
large or small. Here is an example of a very short one:
<!doctype html> <html> <head> <title>A document with a short head</title> </head> <body> ...
Here is an example of a longer one:
<!DOCTYPE HTML> <HTML> <HEAD> <META CHARSET="UTF-8"> <BASE HREF="http://www.example.com/"> <TITLE>An application with a long head</TITLE> <LINK REL="STYLESHEET" HREF="default.css"> <LINK REL="STYLESHEET ALTERNATE" HREF="big.css" TITLE="Big Text"> <SCRIPT SRC="support.js"></SCRIPT> <META NAME="APPLICATION-NAME" CONTENT="Long headed application"> </HEAD> <BODY> ...
The title element is a required child
in most situations, but when a higher-level protocol provides title
information, e.g. in the Subject line of an e-mail when HTML is used
as an e-mail authoring format, the title element can be
omitted.
title elementhead element containing no other title elements.interface HTMLTitleElement : HTMLElement {
attribute DOMString text;
};
The title element represents the
document's title or name. Authors should use titles that identify
their documents even when they are used out of context, for example
in a user's history or bookmarks, or in search results. The
document's title is often different from its first heading, since the
first heading does not have to stand alone when taken out of
context.
There must be no more than one title element per
document.
text [ = value ]Returns the contents of the element, ignoring child nodes that
aren't Text nodes.
Can be set, to replace the element's children with the given value.
Here are some examples of appropriate titles, contrasted with the top-level headings that might be used on those same pages.
<title>Introduction to The Mating Rituals of Bees</title>
...
<h1>Introduction</h1>
<p>This companion guide to the highly successful
<cite>Introduction to Medieval Bee-Keeping</cite> book is...
The next page might be a part of the same site. Note how the title describes the subject matter unambiguously, while the first heading assumes the reader knows what the context is and therefore won't wonder if the dances are Salsa or Waltz:
<title>Dances used during bee mating rituals</title>
...
<h1>The Dances</h1>
The string to use as the document's title is given by the document.title IDL attribute.
base elementhead element containing no other base elements.hreftargetinterface HTMLBaseElement : HTMLElement {
attribute DOMString href;
attribute DOMString target;
};
The base element allows authors to specify the
document base URL for the purposes of resolving relative URLs, and the name
of the default browsing context for the purposes of
following hyperlinks. The element does not represent any content beyond this
information.
There must be no more than one base element per
document.
A base element must have either an href attribute, a target attribute, or both.
The href content
attribute, if specified, must contain a valid URL potentially
surrounded by spaces.
A base element, if it has an href attribute, must come before any
other elements in the tree that have attributes defined as taking
URLs, except the html element
(its manifest attribute
isn't affected by base elements).
The target
attribute, if specified, must contain a valid browsing context
name or keyword, which specifies which browsing
context is to be used as the default when hyperlinks and forms in the Document cause navigation.
A base element, if it has a target attribute, must come before
any elements in the tree that represent hyperlinks.
The href and target IDL attributes
must reflect the respective content attributes of the
same name.
In this example, a base element is used to set the
document base URL:
<!DOCTYPE html>
<html>
<head>
<title>This is an example for the <base> element</title>
<base href="http://www.example.com/news/index.html">
</head>
<body>
<p>Visit the <a href="archives.html">archives</a>.</p>
</body>
</html>
The link in the above example would be a link to "http://www.example.com/news/archives.html".
link elementnoscript element that is a child of a head element.hrefrelmediahreflangtypesizestitle attribute has special semantics on this element.interface HTMLLinkElement : HTMLElement {
attribute boolean disabled;
attribute DOMString href;
attribute DOMString rel;
readonly attribute DOMTokenList relList;
attribute DOMString media;
attribute DOMString hreflang;
attribute DOMString type;
[PutForwards=value] readonly attribute DOMSettableTokenList sizes;
};
HTMLLinkElement implements LinkStyle;
The link element allows authors to link their
document to other resources.
The destination of the link(s) is given by the href attribute, which must
be present and must contain a valid non-empty URL potentially
surrounded by spaces.
A link element must have rel attribute.
The types of link indicated (the relationships) are given by the
value of the rel
attribute, which, if present, must have a value that is a set
of space-separated tokens. The allowed
keywords and their meanings are defined in a later
section.
Two categories of links can be created using the
link element: Links to external resources and hyperlinks. The link
types section defines whether a particular link type is an
external resource or a hyperli