Network Working Group                                        D. Connolly
Internet-Draft                           World Wide Web Consortium (W3C)
Category: Informational                                         Aug 2007

HTML 5 rules for determining content types

Status: Internet-draft-to-be

I'm looking for a co-author to help route feedback from the IETF to the W3C HTML WG. @@

Please send comments to

$Revision: 1.1 $ of $Date: 2007-08-17 20:35:38 $


The HTTP specification [HTTP], in section 14.17 (Content-Type), says: "The Content-Type entity-header field indicates the media type of the entity-body sent to the recipient".

The HTML 5 specification [HTML5] specifies an algorithm for determining content types based on widely deployed practices and software.

These specifications conflict in some cases. (@@ extract test cases from Step 10 of Feed/HTML sniffing (part of detailed review of "Determining the type of a new resource in a browsing context"))
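As a rough illustration of the kind of conflict meant here, the Python sketch below compares the media type a recipient would take from the Content-Type header alone with the type a sniffing recipient might settle on. It is not the HTML 5 algorithm: the function names, the marker table, and the 512-byte window are assumptions made up for this example.

```python
# Illustrative sketch only; NOT the HTML 5 sniffing algorithm.
# FEED_MARKERS, the 512-byte window, and both function names are
# hypothetical choices made for this example.

FEED_MARKERS = {
    b"<rss": "application/rss+xml",
    b"<feed": "application/atom+xml",
}


def declared_type(content_type_header: str) -> str:
    """Media type per the HTTP reading: take the header at face value."""
    # Drop parameters such as charset and normalise case.
    return content_type_header.split(";", 1)[0].strip().lower()


def sniffed_type(content_type_header: str, body: bytes) -> str:
    """Media type a sniffing recipient might choose (toy heuristic)."""
    official = declared_type(content_type_header)
    if official != "text/html":
        # This toy heuristic only second-guesses text/html.
        return official
    head = body[:512]
    for marker, media_type in FEED_MARKERS.items():
        if marker in head:
            return media_type
    return official


if __name__ == "__main__":
    header = "text/html; charset=utf-8"
    body = b'<?xml version="1.0"?>\n<rss version="2.0"></rss>'
    print("declared:", declared_type(header))
    print("sniffed: ", sniffed_type(header, body))
```

For a feed served with a text/html label, the two functions disagree: the first reports the label, the second reports a feed type. That disagreement is the sort of case a test suite would need to capture.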

According to a straightforward architecture for content types in the Web [META], the HTTP specification should suffice and the HTML 5 specification need not specify another algorithm. But that architecture assumes that Web publishers (server administrators and content developers) reliably label content. Given that labelling by Web publishers is widely unreliable, and that software working around the problem is widespread, the choices seem to be:

While the second option is unappealing, the first option seems infeasible.

The IETF community is invited to review the HTML 5 algorithm in detail.



@@ more context; meanwhile, see: Step 10 of Feed/HTML sniffing (part of detailed review of "Determining the type of a new resource in a browsing context"), 17 Aug 2007.

Thanks to Jim Davis for his HTML->internet-draft tool (Makefile). Also keeping an eye on Transforming RFC2629-formatted XML through XSLT, but still grumpy that that format is so arbitrarily different from HTML.

ietf-xml-mime mailing list

Author's Address

Daniel W. Connolly
World Wide Web Consortium (W3C)
32 Vassar Street Cambridge, MA 02139, U.S.A.


[HTML5]  HTML 5, work in progress, 10 August 2007, Hickson and Hyatt, eds.
[HTTP]   Hypertext Transfer Protocol -- HTTP/1.1, RFC 2616, June 1999.
[META]   Authoritative Metadata, W3C TAG, April 2006.