Network Working Group                                        D. Connolly
Internet-Draft                           World Wide Web Consortium (W3C)
Category: Informational                                         Aug 2007
<draft-connolly-html5-type-sniffing-00.txt>

HTML 5 rules for determining content types

Status: Internet-draft-to-be

I'm looking for a co-author to help route feedback from the IETF to the W3C HTML WG. @@

Please send comments to public-html-comments@w3.org

$Revision: 1.1 $ of $Date: 2007-08-17 20:35:38 $

Introduction

The HTTP specification[HTTP], in section 14.17 Content-Type, says: "The Content-Type entity-header field indicates the media type of the entity-body sent to the recipient."

The HTML 5 specification[HTML5] specifies an algorithm for determining content types based on widely deployed practices and software.

These specifications conflict in some cases. (@@ extract test cases from Step 10 of Feed/HTML sniffing (part of detailed review of "Determining the type of a new resource in a browsing context"))
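To make the kind of conflict concrete, here is a minimal sketch in Python. It is not the HTML 5 algorithm. The function names (declared_type, toy_sniffed_type), the 512-byte window and the two byte patterns are assumptions chosen only for illustration. It shows how a recipient that inspects the entity-body can reach a different media type than the one the Content-Type header declares.

```python
def declared_type(content_type_header):
    """Media type named by a Content-Type header value, lowercased, parameters dropped."""
    return content_type_header.split(";", 1)[0].strip().lower()


def toy_sniffed_type(content_type_header, body):
    """Hypothetical, simplified sniffer for illustration; NOT the HTML 5 algorithm.

    Assumption: only resources labelled text/html are second-guessed, and only
    by checking whether the body starts with an RSS or Atom root element.
    """
    official = declared_type(content_type_header)
    if official != "text/html":
        return official
    # Look at the start of the body only, ignoring leading ASCII whitespace.
    head = body[:512].lstrip()
    if head.startswith(b"<rss"):
        return "application/rss+xml"
    if head.startswith(b"<feed"):
        return "application/atom+xml"
    return official


header = "text/html; charset=utf-8"
body = b'<rss version="2.0"><channel></channel></rss>'
print(declared_type(header))           # what the header says
print(toy_sniffed_type(header, body))  # what the toy sniffer concludes
```

For this input the header names text/html, while the toy sniffer concludes the resource is a feed. A recipient that treats the header as authoritative and one that sniffs would therefore process the same response differently.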

According to a straightforward architecture for content types in the Web[META], the HTTP specification should suffice and the HTML 5 specification need not specify another algorithm. But that architecture assumes that Web publishers (server administrators and content developers) reliably label content. Given that labelling by Web publishers is widely unreliable, and that software that works around these problems is widespread, the choices seem to be:

While the second option is unappealing, the first option seems infeasible.

The IETF community is invited to review the HTML 5 algorithm in detail.


Acknowledgements/@@Fodder

@@more context; meanwhile, see: Step 10 of Feed/HTML sniffing (part of detailed review of "Determining the type of a new resource in a browsing context"), 17 Aug 2007.

Jim Davis for his HTML->internet-draft tool (Makefile). Also keeping an eye on Transforming RFC2629-formatted XML through XSLT, but still grumpy that that format is so arbitrarily different from HTML.

ietf-xml-mime mailing list

Author's Address

Daniel W. Connolly
World Wide Web Consortium (W3C)
32 Vassar Street, Cambridge, MA 02139, U.S.A.
mailto:connolly@w3.org
http://www.w3.org/People/Connolly/

References

[HTML5]
HTML 5, work in progress, 10 August 2007, Hickson and Hyatt, eds.
[HTTP]
Hypertext Transfer Protocol -- HTTP/1.1, RFC 2616, June 1999
[META]
Authoritative Metadata, W3C TAG, April 2006