File:  [Public] / libwww / Library / src / HTFormat.html
Revision 2.57: download - view: text, annotated - select for diffs
Mon Nov 27 03:04:53 1995 UTC (28 years, 6 months ago) by frystyk
Branches: MAIN
CVS tags: v4/0, HEAD
alpha 7

<HTML>
<HEAD>
<TITLE>Format Negotiation Manager</TITLE>
<!-- Changed by: Henrik Frystyk Nielsen, 25-Nov-1995 -->
<NEXTID N="z18">
</HEAD>
<BODY>

<H1>The Format Manager</H1>

<PRE>
/*
**	(c) COPYRIGHT MIT 1995.
**	Please first read the full copyright statement in the file COPYRIGH.
*/
</PRE>

Here we describe the functions of the HTFormat module which handles
conversion between different data representations. (In MIME parlance,
a representation is known as a content-type. In <A
HREF="http://www.w3.org/pub/WWW/TheProject.html">WWW</A> the
term <EM>format</EM> is often used as it is shorter). The content of
this module is:

<UL>
<LI><A HREF="#converter">Converters</A>
<LI><A HREF="#user">Generic preferences (media type, language, charset etc.)</A>
<LI><A HREF="#global">Global Preferences</A>
<LI><A HREF="#Rank">Content Negotiation</A>
<LI><A HREF="#z3">The Stream Stack</A>
</UL>

This module is implemented by <A HREF="HTFormat.c">HTFormat.c</A>, and
it is a part of the <A HREF="http://www.w3.org/pub/WWW/Library/"> W3C
Reference Library</A>.

<PRE>
#ifndef HTFORMAT_H
#define HTFORMAT_H

#include <A HREF="HTUtils.html">"HTUtils.h"</A>
#include <A HREF="HTStream.html">"HTStream.h"</A>
#include <A HREF="HTAtom.html">"HTAtom.h"</A>
#include <A HREF="HTList.html">"HTList.h"</A>
#include <A HREF="HTAnchor.html">"HTAnchor.h"</A>
#include <A HREF="HTReq.html">"HTReq.h"</A>
</PRE>

<A NAME="converter"><H2>Stream Converters</H2></A>

A <CODE><A NAME="z12">converter</A></CODE> is a stream with a special
set of parameters and which is registered as capable of converting
from a MIME type to something else (maybe another MIME-type). A
converter is defined to be a function returning a stream and accepting
the following parameters. The content type elements are atoms for
which we have defined a prototype.

<PRE>
typedef HTStream * HTConverter	(HTRequest *	request,
				 void *		param,
				 HTFormat	input_format,
				 HTFormat	output_format,
				 HTStream *	output_stream);
</PRE>

<A NAME="user"><H2>Generic Preferences</H2></A>

The Library contains functionality for letting the application (or
user) express the preferences for the rendition of a given data object
when issuing a request. The categories supported are:

<UL>
<LI>Content type (media type)
<LI>Encoding
<LI>Language
<LI>Charset
</UL>

<A NAME="FormatTypes"><H3>Registration of Accepted Content Types</H3></A>

A <CODE>presenter</CODE> is a module (possibly an external program)
which can present a graphic object of a certain MIME type to the
user. That is, <CODE>presenters</CODE> are normally used to present
objects that the <CODE>converters</CODE> are not able to handle. Data
is transferred to the external program using for example the <A
HREF="HTFWrite.html">HTSaveAndExecute</A> stream which writes to a
local file. Both presenters and converters are of the type <A
HREF="#converter">HTConverter</A>.

<PRE>
typedef struct _HTPresentation {
    HTFormat	rep;			     /* representation name atomized */
    HTFormat	rep_out;			 /* resulting representation */
    HTConverter *converter;	      /* The routine to gen the stream stack */
    char *	command;			       /* MIME-format string */
    char *	test_command;			       /* MIME-format string */
    double	quality;		     /* Between 0 (bad) and 1 (good) */
    double	secs;
    double	secs_per_byte;
} HTPresentation;
</PRE>

<H4>Predefined Content Types</H4>

These macros (which used to be constants) define some basic internally
referenced representations. The <CODE>www/xxx</CODE> ones are of
course not MIME standard. They are internal representations used in
the Library but they can't be exported to other apps!

<PRE>
#define WWW_RAW		HTAtom_for("www/void")   /* Raw output from Protocol */
</PRE>

<CODE>WWW_RAW</CODE> is an output format which leaves the input
untouched <EM>exactly</EM> as it is received by the protocol
module. For example, in the case of FTP, this format returns raw ASCII
objects for directory listings; for HTTP, everything including the
header is returned, for Gopher, a raw ASCII object is returned for a
menu etc.

<PRE>
#define WWW_SOURCE	HTAtom_for("*/*")   /* Almost what it was originally */
</PRE>

<CODE>WWW_SOURCE</CODE> is an output format which leaves the input
untouched <EM>exactly</EM> as it is received by the protocol module
<B>IF</B> not a suitable converter has been registered with a quality
factor higher than 1 (for example 2). In this case the <EM>SUPER
CONVERTER</EM> is preferred for the raw output. This can be used as a
filter effect that allows conversion from, for example raw
FTPdirectory listings into HTML but passes a MIME body untouched.

<PRE>
#define WWW_PRESENT	HTAtom_for("www/present")   /* The user's perception */
</PRE>

<CODE>WWW_PRESENT</CODE> represents the user's perception of the
document.  If you convert to <CODE>WWW_PRESENT</CODE>, you present the
material to the user.

<PRE>
#define WWW_UNKNOWN     HTAtom_for("www/unknown")
</PRE>

<CODE>WWW_UNKNOWN</CODE> is a really unknown type. It differs from the
real MIME type <EM>"application/octet-stream"</EM> in that we haven't
even tried to figure out the content type at this point.<P>

These are regular MIME types defined. Others can be added!

<PRE>
#define WWW_HTML 	HTAtom_for("text/html")
#define WWW_PLAINTEXT 	HTAtom_for("text/plain")

#define WWW_MIME	HTAtom_for("message/rfc822")

#define WWW_AUDIO       HTAtom_for("audio/basic")

#define WWW_VIDEO 	HTAtom_for("video/mpeg")

#define WWW_GIF 	HTAtom_for("image/gif")
#define WWW_PNG 	HTAtom_for("image/png")

#define WWW_BINARY 	HTAtom_for("application/octet-stream")
#define WWW_POSTSCRIPT 	HTAtom_for("application/postscript")
#define WWW_RICHTEXT 	HTAtom_for("application/rtf")
</PRE>

We also have some MIME types that come from the various protocols when
we convert from ASCII to HTML.

<PRE>
#define WWW_GOPHER_MENU HTAtom_for("text/x-gopher")
#define WWW_CSO_SEARCH	HTAtom_for("text/x-cso")

#define WWW_FTP_LNST	HTAtom_for("text/x-ftp-lnst")
#define WWW_FTP_LIST	HTAtom_for("text/x-ftp-list")

#define WWW_NNTP_LIST   HTAtom_for("text/x-nntp-list")
#define WWW_NNTP_OVER	HTAtom_for("text/x-nntp-over")
#define WWW_NNTP_HEAD	HTAtom_for("text/x-nntp-head")
</PRE>

Finally we have defined a special format for our RULE files as they
can be handled by a special converter.

<PRE>
#define WWW_RULES	HTAtom_for("application/x-www-rules")
</PRE>

<H4>Add a Presenter</H4>

This function creates a presenter object and adds to the list of
conversions.

<DL>
<DT>conversions
<DD>The list of <CODE>conveters</CODE> and <CODE>presenters</CODE>
<DT>rep_in
<DD>the MIME-style format name
<DT>rep_out
<DD>is the resulting content-type after the conversion
<DT>converter
<DD>is the routine to call which actually does the conversion
<DT>quality
<DD>A degradation faction [0..1]
<DT>maxbytes
<DD>A limit on the length acceptable as input (0 infinite)
<DT>maxsecs
<DD>A limit on the time user will wait (0 for infinity)
</DL>

<PRE>
extern void HTPresentation_add (HTList *	conversions,
				CONST char * 	representation,
				CONST char * 	command,
				CONST char * 	test_command,
				double		quality,
				double		secs, 
				double		secs_per_byte);
</PRE>

<H4>Delete a list of Presenters</H4>

<PRE>
extern void HTPresentation_deleteAll	(HTList * list);
</PRE>

<H4>Add a Converter</H4>

This function creates a presenter object and adds to the list of
conversions.

<DL>
<DT>conversions
<DD>The list of <CODE>conveters</CODE> and <CODE>presenters</CODE>
<DT>rep_in
<DD>the MIME-style format name
<DT>rep_out
<DD>is the resulting content-type after the conversion
<DT>converter
<DD>is the routine to call which actually does the conversion
<DT>quality
<DD>A degradation faction [0..1]
<DT>maxbytes
<DD>A limit on the length acceptable as input (0 infinite)
<DT>maxsecs
<DD>A limit on the time user will wait (0 for infinity)
</DL>

<PRE>
extern void HTConversion_add   (HTList *	conversions,
				CONST char * 	rep_in,
				CONST char * 	rep_out,
				HTConverter *	converter,
				double		quality,
				double		secs, 
				double		secs_per_byte);
</PRE>

<H4>Delete a list of Converters</H4>

<PRE>
extern void HTConversion_deleteAll	(HTList * list);
</PRE>

<A NAME="Encoding"><H3>Registration of Accepted Content Encodings</H3></A>

Encodings are the HTTP extension of transfer encodings. Encodings
include compress, gzip etc.

<PRE>
typedef struct _HTAcceptNode {
    HTAtom *	atom;
    double	quality;
} HTAcceptNode;
</PRE>

<H4>Predefined Encoding Types</H4>

<PRE>
#define WWW_ENC_7BIT		HTAtom_for("7bit")
#define WWW_ENC_8BIT		HTAtom_for("8bit")
#define WWW_ENC_BINARY		HTAtom_for("binary")
#define WWW_ENC_BASE64		HTAtom_for("base64")
#define WWW_ENC_COMPRESS	HTAtom_for("compress")
#define WWW_ENC_GZIP		HTAtom_for("gzip")
</PRE>

<H4>Register an Encoding</H4>

<PRE>
extern void HTEncoding_add (HTList * 		list,
			    CONST char *	enc,
			    double		quality);
</PRE>

<H4>Delete a list of Encoders</H4>

<PRE>
extern void HTEncoding_deleteAll (HTList * list);
</PRE>

<H3><A NAME="charset">Accepted Charsets</A></H3>

<H4>Register a Charset</H4>

<PRE>
extern void HTCharset_add (HTList *		list,
			   CONST char *		charset,
			   double		quality);
</PRE>

<H4>Delete a list of Charsets</H4>

<PRE>
extern void HTCharset_deleteAll	(HTList * list);
</PRE>

<A NAME="Language"><H3>Accepted Content Languages</H3></A>

<H4>Register a Language</H4>

<PRE>
extern void HTLanguage_add (HTList *		list,
			    CONST char *	lang,
			    double		quality);
</PRE>

<H4>Delete a list of Languages</H4>

<PRE>
extern void HTLanguage_deleteAll (HTList * list);
</PRE>

<A NAME="global"><H2>Global Registrations</H2></A>

There are two places where these preferences can be registered: in a
<EM>global</EM> list valid for <B>all</B> requests and a
<EM>local</EM> list valid for a particular request only. These are
valid for <EM>all</EM> requests. See the <A HREF="HTReq.html">Request
Manager</A> fro local sets.

<H3>Converters and Presenters</H3>

The <EM>global</EM> list of specific conversions which the format
manager can do in order to fulfill the request.  There is also a <A
HREF="HTReq.html"><EM>local</EM></A> list of conversions which
contains a generic set of possible conversions.

<PRE>
extern void HTFormat_setConversion	(HTList *list);
extern HTList * HTFormat_conversion	(void);
</PRE>

<H3>Content Encodings</H3>

<PRE>
extern void HTFormat_setEncoding	(HTList *list);
extern HTList * HTFormat_encoding	(void);
</PRE>

<H3>Content Languages</H3>

<PRE>
extern void HTFormat_setLanguage	(HTList *list);
extern HTList * HTFormat_language	(void);
</PRE>

<H3>Content Charsets</H3>

<PRE>
extern void HTFormat_setCharset		(HTList *list);
extern HTList * HTFormat_charset	(void);
</PRE>

<H3>Delete All Global Lists</H3>

This is a convenience function that might make life easier.

<PRE>
extern void HTFormat_deleteAll (void);
</PRE>

<A NAME="Rank"><H2>Ranking of Accepted Formats</H2></A>

This function is used when the best match among several possible
documents is to be found as a function of the accept headers sent in
the client request.

<PRE>
typedef struct _HTContentDescription {
    char *	filename;
    HTAtom *	content_type;
    HTAtom *	content_language;
    HTAtom *	content_encoding;
    int		content_length;
    double	quality;
} HTContentDescription;

extern BOOL HTRank (HTList * possibilities,
		    HTList * accepted_content_types,
		    HTList * accepted_content_languages,
		    HTList * accepted_content_encodings);
</PRE>

<H2><A NAME="z3">The Stream Stack</A></H2>

This is the routine which actually sets up the conversion. It
currently checks only for direct conversions, but multi-stage
conversions are forseen.  It takes a stream into which the output
should be sent in the final format, builds the conversion stack, and
returns a stream into which the data in the input format should be
fed. If <CODE>guess</CODE> is true and input format is
<CODE>www/unknown</CODE>, try to guess the format by looking at the
first few bytes of the stream. <P>

<PRE>
extern HTStream * HTStreamStack (HTFormat	rep_in,
				 HTFormat	rep_out,
				 HTStream *	output_stream,
				 HTRequest *	request,
				 BOOL		guess);
</PRE>

<H2>Cost of a Stream Stack</H2>

Must return the cost of the same stack which HTStreamStack would set
up.

<PRE>
extern double HTStackValue	(HTList *	conversions,
				 HTFormat	format_in,
				 HTFormat	format_out,
				 double		initial_value,
				 long int	length);

#endif /* HTFORMAT */
</PRE>

End of declaration module

</BODY>
</HTML>

Webmaster