Annotation of libwww/Library/src/HTFormat.html, revision 2.63
2.10 timbl 1: <HTML>
2: <HEAD>
2.62 frystyk 3: <TITLE>W3C Reference Library libwww FORMAT NEGOTIATION</TITLE>
2.63 ! frystyk 4: <!-- Changed by: Henrik Frystyk Nielsen, 12-Apr-1996 -->
2.15 timbl 5: <NEXTID N="z18">
2.10 timbl 6: </HEAD>
2.1 timbl 7: <BODY>
2.27 frystyk 8:
2.31 frystyk 9: <H1>The Format Manager</H1>
2.33 frystyk 10:
11: <PRE>
12: /*
2.41 frystyk 13: ** (c) COPYRIGHT MIT 1995.
2.33 frystyk 14: ** Please first read the full copyright statement in the file COPYRIGH.
15: */
16: </PRE>
2.31 frystyk 17:
18: Here we describe the functions of the HTFormat module which handles
19: conversion between different data representations. (In MIME parlance,
20: a representation is known as a content-type. In <A
2.46 frystyk 21: HREF="http://www.w3.org/pub/WWW/TheProject.html">WWW</A> the
2.31 frystyk 22: term <EM>format</EM> is often used as it is shorter). The content of
23: this module is:
24:
25: <UL>
2.63 ! frystyk 26: <LI><A HREF="#converter">Content Type Converters</A>
! 27: <LI><A HREF="#coders">Content Coders</A>
! 28:
2.50 frystyk 29: <LI><A HREF="#user">Generic preferences (media type, language, charset etc.)</A>
30: <LI><A HREF="#global">Global Preferences</A>
2.42 frystyk 31: <LI><A HREF="#Rank">Content Negotiation</A>
2.31 frystyk 32: <LI><A HREF="#z3">The Stream Stack</A>
33: </UL>
34:
35: This module is implemented by <A HREF="HTFormat.c">HTFormat.c</A>, and
2.49 frystyk 36: it is a part of the <A HREF="http://www.w3.org/pub/WWW/Library/"> W3C
37: Reference Library</A>.
2.27 frystyk 38:
2.31 frystyk 39: <PRE>
40: #ifndef HTFORMAT_H
2.1 timbl 41: #define HTFORMAT_H
42:
2.63 ! frystyk 43: #include "<A HREF="HTUtils.html">HTUtils.h</A>"
! 44: #include "<A HREF="HTStream.html">HTStream.h</A>"
! 45: #include "<A HREF="HTAtom.html">HTAtom.h</A>"
! 46: #include "<A HREF="HTList.html">HTList.h</A>"
! 47: #include "<A HREF="HTAnchor.html">HTAnchor.h</A>"
! 48: #include "<A HREF="HTReq.html">HTReq.h</A>"
2.31 frystyk 49: </PRE>
2.1 timbl 50:
2.63 ! frystyk 51: <H2>Content Types</H2>
2.18 luotonen 52:
2.63 ! frystyk 53: This is the description of how we handle content types (media types).
! 54:
! 55: <A NAME="converter"><H3>Content Type Converters</H3></A>
! 56:
2.42 frystyk 57: A <CODE><A NAME="z12">converter</A></CODE> is a stream with a special
58: set of parameters and which is registered as capable of converting
59: from a MIME type to something else (maybe another MIME-type). A
60: converter is defined to be a function returning a stream and accepting
61: the following parameters. The content type elements are atoms for
62: which we have defined a prototype.
2.18 luotonen 63:
2.31 frystyk 64: <PRE>
2.52 frystyk 65: typedef HTStream * HTConverter (HTRequest * request,
66: void * param,
67: HTFormat input_format,
68: HTFormat output_format,
69: HTStream * output_stream);
2.42 frystyk 70: </PRE>
2.18 luotonen 71:
2.63 ! frystyk 72: <H3>The HTPresentation Object</H3>
2.31 frystyk 73:
2.42 frystyk 74: A <CODE>presenter</CODE> is a module (possibly an external program)
75: which can present a graphic object of a certain MIME type to the
76: user. That is, <CODE>presenters</CODE> are normally used to present
77: objects that the <CODE>converters</CODE> are not able to handle. Data
78: is transferred to the external program using for example the <A
79: HREF="HTFWrite.html">HTSaveAndExecute</A> stream which writes to a
80: local file. Both presenters and converters are of the type <A
81: HREF="#converter">HTConverter</A>.
2.31 frystyk 82:
83: <PRE>
2.42 frystyk 84: typedef struct _HTPresentation {
85: HTFormat rep; /* representation name atomized */
86: HTFormat rep_out; /* resulting representation */
87: HTConverter *converter; /* The routine to gen the stream stack */
88: char * command; /* MIME-format string */
89: char * test_command; /* MIME-format string */
90: double quality; /* Between 0 (bad) and 1 (good) */
91: double secs;
92: double secs_per_byte;
93: } HTPresentation;
2.28 frystyk 94: </PRE>
95:
2.63 ! frystyk 96: <H3>Basic Content type Converters</H3>
2.28 frystyk 97:
2.63 ! frystyk 98: We have a small set of basic converters that can be hooked in
! 99: anywhere. They don't "convert" anything but are nice to have.
! 100:
! 101: <PRE>
! 102: extern HTConverter HTThroughLine;
! 103: extern HTConverter HTBlackHoleConverter;
! 104: </PRE>
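As an illustration of what a converter that does not "convert" anything can look like, here is a minimal sketch. It does not use the Library headers: Stream, Request, Format, Converter, through_line and black_hole_converter are hypothetical stand-ins for HTStream, HTRequest, HTFormat, HTConverter and the two basic converters, and the behaviour shown (pass the target stream straight through, or hand back a sink that discards input) is an assumption about what such converters would reasonably do.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types; the real ones live in HTStream.h, HTReq.h and HTAtom.h. */
typedef struct Stream { int discards_input; } Stream;
typedef struct Request { int id; } Request;
typedef const char *Format;

/* Same shape as the HTConverter typedef: a function type, not a pointer. */
typedef Stream *Converter(Request *request, void *param,
                          Format input_format, Format output_format,
                          Stream *output_stream);

/* A function type can be used to declare functions, as the header does. */
Converter through_line;
Converter black_hole_converter;

/* Pass-through: no conversion, data goes directly to the target stream. */
Stream *through_line(Request *request, void *param,
                     Format input_format, Format output_format,
                     Stream *output_stream)
{
    (void)request; (void)param; (void)input_format; (void)output_format;
    assert(output_stream != NULL);
    return output_stream;
}

/* Black hole: ignore the target and return a sink that drops everything. */
Stream *black_hole_converter(Request *request, void *param,
                             Format input_format, Format output_format,
                             Stream *output_stream)
{
    static Stream sink = { 1 };
    (void)request; (void)param; (void)input_format; (void)output_format;
    (void)output_stream;
    return &sink;
}
```

Because both functions share the Converter type, either one can be stored in a Converter pointer and registered wherever a real converter is expected.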
! 105:
! 106: <H3>Predefined Content Types</H3>
! 107:
2.42 frystyk 108: These macros (which used to be constants) define some basic internally
109: referenced representations. The <CODE>www/xxx</CODE> ones are of
2.52 frystyk 110: course not MIME standard. They are internal representations used in
111: the Library but they can't be exported to other apps!
2.28 frystyk 112:
113: <PRE>
2.57 frystyk 114: #define WWW_RAW HTAtom_for("www/void") /* Raw output from Protocol */
2.28 frystyk 115: </PRE>
116:
2.57 frystyk 117: <CODE>WWW_RAW</CODE> is an output format which leaves the input
2.54 frystyk 118: untouched <EM>exactly</EM> as it is received by the protocol
119: module. For example, in the case of FTP, this format returns raw ASCII
120: objects for directory listings; for HTTP, everything including the
121: header is returned; for Gopher, a raw ASCII object is returned for a
122: menu etc.
2.10 timbl 123:
2.28 frystyk 124: <PRE>
2.57 frystyk 125: #define WWW_SOURCE HTAtom_for("*/*") /* Almost what it was originally */
126: </PRE>
127:
128: <CODE>WWW_SOURCE</CODE> is an output format which leaves the input
129: untouched <EM>exactly</EM> as it is received by the protocol module
130: <B>IF</B> no suitable converter has been registered with a quality
131: factor higher than 1 (for example 2). If one has, that <EM>SUPER
132: CONVERTER</EM> is preferred for the raw output. This can be used as a
133: filter that allows conversion of, for example, raw
134: FTP directory listings into HTML but passes a MIME body untouched.
135:
136: <PRE>
2.28 frystyk 137: #define WWW_PRESENT HTAtom_for("www/present") /* The user's perception */
138: </PRE>
139:
2.52 frystyk 140: <CODE>WWW_PRESENT</CODE> represents the user's perception of the
141: document. If you convert to <CODE>WWW_PRESENT</CODE>, you present the
142: material to the user.
2.58 frystyk 143:
144: <PRE>
145: #define WWW_DEBUG HTAtom_for("www/debug")
146: </PRE>
147:
148: <CODE>WWW_DEBUG</CODE> represents the user's perception of debug
149: information, for example sent as an HTML document in an HTTP redirection
150: message.
2.28 frystyk 151:
152: <PRE>
2.52 frystyk 153: #define WWW_UNKNOWN HTAtom_for("www/unknown")
2.28 frystyk 154: </PRE>
155:
2.52 frystyk 156: <CODE>WWW_UNKNOWN</CODE> is a really unknown type. It differs from the
157: real MIME type <EM>"application/octet-stream"</EM> in that we haven't
158: even tried to figure out the content type at this point.<P>
2.28 frystyk 159:
2.31 frystyk 160: These are the regular MIME types defined. Others can be added!
2.28 frystyk 161:
162: <PRE>
2.52 frystyk 163: #define WWW_HTML HTAtom_for("text/html")
2.28 frystyk 164: #define WWW_PLAINTEXT HTAtom_for("text/plain")
2.52 frystyk 165:
166: #define WWW_MIME HTAtom_for("message/rfc822")
2.60 frystyk 167: #define WWW_MIME_HEAD HTAtom_for("message/x-rfc822-head")
2.52 frystyk 168:
2.10 timbl 169: #define WWW_AUDIO HTAtom_for("audio/basic")
2.52 frystyk 170:
2.26 frystyk 171: #define WWW_VIDEO HTAtom_for("video/mpeg")
2.52 frystyk 172:
2.38 frystyk 173: #define WWW_GIF HTAtom_for("image/gif")
2.63 ! frystyk 174: #define WWW_JPEG HTAtom_for("image/jpeg")
! 175: #define WWW_TIFF HTAtom_for("image/tiff")
2.52 frystyk 176: #define WWW_PNG HTAtom_for("image/png")
177:
178: #define WWW_BINARY HTAtom_for("application/octet-stream")
179: #define WWW_POSTSCRIPT HTAtom_for("application/postscript")
180: #define WWW_RICHTEXT HTAtom_for("application/rtf")
2.48 frystyk 181: </PRE>
182:
2.52 frystyk 183: We also have some MIME types that come from the various protocols when
184: we convert from ASCII to HTML.
2.48 frystyk 185:
186: <PRE>
187: #define WWW_GOPHER_MENU HTAtom_for("text/x-gopher")
2.53 frystyk 188: #define WWW_CSO_SEARCH HTAtom_for("text/x-cso")
2.48 frystyk 189:
190: #define WWW_FTP_LNST HTAtom_for("text/x-ftp-lnst")
191: #define WWW_FTP_LIST HTAtom_for("text/x-ftp-list")
192:
193: #define WWW_NNTP_LIST HTAtom_for("text/x-nntp-list")
194: #define WWW_NNTP_OVER HTAtom_for("text/x-nntp-over")
195: #define WWW_NNTP_HEAD HTAtom_for("text/x-nntp-head")
2.59 frystyk 196:
197: #define WWW_HTTP HTAtom_for("text/x-http")
2.55 frystyk 198: </PRE>
199:
200: Finally we have defined a special format for our RULE files as they
201: can be handled by a special converter.
202:
203: <PRE>
2.57 frystyk 204: #define WWW_RULES HTAtom_for("application/x-www-rules")
2.28 frystyk 205: </PRE>
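Every macro above expands to a call to HTAtom_for, so formats are handled as atoms rather than as raw strings. The sketch below shows the interning idea behind that: looking up the same name twice yields the same pointer, so two formats can be compared with ==. The names Atom, atom_for, MY_HTML and MY_PLAINTEXT are hypothetical, and the linked list is a simplification chosen for brevity, not the Library's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an atom: one shared node per distinct name. */
typedef struct Atom {
    struct Atom *next;
    char *name;
} Atom;

static Atom *atom_table = NULL;   /* all atoms created so far */

/* Return the unique atom for `name`, creating it on first use. */
Atom *atom_for(const char *name)
{
    Atom *a;
    assert(name != NULL);
    for (a = atom_table; a != NULL; a = a->next)
        if (strcmp(a->name, name) == 0)
            return a;                     /* already interned */
    a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    a->name = malloc(strlen(name) + 1);
    if (a->name == NULL) {
        free(a);
        return NULL;
    }
    strcpy(a->name, name);
    a->next = atom_table;                 /* push onto the table */
    atom_table = a;
    return a;
}

/* Macros in the style of the WWW_xxx definitions. */
#define MY_HTML      atom_for("text/html")
#define MY_PLAINTEXT atom_for("text/plain")
```

With this arrangement a test such as MY_HTML == MY_PLAINTEXT is a pointer comparison, at the price of one table lookup each time a macro is expanded.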
206:
2.63 ! frystyk 207: <H3>Register Presenters</H3>
2.50 frystyk 208:
209: This function creates a presenter object and adds it to the list of
210: conversions.
2.31 frystyk 211:
2.1 timbl 212: <DL>
2.31 frystyk 213: <DT>conversions
214: <DD>The list of <CODE>converters</CODE> and <CODE>presenters</CODE>
2.50 frystyk 215: <DT>rep_in
2.42 frystyk 216: <DD>the MIME-style format name
2.50 frystyk 217: <DT>rep_out
218: <DD>is the resulting content-type after the conversion
219: <DT>converter
220: <DD>is the routine to call which actually does the conversion
2.1 timbl 221: <DT>quality
2.31 frystyk 222: <DD>A degradation factor [0..1]
2.1 timbl 223: <DT>maxbytes
2.31 frystyk 224: <DD>A limit on the length acceptable as input (0 infinite)
2.1 timbl 225: <DT>maxsecs
2.31 frystyk 226: <DD>A limit on the time user will wait (0 for infinity)
2.1 timbl 227: </DL>
228:
2.31 frystyk 229: <PRE>
2.49 frystyk 230: extern void HTPresentation_add (HTList * conversions,
2.61 frystyk 231: const char * representation,
232: const char * command,
233: const char * test_command,
2.49 frystyk 234: double quality,
235: double secs,
236: double secs_per_byte);
2.1 timbl 237:
2.50 frystyk 238: extern void HTPresentation_deleteAll (HTList * list);
239: </PRE>
240:
2.63 ! frystyk 241: <H3>Register Converters</H3>
2.50 frystyk 242:
243: This function creates a converter object and adds it to the list of
244: conversions.
2.1 timbl 245:
246: <DL>
2.31 frystyk 247: <DT>conversions
248: <DD>The list of <CODE>converters</CODE> and <CODE>presenters</CODE>
2.1 timbl 249: <DT>rep_in
2.42 frystyk 250: <DD>the MIME-style format name
2.1 timbl 251: <DT>rep_out
2.31 frystyk 252: <DD>is the resulting content-type after the conversion
2.1 timbl 253: <DT>converter
2.31 frystyk 254: <DD>is the routine to call which actually does the conversion
255: <DT>quality
256: <DD>A degradation factor [0..1]
257: <DT>maxbytes
258: <DD>A limit on the length acceptable as input (0 infinite)
259: <DT>maxsecs
260: <DD>A limit on the time user will wait (0 for infinity)
2.1 timbl 261: </DL>
262:
263: <PRE>
2.49 frystyk 264: extern void HTConversion_add (HTList * conversions,
2.61 frystyk 265: const char * rep_in,
266: const char * rep_out,
2.49 frystyk 267: HTConverter * converter,
268: double quality,
269: double secs,
270: double secs_per_byte);
2.63 ! frystyk 271:
! 272: extern void HTConversion_deleteAll (HTList * list);
2.42 frystyk 273: </PRE>
274:
2.63 ! frystyk 275: <A NAME="coders"><H2>Content Codings</H2></A>
2.50 frystyk 276:
2.63 ! frystyk 277: Content codings are the HTTP extension of transfer encodings. Encodings include
! 278: compress, gzip etc.
! 279:
! 280: <H3>Content Encoders and Decoders</H3>
! 281:
! 282: <EM>Content Coders</EM> are subclassed from the <A HREF="HTStream.html">generic
! 283: stream class</A>. <EM>Encoders</EM> are capable of adding a content coding to a
! 284: data object and <EM>decoders</EM> can remove a content coding.
! 285:
2.50 frystyk 286: <PRE>
2.63 ! frystyk 287: typedef HTStream * HTContentCoder (HTRequest * request,
! 288: void * param,
! 289: HTEncoding coding,
! 290: HTStream * target);
2.50 frystyk 291: </PRE>
292:
2.63 ! frystyk 293: <H3>The HTContentCoding Object</H3>
2.42 frystyk 294:
2.63 ! frystyk 295: <PRE>
! 296: typedef struct _HTContentCoding HTContentCoding;
2.42 frystyk 297:
2.63 ! frystyk 298: extern const char * HTContentCoding_name (HTContentCoding * me);
! 299: </PRE>
! 300:
! 301: <H3>Predefined Coding Types</H3>
! 302:
2.42 frystyk 303: <PRE>
2.63 ! frystyk 304: #define WWW_CTE_7BIT HTAtom_for("7bit")
! 305: #define WWW_CTE_8BIT HTAtom_for("8bit")
! 306: #define WWW_CTE_BINARY HTAtom_for("binary")
! 307: #define WWW_CTE_BASE64 HTAtom_for("base64")
! 308: #define WWW_CTE_MACBINHEX HTAtom_for("macbinhex")
! 309:
! 310: #define WWW_CE_COMPRESS HTAtom_for("compress")
! 311: #define WWW_CE_GZIP HTAtom_for("gzip")
2.42 frystyk 312: </PRE>
313:
2.63 ! frystyk 314: <H3>Register Content Coders</H3>
2.42 frystyk 315:
2.63 ! frystyk 316: There is no difference between registering a content encoder and a content decoder;
! 317: it all depends on how you use the list of encoders/decoders.
! 318:
2.42 frystyk 319: <PRE>
2.63 ! frystyk 320: extern BOOL HTContentCoding_add (HTList * list,
! 321: const char * encoding,
! 322: HTContentCoder * encoder,
! 323: HTContentCoder * decoder,
! 324: double quality);
! 325:
! 326: extern void HTContentCoding_deleteAll (HTList * list);
2.42 frystyk 327: </PRE>
328:
2.63 ! frystyk 329: <H3>Content Coding Stream Stack</H3>
2.42 frystyk 330:
2.63 ! frystyk 331: When creating a content coding stream stack, it is important that we keep the
! 332: right order of encoders and decoders. The HTTP spec specifies that the list in
! 333: the <EM>Content-Encoding</EM> header follows the order in which the encodings
! 334: have been applied to the object. Internally, we represent the content encodings
! 335: as <A HREF="HTAtom.html">atoms</A> in a linked <A HREF="HTList.html">list
! 336: object</A>.<P>
! 337:
! 338: The creation of the content coding stack is not based on quality factors as we
! 339: don't have the freedom as with content types. When using content codings we
! 340: <EM>must</EM> apply the codings specified or fail.
! 341:
2.42 frystyk 342: <PRE>
2.63 ! frystyk 343: extern HTStream * HTCodingStack (HTEncoding coding,
! 344: HTStream * target,
! 345: HTRequest * request,
! 346: void * param,
! 347: BOOL encoding);
2.42 frystyk 348: </PRE>
2.31 frystyk 349:
2.63 ! frystyk 350: Here you can provide a complete list instead of a single token. The list has to
! 351: be filled up in the order the _encodings_ are to be applied.
2.31 frystyk 352:
2.50 frystyk 353: <PRE>
2.63 ! frystyk 354: extern HTStream * HTEncodingStack (HTList * encodings,
! 355: HTStream * target,
! 356: HTRequest * request,
! 357: void * param);
2.50 frystyk 358: </PRE>
359:
2.63 ! frystyk 360: Here you can provide a complete list instead of a single token. The list has to
! 361: be in the order the _encodings_ were applied - that is, the same way that
! 362: _encodings_ are to be applied. This is all consistent with the order of the
! 363: Content-Encoding header.
2.31 frystyk 364:
2.63 ! frystyk 365: <PRE>
! 366: extern HTStream * HTDecodingStack (HTList * encodings,
! 367: HTStream * target,
! 368: HTRequest * request,
! 369: void * param);
! 370: </PRE>
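The ordering rule for the two list-based functions can be illustrated with a toy model. The sketch below is not the Library's stream code: wrap, unwrap, encode_all and decode_all are hypothetical helpers that work on a string buffer, and each "coding" merely wraps the data as name(data). Both functions take the list in the order the codings are applied; decoding walks that list from the end and fails if the outermost layer is not the one expected, mirroring the rule that the specified codings must be applied or the operation fails.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 256   /* every data buffer passed in must be this large */

/* Toy encoding: replace data with name(data). Returns 0 if it will not fit. */
static int wrap(const char *name, char *data)
{
    char tmp[BUF_SIZE];
    int n = snprintf(tmp, sizeof tmp, "%s(%s)", name, data);
    if (n < 0 || n >= (int)sizeof tmp)
        return 0;
    strcpy(data, tmp);
    return 1;
}

/* Toy decoding: strip one name(...) layer. Returns 0 on a mismatch. */
static int unwrap(const char *name, char *data)
{
    size_t nlen = strlen(name), dlen = strlen(data);
    if (dlen < nlen + 2 || strncmp(data, name, nlen) != 0 ||
        data[nlen] != '(' || data[dlen - 1] != ')')
        return 0;
    memmove(data, data + nlen + 1, dlen - nlen - 2);
    data[dlen - nlen - 2] = '\0';
    return 1;
}

/* Apply the codings first to last, in list order. */
int encode_all(const char **codings, size_t n, char *data)
{
    size_t i;
    assert(codings != NULL && data != NULL);
    for (i = 0; i < n; i++)
        if (!wrap(codings[i], data))
            return 0;
    return 1;
}

/* Same list order as encode_all; layers are removed last to first. */
int decode_all(const char **codings, size_t n, char *data)
{
    size_t i;
    assert(codings != NULL && data != NULL);
    for (i = n; i > 0; i--)
        if (!unwrap(codings[i - 1], data))
            return 0;
    return 1;
}
```

Passing the list in the wrong order to decode_all makes the first unwrap fail, which is the behaviour one would want from a decoding stack that cannot honour the codings it was given.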
! 371:
! 372: <H2><A NAME="charset">Content Charsets</A></H2>
! 373:
2.42 frystyk 374: <H4>Register a Charset</H4>
375:
376: <PRE>
2.50 frystyk 377: extern void HTCharset_add (HTList * list,
2.61 frystyk 378: const char * charset,
2.50 frystyk 379: double quality);
2.42 frystyk 380: </PRE>
381:
2.50 frystyk 382: <H4>Delete a list of Charsets</H4>
2.42 frystyk 383:
2.50 frystyk 384: <PRE>
2.63 ! frystyk 385: typedef struct _HTAcceptNode {
! 386: HTAtom * atom;
! 387: double quality;
! 388: } HTAcceptNode;
! 389: </PRE>
! 390:
! 391: <PRE>
2.50 frystyk 392: extern void HTCharset_deleteAll (HTList * list);
393: </PRE>
394:
395: <A NAME="Language"><H3>Accepted Content Languages</H3></A>
2.31 frystyk 396:
2.42 frystyk 397: <H4>Register a Language</H4>
2.31 frystyk 398:
399: <PRE>
2.50 frystyk 400: extern void HTLanguage_add (HTList * list,
2.61 frystyk 401: const char * lang,
2.50 frystyk 402: double quality);
403: </PRE>
404:
405: <H4>Delete a list of Languages</H4>
406:
407: <PRE>
2.51 frystyk 408: extern void HTLanguage_deleteAll (HTList * list);
2.50 frystyk 409: </PRE>
410:
411: <A NAME="global"><H2>Global Registrations</H2></A>
412:
413: There are two places where these preferences can be registered: in a
414: <EM>global</EM> list valid for <B>all</B> requests and a
415: <EM>local</EM> list valid for a particular request only. The lists described
416: here are the global ones. See the <A HREF="HTReq.html">Request
417: Manager</A> for local sets.
418:
2.51 frystyk 419: <H3>Converters and Presenters</H3>
2.50 frystyk 420:
421: This is the <EM>global</EM> list of specific conversions which the format
422: manager can do in order to fulfill the request. There is also a <A
423: HREF="HTReq.html"><EM>local</EM></A> list of conversions which
424: contains a generic set of possible conversions.
425:
426: <PRE>
427: extern void HTFormat_setConversion (HTList *list);
428: extern HTList * HTFormat_conversion (void);
2.31 frystyk 429: </PRE>
430:
2.63 ! frystyk 431: <H3>Content Encodings and Decodings</H3>
2.42 frystyk 432:
2.50 frystyk 433: <PRE>
434: extern void HTFormat_setEncoding (HTList *list);
435: extern HTList * HTFormat_encoding (void);
436: </PRE>
2.1 timbl 437:
2.51 frystyk 438: <H3>Content Languages</H3>
2.50 frystyk 439:
440: <PRE>
441: extern void HTFormat_setLanguage (HTList *list);
442: extern HTList * HTFormat_language (void);
443: </PRE>
2.42 frystyk 444:
2.51 frystyk 445: <H3>Content Charsets</H3>
2.1 timbl 446:
2.31 frystyk 447: <PRE>
2.50 frystyk 448: extern void HTFormat_setCharset (HTList *list);
449: extern HTList * HTFormat_charset (void);
2.31 frystyk 450: </PRE>
451:
2.50 frystyk 452: <H3>Delete All Global Lists</H3>
2.42 frystyk 453:
2.50 frystyk 454: This is a convenience function that might make life easier.
2.34 frystyk 455:
456: <PRE>
2.50 frystyk 457: extern void HTFormat_deleteAll (void);
2.34 frystyk 458: </PRE>
2.31 frystyk 459:
460: <A NAME="Rank"><H2>Ranking of Accepted Formats</H2></A>
461:
2.36 frystyk 462: This function is used when the best match among several possible
463: documents is to be found as a function of the accept headers sent in
464: the client request.
2.31 frystyk 465:
466: <PRE>
2.42 frystyk 467: typedef struct _HTContentDescription {
468: char * filename;
2.63 ! frystyk 469: HTFormat content_type;
! 470: HTLanguage content_language;
! 471: HTEncoding content_encoding;
! 472: HTCte content_transfer;
2.42 frystyk 473: int content_length;
474: double quality;
475: } HTContentDescription;
476:
2.52 frystyk 477: extern BOOL HTRank (HTList * possibilities,
478: HTList * accepted_content_types,
479: HTList * accepted_content_languages,
480: HTList * accepted_content_encodings);
2.1 timbl 481: </PRE>
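The header does not say how the individual accept qualities are combined. The sketch below shows one plausible scheme, assuming the overall quality of a document is the product of the qualities of its content type and language in the accept lists, and that candidates are then sorted best first. Accept, Description, accept_quality and rank are hypothetical stand-ins that use plain strings instead of atoms and arrays instead of HTList.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct { const char *name; double quality; } Accept;

typedef struct {
    const char *filename;
    const char *content_type;
    const char *content_language;
    double quality;                /* filled in by rank() */
} Description;

/* Quality of `name` in an accept list; 0.0 when it is not listed. */
static double accept_quality(const Accept *list, size_t n, const char *name)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (strcmp(list[i].name, name) == 0)
            return list[i].quality;
    return 0.0;
}

/* qsort comparator: higher quality sorts first. */
static int by_quality_desc(const void *a, const void *b)
{
    double qa = ((const Description *)a)->quality;
    double qb = ((const Description *)b)->quality;
    return (qa < qb) - (qa > qb);
}

/* Score and sort the candidates; returns 1 if the best one is acceptable. */
int rank(Description *docs, size_t ndocs,
         const Accept *types, size_t ntypes,
         const Accept *langs, size_t nlangs)
{
    size_t i;
    assert(docs != NULL || ndocs == 0);
    if (ndocs == 0)
        return 0;
    for (i = 0; i < ndocs; i++)
        docs[i].quality =
            accept_quality(types, ntypes, docs[i].content_type) *
            accept_quality(langs, nlangs, docs[i].content_language);
    qsort(docs, ndocs, sizeof *docs, by_quality_desc);
    return docs[0].quality > 0.0;
}
```

A multiplicative score means that a candidate unacceptable in any one dimension drops to zero, whatever its other qualities are.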
2.31 frystyk 482:
2.42 frystyk 483: <H2><A NAME="z3">The Stream Stack</A></H2>
2.31 frystyk 484:
485: This is the routine which actually sets up the conversion. It
486: currently checks only for direct conversions, but multi-stage
487: conversions are foreseen. It takes a stream into which the output
488: should be sent in the final format, builds the conversion stack, and
489: returns a stream into which the data in the input format should be
2.42 frystyk 490: fed. If <CODE>guess</CODE> is true and input format is
2.31 frystyk 491: <CODE>www/unknown</CODE>, try to guess the format by looking at the
492: first few bytes of the stream. <P>
2.1 timbl 493:
2.31 frystyk 494: <PRE>
2.52 frystyk 495: extern HTStream * HTStreamStack (HTFormat rep_in,
496: HTFormat rep_out,
497: HTStream * output_stream,
498: HTRequest * request,
499: BOOL guess);
2.1 timbl 500: </PRE>
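The "direct conversions only" behaviour described above can be sketched as a simple table lookup. This is an illustration under stated assumptions, not the Library's algorithm: Conversion and best_direct are hypothetical names, formats are plain strings rather than atoms, only exact matches are considered, and ties are resolved in favour of the highest quality.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry: one direct conversion and its quality. */
typedef struct {
    const char *rep_in;
    const char *rep_out;
    double quality;
} Conversion;

/*
 * Return the registered conversion that goes from rep_in to rep_out in a
 * single step with the highest quality, or NULL if there is none.
 * No attempt is made to chain two conversions together.
 */
const Conversion *best_direct(const Conversion *list, size_t n,
                              const char *rep_in, const char *rep_out)
{
    const Conversion *best = NULL;
    size_t i;
    assert(rep_in != NULL && rep_out != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(list[i].rep_in, rep_in) != 0)
            continue;
        if (strcmp(list[i].rep_out, rep_out) != 0)
            continue;
        if (best == NULL || list[i].quality > best->quality)
            best = &list[i];
    }
    return best;
}
```

With a table that can turn an FTP listing into HTML and HTML into the presentation format, a request to go from the FTP listing straight to the presentation format returns NULL here, because that would need a two-stage stack.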
2.31 frystyk 501:
2.63 ! frystyk 502: <H3>Cost of a Stream Stack</H3>
2.31 frystyk 503:
504: Must return the cost of the same stack which HTStreamStack would set
2.1 timbl 505: up.
506:
2.31 frystyk 507: <PRE>
2.52 frystyk 508: extern double HTStackValue (HTList * conversions,
509: HTFormat format_in,
510: HTFormat format_out,
511: double initial_value,
512: long int length);
2.31 frystyk 513:
2.42 frystyk 514: #endif /* HTFORMAT_H */
2.38 frystyk 515: </PRE>
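The header does not spell out how the cost of a stack is computed. One plausible model, given the quality, secs and secs_per_byte fields of HTPresentation, is to scale the initial value by the quality and subtract a time penalty. In the sketch below the function stack_value and its max_secs parameter (the longest time the user will wait, with 0 meaning no limit) are assumptions made for illustration.

```c
#include <assert.h>

/*
 * Hypothetical cost model for a single conversion:
 *   value = initial_value * quality
 *           - (length * secs_per_byte + secs) / max_secs
 * The time penalty is skipped when max_secs is 0 (no limit).
 */
double stack_value(double quality, double secs, double secs_per_byte,
                   double initial_value, long length, double max_secs)
{
    double value;
    assert(length >= 0);
    value = initial_value * quality;
    if (max_secs > 0.0)
        value -= ((double)length * secs_per_byte + secs) / max_secs;
    return value;
}
```

Under this model a fast, slightly lossy converter can outrank a slow, perfect one once the document is long enough for the per-byte cost to dominate.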
516:
2.63 ! frystyk 517: <HR>
! 518: <ADDRESS>
! 519: @(#) $Id: Date Author State $
! 520: </ADDRESS>
2.31 frystyk 521: </BODY>
2.10 timbl 522: </HTML>