Annotation of libwww/Library/src/HTFormat.html, revision 2.35
2.10 timbl 1: <HTML>
2: <HEAD>
2.31 frystyk 3: <TITLE>The Format Manager in the WWW Library</TITLE>
2.15 timbl 4: <NEXTID N="z18">
2.10 timbl 5: </HEAD>
2.1 timbl 6: <BODY>
2.27 frystyk 7:
2.31 frystyk 8: <H1>The Format Manager</H1>
2.33 frystyk 9:
10: <PRE>
11: /*
12: ** (c) COPYRIGHT CERN 1994.
13: ** Please first read the full copyright statement in the file COPYRIGH.
14: */
15: </PRE>
2.31 frystyk 16:
17: Here we describe the functions of the HTFormat module which handles
18: conversion between different data representations. (In MIME parlance,
19: a representation is known as a content-type. In <A
20: HREF="http://info.cern.ch/hypertext/WWW/TheProject.html">WWW</A> the
21: term <EM>format</EM> is often used as it is shorter). The content of
22: this module is:
23:
24: <UL>
25: <LI><A HREF="#z15">Buffering for I/O</A>
26: <LI><A HREF="#Read">Read from the Network</A>
27: <LI><A HREF="#FormatTypes">Predefined Format Types</A>
28: <LI><A HREF="#Encoding">Registration of Accepted Content Encodings</A>
29: <LI><A HREF="#Language">Registration of Accepted Content Languages</A>
30: <LI><A HREF="#Type">Registration of Accepted Converters and Presenters</A>
31: <LI><A HREF="#Rank">Ranking of Accepted Formats</A>
32: <LI><A HREF="#z3">The Stream Stack</A>
33: </UL>
34:
35: This module is implemented by <A HREF="HTFormat.c">HTFormat.c</A>, and
36: it is a part of the <A NAME="z10"
37: HREF="http://info.cern.ch/hypertext/WWW/Library/User/Guide/Guide.html">Library
38: of Common Code</A>.
2.27 frystyk 39:
2.31 frystyk 40: <PRE>
41: #ifndef HTFORMAT_H
2.1 timbl 42: #define HTFORMAT_H
43:
2.35 ! roeber 44: #include <A HREF="sysdep.html">"sysdep.h"</A>
2.31 frystyk 45: #include <A HREF="HTUtils.html">"HTUtils.h"</A>
46: #include <A HREF="HTStream.html">"HTStream.h"</A>
47: #include <A HREF="HTAtom.html">"HTAtom.h"</A>
48: #include <A HREF="HTList.html">"HTList.h"</A>
2.1 timbl 49:
50: #ifdef SHORT_NAMES
51: #define HTOutputSource HTOuSour
52: #define HTOutputBinary HTOuBina
53: #endif
2.31 frystyk 54: </PRE>
2.1 timbl 55:
2.31 frystyk 56: <H2><A NAME="z15">Buffering for network I/O</A></H2>
2.18 luotonen 57:
2.31 frystyk 58: These routines provide buffering for READ (and future WRITE) to the
59: network. They are used by all the protocol modules. The size of the
60: buffer, <CODE>INPUT_BUFFER_SIZE</CODE>, is a compromise between speed
61: and memory.
2.18 luotonen 62:
2.31 frystyk 63: <PRE>
64: #define INPUT_BUFFER_SIZE 8192
2.18 luotonen 65:
2.17 luotonen 66: typedef struct _socket_buffer {
2.25 luotonen 67: char input_buffer[INPUT_BUFFER_SIZE];
2.17 luotonen 68: char * input_pointer;
69: char * input_limit;
70: int input_file_number;
71: } HTInputSocket;
2.31 frystyk 72: </PRE>
73:
74: <H3>Create an Input Buffer</H3>
2.17 luotonen 75:
2.31 frystyk 76: This function allocates an input buffer and binds it to the socket
77: descriptor given as parameter.
78:
79: <PRE>
80: extern HTInputSocket* HTInputSocket_new PARAMS((int file_number));
2.17 luotonen 81: </PRE>
82:
2.31 frystyk 83: <H3>Free an Input Buffer</H3>
84:
85: <PRE>
86: extern void HTInputSocket_free PARAMS((HTInputSocket * isoc));
2.17 luotonen 87: </PRE>
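
The buffering described above can be sketched as follows. This is a minimal illustration only, not the code in HTFormat.c: the names SketchInputSocket, sketch_isoc_new and sketch_isoc_getc are hypothetical, and the refill policy (one read() call whenever the buffer is empty) is an assumption. Only the field layout is taken from the structure shown above.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

#define INPUT_BUFFER_SIZE 8192

/* Same fields as the HTInputSocket structure above; the name is hypothetical. */
typedef struct {
    char   input_buffer[INPUT_BUFFER_SIZE];
    char * input_pointer;      /* next unread byte */
    char * input_limit;        /* one past the last valid byte */
    int    input_file_number;  /* descriptor the buffer is bound to */
} SketchInputSocket;

/* Hypothetical stand-in for a constructor: allocate and bind to a descriptor. */
static SketchInputSocket * sketch_isoc_new(int file_number)
{
    SketchInputSocket * isoc = calloc(1, sizeof *isoc);
    if (isoc) {
        isoc->input_file_number = file_number;
        /* Empty buffer: pointer and limit coincide. */
        isoc->input_pointer = isoc->input_limit = isoc->input_buffer;
    }
    return isoc;
}

/* Return the next byte (0..255), or -1 on end of input or read error.
   Assumed policy: refill with a single read() when the buffer is empty. */
static int sketch_isoc_getc(SketchInputSocket * isoc)
{
    assert(isoc != NULL);
    if (isoc->input_pointer >= isoc->input_limit) {
        ssize_t n = read(isoc->input_file_number,
                         isoc->input_buffer, INPUT_BUFFER_SIZE);
        if (n <= 0) return -1;
        isoc->input_pointer = isoc->input_buffer;
        isoc->input_limit = isoc->input_buffer + n;
    }
    return (unsigned char) *isoc->input_pointer++;
}
```

A caller would create the buffer once per connection, pull bytes with sketch_isoc_getc until it returns -1, and release the buffer with free().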
88:
2.31 frystyk 89: <A NAME="Read"><H2>Read from the Network</H2></A>
90:
91: This function has replaced many other functions for reading from
92: the network. It automatically converts from ASCII if we are on a
93: NON-ASCII machine. This assumes that we do <B>not</B> use this
94: function to read a local file on a NON-ASCII machine. The following
95: type definition is to make life easier when having a state machine
96: looking for a <CODE><CRLF></CODE> sequence.
97:
98: <PRE>
99: typedef enum _HTSocketEOL {
100: EOL_ERR = -1,
101: EOL_BEGIN = 0,
102: EOL_FCR,
103: EOL_FLF,
104: EOL_DOT,
105: EOL_SCR,
106: EOL_SLF
107: } HTSocketEOL;
2.17 luotonen 108:
2.32 frystyk 109: extern int HTInputSocket_read PARAMS((HTInputSocket * isoc,
2.31 frystyk 110: HTStream * target));
2.17 luotonen 111: </PRE>
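
To illustrate how a state machine of this kind can look for a CRLF sequence, here is a reduced sketch. It uses only three hypothetical states (mirroring EOL_BEGIN, EOL_FCR and EOL_FLF) and the transitions are an assumption; the remaining states of HTSocketEOL and the real logic in HTInputSocket_read are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced state set: nothing seen, CR seen, CR LF seen. */
typedef enum {
    SK_EOL_BEGIN = 0,
    SK_EOL_FCR,
    SK_EOL_FLF
} SketchEOL;

/* Feed one byte and return the new state.
   SK_EOL_FLF means a CRLF pair was completed by this byte. */
static SketchEOL sketch_eol_step(SketchEOL state, char c)
{
    if (state == SK_EOL_FCR && c == '\n') return SK_EOL_FLF;
    if (c == '\r') return SK_EOL_FCR;
    return SK_EOL_BEGIN;
}

/* Return the offset just past the first CRLF in buf, or -1 if there is none. */
static long sketch_find_crlf(const char * buf, size_t len)
{
    SketchEOL state = SK_EOL_BEGIN;
    size_t i;
    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        state = sketch_eol_step(state, buf[i]);
        if (state == SK_EOL_FLF) return (long) (i + 1);
    }
    return -1;
}
```

Because the state is carried from byte to byte, the same step function also works when a CRLF pair is split across two successive reads, provided the caller keeps the state between calls.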
112:
2.31 frystyk 113: <H2><A NAME="z11">Convert Net ASCII to local representation</A></H2>
2.22 timbl 114:
2.31 frystyk 115: This is a filter stream suitable for taking text from a socket and
116: passing it into a stream which expects text in the local C
117: representation. It does newline conversion. As usual, pass its
118: output stream to it when creating it.
2.17 luotonen 119:
2.31 frystyk 120: <PRE>
121: extern HTStream * HTNetToText PARAMS ((HTStream * sink));
2.17 luotonen 122: </PRE>
2.25 luotonen 123:
2.31 frystyk 124: <A NAME="FormatTypes"><H2>The HTFormat types</H2></A>
2.25 luotonen 125:
2.31 frystyk 126: We use the <A HREF="HTAtom.html">HTAtom</A> object for holding
127: representations. This allows faster manipulation (comparison and
128: copying) than if we stayed with strings. These macros (which used to
129: be constants) define some basic internally referenced
130: representations.<P>
2.25 luotonen 131:
2.31 frystyk 132: <PRE>
133: typedef HTAtom * HTFormat;
2.28 frystyk 134: </PRE>
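
The speed argument above rests on interning: each distinct name is stored once, so two formats are equal exactly when their pointers are equal. The sketch below shows the idea only. SketchAtom and sketch_atom_for are hypothetical names, and the linear list is an assumption made for brevity; it is not how the HTAtom module is implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical atom: one node per distinct name. */
typedef struct SketchAtom {
    struct SketchAtom * next;
    char * name;
} SketchAtom;

/* All atoms created so far. A linear list keeps the sketch short. */
static SketchAtom * sketch_atoms = NULL;

/* Return the unique atom for name, creating it on first use.
   Returns NULL only if memory runs out. */
static SketchAtom * sketch_atom_for(const char * name)
{
    SketchAtom * a;
    assert(name != NULL);
    for (a = sketch_atoms; a; a = a->next)
        if (strcmp(a->name, name) == 0) return a;   /* already interned */
    a = malloc(sizeof *a);
    if (!a) return NULL;
    a->name = malloc(strlen(name) + 1);
    if (!a->name) { free(a); return NULL; }
    strcpy(a->name, name);
    a->next = sketch_atoms;
    sketch_atoms = a;
    return a;
}
```

After interning, comparing two formats is a single pointer comparison and copying a format is a pointer assignment, with the string comparison paid only once per lookup.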
135:
136: <H3>Internal ones</H3>
137:
2.31 frystyk 138: The <CODE>www/xxx</CODE> ones are of course not MIME
139: standard. <P>
2.28 frystyk 140:
2.31 frystyk 141: <CODE>star/star</CODE> is an output format which leaves the input
142: untouched. It is useful for diagnostics, and for users who want to see
143: the original, whatever it is.
2.28 frystyk 144:
145: <PRE>
146: #define WWW_SOURCE HTAtom_for("*/*") /* Whatever it was originally */
147: </PRE>
148:
2.31 frystyk 149: <CODE>www/present</CODE> represents the user's perception of the
150: document. If you convert to www/present, you present the material to
151: the user.
2.10 timbl 152:
2.28 frystyk 153: <PRE>
154: #define WWW_PRESENT HTAtom_for("www/present") /* The user's perception */
155: </PRE>
156:
157: The message/rfc822 format means a MIME message or a plain text message
158: with no MIME header. This is what is returned by an HTTP server.
159:
160: <PRE>
161: #define WWW_MIME HTAtom_for("www/mime") /* A MIME message */
162: </PRE>
163:
2.31 frystyk 164: <CODE>www/print</CODE> is like www/present except it represents a
165: printed copy.
2.28 frystyk 166:
167: <PRE>
168: #define WWW_PRINT HTAtom_for("www/print") /* A printed copy */
169: </PRE>
2.13 timbl 170:
2.31 frystyk 171: <CODE>www/unknown</CODE> is a really unknown type. Some default
172: action is appropriate.
2.13 timbl 173:
2.28 frystyk 174: <PRE>
175: #define WWW_UNKNOWN HTAtom_for("www/unknown")
2.13 timbl 176: </PRE>
2.28 frystyk 177:
178:
2.31 frystyk 179: <H3>MIME ones</H3>
2.28 frystyk 180:
2.31 frystyk 181: These are regular MIME types. Others can be added!
2.28 frystyk 182:
183: <PRE>
184: #define WWW_PLAINTEXT HTAtom_for("text/plain")
2.1 timbl 185: #define WWW_POSTSCRIPT HTAtom_for("application/postscript")
186: #define WWW_RICHTEXT HTAtom_for("application/rtf")
2.10 timbl 187: #define WWW_AUDIO HTAtom_for("audio/basic")
2.1 timbl 188: #define WWW_HTML HTAtom_for("text/html")
2.11 timbl 189: #define WWW_BINARY HTAtom_for("application/octet-stream")
2.26 frystyk 190: #define WWW_VIDEO HTAtom_for("video/mpeg")
2.28 frystyk 191: </PRE>
192:
2.31 frystyk 193: Extra types used in the library (EXPERIMENT)
2.28 frystyk 194:
195: <PRE>
196: #define WWW_NEWSLIST HTAtom_for("text/newslist")
197: </PRE>
2.7 timbl 198:
2.28 frystyk 199: We must include the following file after defining HTFormat, to which
2.10 timbl 200: it makes reference.
2.28 frystyk 201:
2.10 timbl 202: <H2>The HTEncoding type</H2>
203:
2.31 frystyk 204: <PRE>
205: typedef HTAtom* HTEncoding;
206: </PRE>
207:
208: The following are the standard MIME encoding values:
209:
210: <PRE>
211: #define WWW_ENC_7BIT HTAtom_for("7bit")
2.10 timbl 212: #define WWW_ENC_8BIT HTAtom_for("8bit")
213: #define WWW_ENC_BINARY HTAtom_for("binary")
2.31 frystyk 214: </PRE>
215:
216: We also add
217:
218: <PRE>
219: #define WWW_ENC_COMPRESS HTAtom_for("compress")
220: </PRE>
2.10 timbl 221:
2.31 frystyk 222: <A NAME="Encoding"><H2>Registration of Accepted Content Encodings</H2></A>
2.10 timbl 223:
2.31 frystyk 224: This function is not currently used.
2.1 timbl 225:
2.31 frystyk 226: <PRE>
227: typedef struct _HTContentDescription {
228: char * filename;
229: HTAtom * content_type;
230: HTAtom * content_language;
231: HTAtom * content_encoding;
232: int content_length;
233: float quality;
234: } HTContentDescription;
235:
2.32 frystyk 236: extern void HTAcceptEncoding PARAMS((HTList * list,
2.31 frystyk 237: char * enc,
238: float quality));
2.1 timbl 239: </PRE>
240:
2.31 frystyk 241: <A NAME="Language"><H2>Registration of Accepted Content Languages</H2></A>
2.28 frystyk 242:
2.31 frystyk 243: This function is not currently used.
2.28 frystyk 244:
245: <PRE>
2.32 frystyk 246: extern void HTAcceptLanguage PARAMS((HTList * list,
2.31 frystyk 247: char * lang,
248: float quality));
249: </PRE>
2.28 frystyk 250:
2.31 frystyk 251: <A NAME="Type"><H2>Registration of Accepted Converters and
252: Presenters</H2></A>
253:
254: A <CODE><A NAME="z12">converter</A></CODE> is a stream with a special
255: set of parameters and which is registered as capable of converting
256: from a MIME type to something else (maybe another MIME-type). A
257: <CODE>presenter</CODE> is a module (possibly an external program)
258: which can present a graphic object of a certain MIME type to the
259: user. That is, <CODE>presenters</CODE> are normally used to present
260: graphic objects that the <CODE>converters</CODE> are not able to
261: handle. Data is transferred to the external program using the <A
262: HREF="HTFWriter.html">HTSaveAndExecute</A> stream which writes to a
263: local file. This stream is actually a normal <CODE>converter</CODE>,
264: i.e., a stream having the following set of parameters:<P>
265:
266: <PRE>
267: #include "HTAccess.h" /* Required for HTRequest definition */
268:
269: typedef HTStream * HTConverter PARAMS((HTRequest * request,
270: void * param,
271: HTFormat input_format,
272: HTFormat output_format,
273: HTStream * output_stream));
274: </PRE>
275:
276: Both <CODE>converters</CODE> and <CODE>presenters</CODE> are set up in
277: a list which is used by the <A HREF="#z3">StreamStack</A> module to
278: find the best way to pass the information to the user. <P>
279:
280: The <CODE>HTPresentation</CODE> structure contains both
281: <CODE>converters</CODE> and <CODE>presenters</CODE>, and it is defined
282: as:
283:
284: <PRE>
285: typedef struct _HTPresentation {
286: HTAtom* rep; /* representation name atomized */
287: HTAtom* rep_out; /* resulting representation */
288: HTConverter *converter; /* The routine to gen the stream stack */
289: char * command; /* MIME-format string */
290: char * test_command; /* MIME-format string */
291: float quality; /* Between 0 (bad) and 1 (good) */
2.1 timbl 292: float secs;
293: float secs_per_byte;
2.31 frystyk 294: } HTPresentation;
2.28 frystyk 295: </PRE>
296:
2.1 timbl 297:
2.31 frystyk 298: <A NAME="z17"><H3>Global List of Converters</H3></A>
2.1 timbl 299:
2.31 frystyk 300: This module keeps a global list of converters. This can be used to get
301: the set of supported formats. <P>
2.12 timbl 302:
2.31 frystyk 303: <B>NOTE:</B> There is also a conversion list associated with each
304: request in the <A HREF="HTAccess.html#z1">HTRequest structure</A>.
305:
306: <PRE>
307: extern HTList * HTConversions;
2.1 timbl 308: </PRE>
2.31 frystyk 309:
310: <H3>Register a Presenter</H3>
311:
2.1 timbl 312: <DL>
2.31 frystyk 313: <DT>conversions
314: <DD>The list of <CODE>converters</CODE> and <CODE>presenters</CODE>
315: <DT>representation
316: <DD>the <A HREF="#Type">MIME-style</A> format name
2.1 timbl 317: <DT>command
2.31 frystyk 318: <DD>the MAILCAP-style command template
2.1 timbl 319: <DT>quality
2.31 frystyk 320: <DD>A degradation factor [0..1]
2.1 timbl 321: <DT>maxbytes
2.31 frystyk 322: <DD>A limit on the length acceptable as input (0 infinite)
2.1 timbl 323: <DT>maxsecs
2.31 frystyk 324: <DD>A limit on the time user will wait (0 for infinity)
2.1 timbl 325: </DL>
326:
2.31 frystyk 327: <PRE>
328: extern void HTSetPresentation PARAMS((HTList * conversions,
329: CONST char * representation,
330: CONST char * command,
331: CONST char * test_command,
332: float quality,
333: float secs,
334: float secs_per_byte));
335: </PRE>
2.1 timbl 336:
2.31 frystyk 337: <H3>Register a Converter</H3>
2.1 timbl 338:
339: <DL>
2.31 frystyk 340: <DT>conversions
341: <DD>The list of <CODE>converters</CODE> and <CODE>presenters</CODE>
2.1 timbl 342: <DT>rep_in
2.31 frystyk 343: <DD>the <A HREF="#Type">MIME-style</A> format name
2.1 timbl 344: <DT>rep_out
2.31 frystyk 345: <DD>is the resulting content-type after the conversion
2.1 timbl 346: <DT>converter
2.31 frystyk 347: <DD>is the routine to call which actually does the conversion
348: <DT>quality
349: <DD>A degradation factor [0..1]
350: <DT>maxbytes
351: <DD>A limit on the length acceptable as input (0 infinite)
352: <DT>maxsecs
353: <DD>A limit on the time user will wait (0 for infinity)
2.1 timbl 354: </DL>
355:
356: <PRE>
2.31 frystyk 357: extern void HTSetConversion PARAMS((HTList * conversions,
358: CONST char * rep_in,
359: CONST char * rep_out,
360: HTConverter * converter,
361: float quality,
362: float secs,
363: float secs_per_byte));
364: </PRE>
365:
366: <H3>Set up Default Presenters and Converters</H3>
367:
368: A default set of <CODE>converters</CODE> and <CODE>presenters</CODE>
369: are defined in <A HREF="HTInit.c">HTInit.c</A> (or <A
370: HREF="../../Daemon/Inplementation/HTSUtils.c">HTSInit.c</A> in the
371: server). <P>
372:
373: <B>NOTE: </B> No automatic initialization is done in the Library, so
374: this is for the application to do.
375:
376: <PRE>
377: extern void HTFormatInit PARAMS((HTList * conversions));
378: </PRE>
379:
380: This function also exists in a version where no
381: <CODE>presenters</CODE> are initialized. This is intended for Non
382: Interactive Mode, e.g., in the Line Mode Browser.
383:
384: <PRE>
385: extern void HTFormatInitNIM PARAMS((HTList * conversions));
386: </PRE>
387:
388: <H3>Remove presentations and conversions</H3>
2.1 timbl 389:
2.34 frystyk 390: This function deletes the <EM>LOCAL </EM>list of
391: <CODE>converters</CODE> and <CODE>presenters</CODE> associated with
392: each <A HREF="HTAccess.html#z1">HTRequest structure</A>.
2.1 timbl 393:
2.31 frystyk 394: <PRE>
395: extern void HTFormatDelete PARAMS((HTRequest * request));
396: </PRE>
397:
2.34 frystyk 398: This function cleans up the <EM>GLOBAL</EM> list of converters. The
399: function is called from <A
400: HREF="HTAccess.html#Library">HTLibTerminate</A>.
401:
402: <PRE>
403: extern void HTDisposeConversions NOPARAMS;
404: </PRE>
2.31 frystyk 405:
406: <A NAME="Rank"><H2>Ranking of Accepted Formats</H2></A>
407:
408: This function is not currently used.
409:
410: <PRE>
2.32 frystyk 411: extern BOOL HTRank PARAMS((HTList * possibilities,
2.31 frystyk 412: HTList * accepted_content_types,
413: HTList * accepted_content_languages,
414: HTList * accepted_content_encodings));
2.1 timbl 415: </PRE>
2.31 frystyk 416:
417: <H2><A NAME="z3">HTStreamStack</A></H2>
418:
419: This is the routine which actually sets up the conversion. It
420: currently checks only for direct conversions, but multi-stage
421: conversions are foreseen. It takes a stream into which the output
422: should be sent in the final format, builds the conversion stack, and
423: returns a stream into which the data in the input format should be
424: fed.<P>
425:
2.23 luotonen 426: If <CODE>guess</CODE> is true and input format is
2.31 frystyk 427: <CODE>www/unknown</CODE>, try to guess the format by looking at the
428: first few bytes of the stream. <P>
2.1 timbl 429:
2.31 frystyk 430: <PRE>
2.32 frystyk 431: extern HTStream * HTStreamStack PARAMS((HTFormat rep_in,
2.31 frystyk 432: HTFormat rep_out,
433: HTStream * output_stream,
434: HTRequest * request,
435: BOOL guess));
2.1 timbl 436: </PRE>
2.31 frystyk 437:
438: <H3>Find the cost of a filter stack</H3>
439:
440: Must return the cost of the same stack which HTStreamStack would set
2.1 timbl 441: up.
442:
2.31 frystyk 443: <PRE>
2.1 timbl 444: #define NO_VALUE_FOUND -1e20 /* returned if none found */
445:
2.31 frystyk 446: extern float HTStackValue PARAMS((HTList * conversions,
447: HTFormat format_in,
448: HTFormat format_out,
449: float initial_value,
450: long int length));
2.1 timbl 451: </PRE>
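
As an illustration of how a cost could be derived from the quality, secs and secs_per_byte fields of a presentation, consider the sketch below. The formula is an assumption chosen for the example, not a statement of what HTStackValue computes: the value is the initial value scaled by quality, minus the estimated time as a fraction of a hypothetical max_secs limit. All names are hypothetical, and formats are compared as strings here rather than as atoms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NO_VALUE_FOUND -1e20f   /* returned if no entry matches */

/* Hypothetical subset of the presentation fields. */
typedef struct {
    const char * rep;            /* input format name */
    const char * rep_out;        /* resulting format name */
    float        quality;        /* between 0 (bad) and 1 (good) */
    float        secs;           /* fixed start-up time */
    float        secs_per_byte;  /* time per byte of input */
} SketchPresentation;

/* Assumed cost model: initial_value * quality, reduced by the estimated
   time divided by max_secs (no time penalty when max_secs is 0). */
static float sketch_stack_value(const SketchPresentation * list, size_t count,
                                const char * format_in,
                                const char * format_out,
                                float initial_value, long length,
                                float max_secs)
{
    size_t i;
    assert(format_in != NULL && format_out != NULL);
    for (i = 0; i < count; i++) {
        const SketchPresentation * p = &list[i];
        if (strcmp(p->rep, format_in) == 0 &&
            strcmp(p->rep_out, format_out) == 0) {
            float value = initial_value * p->quality;
            if (max_secs > 0.0f)
                value -= (p->secs + p->secs_per_byte * (float) length)
                         / max_secs;
            return value;
        }
    }
    return SKETCH_NO_VALUE_FOUND;
}
```

Under this model a slow but high-quality conversion can still lose to a faster one of lower quality when the object is large, which is the kind of trade-off the three fields make expressible.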
2.31 frystyk 452:
453: <HR>
454:
455: <B><IMG
456: ALIGN=middle SRC="http://info.cern.ch/hypertext/WWW/Icons/32x32/caution.gif"> ATTENTION <IMG ALIGN=middle SRC="http://info.cern.ch/hypertext/WWW/Icons/32x32/caution.gif"> <P>
457:
458: THE REST OF THE FUNCTIONS DEFINED IN THIS MODULE ARE GOING TO BE
459: OBSOLETE SO DO NOT USE THEM - THEY ARE NOT REENTRANT.</B>
460:
461: <HR>
462:
2.1 timbl 463: <H2><A
2.10 timbl 464: NAME="z1">HTCopy: Copy a socket to a stream</A></H2>This is used by the protocol engines
2.6 secret 465: to send data down a stream, typically
2.22 timbl 466: one which has been generated by HTStreamStack.
467: Returns the number of bytes transferred.
2.19 luotonen 468: <PRE>extern int HTCopy PARAMS((
2.1 timbl 469: int file_number,
470: HTStream* sink));
471:
472:
2.6 secret 473: </PRE>
474: <H2><A
2.10 timbl 475: NAME="c6">HTFileCopy: Copy a file to a stream</A></H2>This is used by the protocol engines
2.6 secret 476: to send data down a stream, typically
2.7 timbl 477: one which has been generated by HTStreamStack.
478: It is currently called by <A
2.12 timbl 479: NAME="z9" HREF="#c7">HTParseFile</A>
2.6 secret 480: <PRE>extern void HTFileCopy PARAMS((
481: FILE* fp,
482: HTStream* sink));
483:
484:
2.7 timbl 485: </PRE>
486: <H2><A
2.10 timbl 487: NAME="c2">HTCopyNoCR: Copy a socket to a stream,
2.7 timbl 488: stripping CR characters.</A></H2>It is slower than <A
2.12 timbl 489: NAME="z2" HREF="#z1">HTCopy</A> .
2.1 timbl 490: <PRE>
491: extern void HTCopyNoCR PARAMS((
492: int file_number,
493: HTStream* sink));
494:
2.16 luotonen 495:
496: </PRE>
2.1 timbl 497: <H2>HTParseSocket: Parse a socket given
498: its format</H2>This routine is called by protocol
499: modules to load an object. It uses <A
2.12 timbl 500: NAME="z4" HREF="#z3">
2.1 timbl 501: HTStreamStack</A> and the copy routines
502: above. Returns HT_LOADED if successful,
503: <0 if not.
504: <PRE>extern int HTParseSocket PARAMS((
505: HTFormat format_in,
506: int file_number,
2.13 timbl 507: HTRequest * request));
2.6 secret 508:
509: </PRE>
510: <H2><A
2.10 timbl 511: NAME="c1">HTParseFile: Parse a File through
2.7 timbl 512: a file pointer</A></H2>This routine is called by protocol
513: modules to load an object. It uses <A
2.12 timbl 514: NAME="z4" HREF="#z3"> HTStreamStack</A>
2.7 timbl 515: and <A
2.12 timbl 516: NAME="c7" HREF="#c6">HTFileCopy</A> . Returns HT_LOADED
2.7 timbl 517: if successful, <0 if not.
2.6 secret 518: <PRE>extern int HTParseFile PARAMS((
519: HTFormat format_in,
520: FILE *fp,
2.13 timbl 521: HTRequest * request));
2.8 timbl 522:
523: </PRE>
2.21 frystyk 524:
2.31 frystyk 525:
526: <H3>Get next character from buffer</H3>
527:
528: <PRE>extern int HTInputSocket_getCharacter PARAMS((HTInputSocket* isoc));
2.21 frystyk 529: </PRE>
530:
2.31 frystyk 531: <H3>Read block from input socket</H3>Read *len characters and return a
532: buffer (don't free) containing *len
533: characters ( *len may have changed).
534: Buffer is not NULL-terminated.
535: <PRE>extern char * HTInputSocket_getBlock PARAMS((HTInputSocket * isoc,
536: int * len));
537:
2.32 frystyk 538: extern char * HTInputSocket_getLine PARAMS((HTInputSocket * isoc));
539: extern char * HTInputSocket_getUnfoldedLine PARAMS((HTInputSocket * isoc));
540: extern char * HTInputSocket_getStatusLine PARAMS((HTInputSocket * isoc));
541: extern BOOL HTInputSocket_seemsBinary PARAMS((HTInputSocket * isoc));
2.31 frystyk 542:
2.1 timbl 543: #endif
544:
2.31 frystyk 545: </PRE>
546:
547:
548: End of definition module
549:
550: </BODY>
2.10 timbl 551: </HTML>