Annotation of libwww/Library/src/HTFormat.html, revision 2.31
2.10 timbl 1: <HTML>
2: <HEAD>
2.31 ! frystyk 3: <TITLE>The Format Manager in the WWW Library</TITLE>
2.15 timbl 4: <NEXTID N="z18">
2.10 timbl 5: </HEAD>
2.31 ! frystyk 6:
2.1 timbl 7: <BODY>
2.27 frystyk 8:
2.31 ! frystyk 9: <H1>The Format Manager</H1>
! 10:
! 11: Here we describe the functions of the HTFormat module which handles
! 12: conversion between different data representations. (In MIME parlance,
! 13: a representation is known as a content-type. In <A
! 14: HREF="http://info.cern.ch/hypertext/WWW/TheProject.html">WWW</A> the
! 15: term <EM>format</EM> is often used as it is shorter). The content of
! 16: this module is:
! 17:
! 18: <UL>
! 19: <LI><A HREF="#z15">Buffering for I/O</A>
! 20: <LI><A HREF="#Read">Read from the Network</A>
! 21: <LI><A HREF="#FormatTypes">Predefined Format Types</A>
! 22: <LI><A HREF="#Encoding">Registration of Accepted Content Encodings</A>
! 23: <LI><A HREF="#Language">Registration of Accepted Content Languages</A>
! 24: <LI><A HREF="#Type">Registration of Accepted Converters and Presenters</A>
! 25: <LI><A HREF="#Rank">Ranking of Accepted Formats</A>
! 26: <LI><A HREF="#z3">The Stream Stack</A>
! 27: </UL>
! 28:
! 29: This module is implemented by <A HREF="HTFormat.c">HTFormat.c</A>, and
! 30: it is a part of the <A NAME="z10"
! 31: HREF="http://info.cern.ch/hypertext/WWW/Library/User/Guide/Guide.html">Library
! 32: of Common Code</A>.
2.27 frystyk 33:
2.31 ! frystyk 34: <PRE>
! 35: #ifndef HTFORMAT_H
2.1 timbl 36: #define HTFORMAT_H
37:
2.31 ! frystyk 38: #include <A HREF="HTUtils.html">"HTUtils.h"</A>
! 39: #include <A HREF="HTStream.html">"HTStream.h"</A>
! 40: #include <A HREF="HTAtom.html">"HTAtom.h"</A>
! 41: #include <A HREF="HTList.html">"HTList.h"</A>
2.1 timbl 42:
43: #ifdef SHORT_NAMES
44: #define HTOutputSource HTOuSour
45: #define HTOutputBinary HTOuBina
46: #endif
2.31 ! frystyk 47: </PRE>
2.1 timbl 48:
2.31 ! frystyk 49: <H2><A NAME="z15">Buffering for network I/O</A></H2>
2.18 luotonen 50:
2.31 ! frystyk 51: These routines provide buffering for READ (and future WRITE) to the
! 52: network. They are used by all the protocol modules. The size of the
! 53: buffer, <CODE>INPUT_BUFFER_SIZE</CODE>, is a compromise between speed
! 54: and memory.
2.18 luotonen 55:
2.31 ! frystyk 56: <PRE>
! 57: #define INPUT_BUFFER_SIZE 8192
2.18 luotonen 58:
2.17 luotonen 59: typedef struct _socket_buffer {
2.25 luotonen 60: char input_buffer[INPUT_BUFFER_SIZE];
2.17 luotonen 61: char * input_pointer;
62: char * input_limit;
63: int input_file_number;
64: } HTInputSocket;
2.31 ! frystyk 65: </PRE>
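The following is a minimal sketch of how the three pointer fields of the structure above could cooperate to serve single characters from a refillable buffer. The helper names <CODE>sketch_init</CODE> and <CODE>sketch_get_char</CODE> are hypothetical and are not part of the Library; the sketch assumes a POSIX <CODE>read()</CODE> on the bound descriptor and does no ASCII conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define INPUT_BUFFER_SIZE 8192

typedef struct _socket_buffer {
    char   input_buffer[INPUT_BUFFER_SIZE];
    char * input_pointer;      /* next unread byte */
    char * input_limit;        /* one past the last valid byte */
    int    input_file_number;  /* descriptor the buffer is bound to */
} HTInputSocket;

/* Hypothetical helper: bind a caller-owned buffer to a descriptor. */
static void sketch_init(HTInputSocket * isoc, int file_number)
{
    assert(isoc != NULL);
    isoc->input_pointer = isoc->input_buffer;
    isoc->input_limit = isoc->input_buffer;   /* empty: forces a refill */
    isoc->input_file_number = file_number;
}

/* Hypothetical helper: next byte as 0..255, or -1 on end of file or error. */
static int sketch_get_char(HTInputSocket * isoc)
{
    assert(isoc != NULL);
    if (isoc->input_pointer >= isoc->input_limit) {
        /* Buffer exhausted: refill it with one read() call. */
        ssize_t n = read(isoc->input_file_number,
                         isoc->input_buffer, INPUT_BUFFER_SIZE);
        if (n <= 0) return -1;
        isoc->input_pointer = isoc->input_buffer;
        isoc->input_limit = isoc->input_buffer + n;
    }
    return (unsigned char) *isoc->input_pointer++;
}
```

Because the structure embeds an 8192 byte array, a caller of such a sketch would normally keep it in static or heap storage rather than on a small stack.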
! 66:
! 67: <H3>Create an Input Buffer</H3>
2.17 luotonen 68:
2.31 ! frystyk 69: This function allocates an input buffer and binds it to the socket
! 70: descriptor given as parameter.
! 71:
! 72: <PRE>
! 73: extern HTInputSocket* HTInputSocket_new PARAMS((int file_number));
2.17 luotonen 74: </PRE>
75:
2.31 ! frystyk 76: <H3>Free an Input Buffer</H3>
! 77:
! 78: <PRE>
! 79: extern void HTInputSocket_free PARAMS((HTInputSocket * isoc));
2.17 luotonen 80: </PRE>
81:
2.31 ! frystyk 82: <A NAME="Read"><H2>Read from the Network</H2></A>
! 83:
! 84: This function has replaced many other functions for doing read from
! 85: the network. It automaticly converts from ASCII if we are on a
! 86: NON-ASCII machine. This assumes that we do <B>not</B> use this
! 87: function to read a local file on a NON-ASCII machine. The following
! 88: type definition is to make life easier when having a state machine
! 89: looking for a <CODE><CRLF></CODE> sequence.
! 90:
! 91: <PRE>
! 92: typedef enum _HTSocketEOL {
! 93: EOL_ERR = -1,
! 94: EOL_BEGIN = 0,
! 95: EOL_FCR,
! 96: EOL_FLF,
! 97: EOL_DOT,
! 98: EOL_SCR,
! 99: EOL_SLF
! 100: } HTSocketEOL;
2.17 luotonen 101:
2.31 ! frystyk 102: PUBLIC int HTInputSocket_read PARAMS((HTInputSocket * isoc,
! 103: HTStream * target));
2.17 luotonen 104: </PRE>
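As an illustration of why an explicit state type helps, here is a small sketch of a CRLF detector that keeps its state between calls, so a CR at the end of one network read and an LF at the start of the next are still recognised as one sequence. Only the states <CODE>EOL_BEGIN</CODE>, <CODE>EOL_FCR</CODE> and <CODE>EOL_FLF</CODE> are used; the helper names are hypothetical, and the meaning given to the states here is an assumption, not the behaviour of <CODE>HTInputSocket_read</CODE>.

```c
#include <assert.h>
#include <stddef.h>

typedef enum _HTSocketEOL {
    EOL_ERR = -1,
    EOL_BEGIN = 0,
    EOL_FCR,
    EOL_FLF,
    EOL_DOT,
    EOL_SCR,
    EOL_SLF
} HTSocketEOL;

/* Hypothetical helper: advance the machine by one input byte.
   Assumption: EOL_FCR means "CR seen", EOL_FLF means "CR LF seen". */
static HTSocketEOL sketch_eol_step(HTSocketEOL state, char c)
{
    if (c == '\r') return EOL_FCR;                    /* a CR (re)starts a match */
    if (c == '\n' && state == EOL_FCR) return EOL_FLF;
    return EOL_BEGIN;                                 /* anything else resets */
}

/* Hypothetical helper: count complete CRLF pairs in one block.
   *state is carried over so a pair may straddle two blocks. */
static int sketch_count_crlf(HTSocketEOL * state, const char * buf, size_t len)
{
    int count = 0;
    size_t i;
    assert(state != NULL && buf != NULL);
    for (i = 0; i < len; i++) {
        *state = sketch_eol_step(*state, buf[i]);
        if (*state == EOL_FLF) count++;
    }
    return count;
}
```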
105:
2.31 ! frystyk 106: <H2><A NAME="z11">Convert Net ASCII to local representation</A></H2>
2.22 timbl 107:
2.31 ! frystyk 108: This is a filter stream suitable for taking text from a socket and
! 109: passing it into a stream which expects text in the local C
! 110: representation. It does newline conversion. As usual, pass its
! 111: output stream to it when creating it.
2.17 luotonen 112:
2.31 ! frystyk 113: <PRE>
! 114: extern HTStream * HTNetToText PARAMS ((HTStream * sink));
2.17 luotonen 115: </PRE>
2.25 luotonen 116:
2.31 ! frystyk 117: <A NAME="FormatTypes"><H2>The HTFormat types</H2></A>
2.25 luotonen 118:
2.31 ! frystyk 119: We use the <A HREF="HTAtom.html">HTAtom</A> object for holding
! 120: representations. This allows faster manipulation (comparison and
! 121: copying) than if we stayed with strings. These macros (which used to
! 122: be constants) define some basic internally referenced
! 123: representations.<P>
2.25 luotonen 124:
2.31 ! frystyk 125: <PRE>
! 126: typedef HTAtom * HTFormat;
2.28 frystyk 127: </PRE>
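To show why atoms make comparison cheap, the sketch below interns strings in a simple linked list so that equal names always yield the same pointer, and comparing two formats becomes a pointer comparison. The type <CODE>SketchAtom</CODE> and the function <CODE>sketch_atom_for</CODE> are hypothetical stand-ins written for this illustration; they are not the HTAtom implementation, which may be organised quite differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an atom: one node per distinct name. */
typedef struct sketch_atom {
    struct sketch_atom * next;
    char * name;
} SketchAtom;

static SketchAtom * sketch_atoms = NULL;   /* list of all interned names */

/* Return the unique atom for name, creating it on first use.
   Returns NULL only if memory runs out. */
static SketchAtom * sketch_atom_for(const char * name)
{
    SketchAtom * a;
    size_t n;
    assert(name != NULL);
    for (a = sketch_atoms; a != NULL; a = a->next)
        if (strcmp(a->name, name) == 0) return a;   /* already interned */
    n = strlen(name) + 1;
    a = malloc(sizeof *a);
    if (a == NULL) return NULL;
    a->name = malloc(n);
    if (a->name == NULL) { free(a); return NULL; }
    memcpy(a->name, name, n);
    a->next = sketch_atoms;
    sketch_atoms = a;
    return a;
}
```

With such a scheme the string comparison cost is paid once per lookup, and every later equality test between two formats is a single pointer comparison.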
128:
129: <H3>Internal ones</H3>
130:
2.31 ! frystyk 131: The <CODE>www/xxx</CODE> ones are of course not MIME
! 132: standard. <P>
2.28 frystyk 133:
2.31 ! frystyk 134: <CODE>star/star</CODE> is an output format which leaves the input
! 135: untouched. It is useful for diagnostics, and for users who want to see
! 136: the original, whatever it is.
2.28 frystyk 137:
138: <PRE>
139: #define WWW_SOURCE HTAtom_for("*/*") /* Whatever it was originally */
140: </PRE>
141:
2.31 ! frystyk 142: <CODE>www/present</CODE> represents the user's perception of the
! 143: document. If you convert to www/present, you present the material to
! 144: the user.
2.10 timbl 145:
2.28 frystyk 146: <PRE>
147: #define WWW_PRESENT HTAtom_for("www/present") /* The user's perception */
148: </PRE>
149:
150: The message/rfc822 format means a MIME message or a plain text message
151: with no MIME header. This is what is returned by an HTTP server.
152:
153: <PRE>
154: #define WWW_MIME HTAtom_for("www/mime") /* A MIME message */
155: </PRE>
156:
2.31 ! frystyk 157: <CODE>www/print</CODE> is like www/present except it represents a
! 158: printed copy.
2.28 frystyk 159:
160: <PRE>
161: #define WWW_PRINT HTAtom_for("www/print") /* A printed copy */
162: </PRE>
2.13 timbl 163:
2.31 ! frystyk 164: <CODE>www/unknown</CODE> is a really unknown type. Some default
! 165: action is appropriate.
2.13 timbl 166:
2.28 frystyk 167: <PRE>
168: #define WWW_UNKNOWN HTAtom_for("www/unknown")
2.13 timbl 169: </PRE>
2.28 frystyk 170:
171:
2.31 ! frystyk 172: <H3>MIME ones</H3>
2.28 frystyk 173:
2.31 ! frystyk 174: These are the regular MIME types defined here. Others can be added.
2.28 frystyk 175:
176: <PRE>
177: #define WWW_PLAINTEXT HTAtom_for("text/plain")
2.1 timbl 178: #define WWW_POSTSCRIPT HTAtom_for("application/postscript")
179: #define WWW_RICHTEXT HTAtom_for("application/rtf")
2.10 timbl 180: #define WWW_AUDIO HTAtom_for("audio/basic")
2.1 timbl 181: #define WWW_HTML HTAtom_for("text/html")
2.11 timbl 182: #define WWW_BINARY HTAtom_for("application/octet-stream")
2.26 frystyk 183: #define WWW_VIDEO HTAtom_for("video/mpeg")
2.28 frystyk 184: </PRE>
185:
2.31 ! frystyk 186: Extra types used in the library (EXPERIMENT)
2.28 frystyk 187:
188: <PRE>
189: #define WWW_NEWSLIST HTAtom_for("text/newslist")
190: </PRE>
2.7 timbl 191:
2.28 frystyk 192: We must include the following file after defining HTFormat, to which
2.10 timbl 193: it makes reference.
2.28 frystyk 194:
2.10 timbl 195: <H2>The HTEncoding type</H2>
196:
2.31 ! frystyk 197: <PRE>
! 198: typedef HTAtom* HTEncoding;
! 199: </PRE>
! 200:
! 201: The following are the MIME content transfer encoding values:
! 202:
! 203: <PRE>
! 204: #define WWW_ENC_7BIT HTAtom_for("7bit")
2.10 timbl 205: #define WWW_ENC_8BIT HTAtom_for("8bit")
206: #define WWW_ENC_BINARY HTAtom_for("binary")
2.31 ! frystyk 207: </PRE>
! 208:
! 209: We also add
! 210:
! 211: <PRE>
! 212: #define WWW_ENC_COMPRESS HTAtom_for("compress")
! 213: </PRE>
2.10 timbl 214:
2.31 ! frystyk 215: <A NAME="Encoding"><H2>Registration of Accepted Content Encodings</H2></A>
2.10 timbl 216:
2.31 ! frystyk 217: This function is not currently used.
2.1 timbl 218:
2.31 ! frystyk 219: <PRE>
! 220: typedef struct _HTContentDescription {
! 221: char * filename;
! 222: HTAtom * content_type;
! 223: HTAtom * content_language;
! 224: HTAtom * content_encoding;
! 225: int content_length;
! 226: float quality;
! 227: } HTContentDescription;
! 228:
! 229: PUBLIC void HTAcceptEncoding PARAMS((HTList * list,
! 230: char * enc,
! 231: float quality));
2.1 timbl 232: </PRE>
233:
2.31 ! frystyk 234: <A NAME="Language"><H2>Registration of Accepted Content Languages</H2></A>
2.28 frystyk 235:
2.31 ! frystyk 236: This function is not currently used.
2.28 frystyk 237:
238: <PRE>
2.31 ! frystyk 239: PUBLIC void HTAcceptLanguage PARAMS((HTList * list,
! 240: char * lang,
! 241: float quality));
! 242: </PRE>
2.28 frystyk 243:
2.31 ! frystyk 244: <A NAME="Type"><H2>Registration of Accepted Converters and
! 245: Presenters</H2></A>
! 246:
! 247: A <CODE><A NAME="z12">converter</A></CODE> is a stream with a special
! 248: set of parameters and which is registered as capable of converting
! 249: from a MIME type to something else (maybe another MIME-type). A
! 250: <CODE>presenter</CODE> is a module (possibly an external program)
! 251: which can present a graphic object of a certain MIME type to the
! 252: user. That is, <CODE>presenters</CODE> are normally used to present
! 253: graphic objects that the <CODE>converters</CODE> are not able to
! 254: handle. Data is transferred to the external program using the <A
! 255: HREF="HTFWriter.html">HTSaveAndExecute</A> stream which writes to a
! 256: local file. This stream is actually a normal <CODE>converter</CODE>,
! 257: i.e., a stream having the following set of parameters:<P>
! 258:
! 259: <PRE>
! 260: #include "HTAccess.h" /* Required for HTRequest definition */
! 261:
! 262: typedef HTStream * HTConverter PARAMS((HTRequest * request,
! 263: void * param,
! 264: HTFormat input_format,
! 265: HTFormat output_format,
! 266: HTStream * output_stream));
! 267: </PRE>
! 268:
! 269: Both <CODE>converters</CODE> and <CODE>presenters</CODE> are set up in
! 270: a list which is used by the <A HREF="#z3">StreamStack</A> module to
! 271: find the best way to pass the information to the user. <P>
! 272:
! 273: The <CODE>HTPresentation</CODE> structure contains both
! 274: <CODE>converters</CODE> and <CODE>presenters</CODE>, and it is defined
! 275: as:
! 276:
! 277: <PRE>
! 278: typedef struct _HTPresentation {
! 279: HTAtom* rep; /* representation name atomized */
! 280: HTAtom* rep_out; /* resulting representation */
! 281: HTConverter *converter; /* The routine to gen the stream stack */
! 282: char * command; /* MIME-format string */
! 283: char * test_command; /* MIME-format string */
! 284: float quality; /* Between 0 (bad) and 1 (good) */
2.1 timbl 285: float secs;
286: float secs_per_byte;
2.31 ! frystyk 287: } HTPresentation;
2.28 frystyk 288: </PRE>
289:
2.1 timbl 290:
2.31 ! frystyk 291: <A NAME="z17"><H3>Global List of Converters</H3></A>
2.1 timbl 292:
2.31 ! frystyk 293: This module keeps a global list of converters. This can be used to get
! 294: the set of supported formats. <P>
2.12 timbl 295:
2.31 ! frystyk 296: <B>NOTE:</B> There is also a conversion list associated with each
! 297: request in the <A HREF="HTAccess.html#z1">HTRequest structure</A>.
! 298:
! 299: <PRE>
! 300: extern HTList * HTConversions;
2.1 timbl 301: </PRE>
2.31 ! frystyk 302:
! 303: <H3>Register a Presenter</H3>
! 304:
2.1 timbl 305: <DL>
2.31 ! frystyk 306: <DT>conversions
! 307: <DD>The list of <CODE>converters</CODE> and <CODE>presenters</CODE>
! 308: <DT>representation
! 309: <DD>the <A HREF="#Type">MIME-style</A> format name
2.1 timbl 310: <DT>command
2.31 ! frystyk 311: <DD>the MAILCAP-style command template
2.1 timbl 312: <DT>quality
2.31 ! frystyk 313: <DD>A degradation factor [0..1]
2.1 timbl 314: <DT>maxbytes
2.31 ! frystyk 315: <DD>A limit on the length acceptable as input (0 infinite)
2.1 timbl 316: <DT>maxsecs
2.31 ! frystyk 317: <DD>A limit on the time user will wait (0 for infinity)
2.1 timbl 318: </DL>
319:
2.31 ! frystyk 320: <PRE>
! 321: extern void HTSetPresentation PARAMS((HTList * conversions,
! 322: CONST char * representation,
! 323: CONST char * command,
! 324: CONST char * test_command,
! 325: float quality,
! 326: float secs,
! 327: float secs_per_byte));
! 328: </PRE>
2.1 timbl 329:
2.31 ! frystyk 330: <H3>Register a Converter</H3>
2.1 timbl 331:
332: <DL>
2.31 ! frystyk 333: <DT>conversions
! 334: <DD>The list of <CODE>converters</CODE> and <CODE>presenters</CODE>
2.1 timbl 335: <DT>rep_in
2.31 ! frystyk 336: <DD>the <A HREF="#Type">MIME-style</A> format name
2.1 timbl 337: <DT>rep_out
2.31 ! frystyk 338: <DD>is the resulting content-type after the conversion
2.1 timbl 339: <DT>converter
2.31 ! frystyk 340: <DD>is the routine to call which actually does the conversion
! 341: <DT>quality
! 342: <DD>A degradation factor [0..1]
! 343: <DT>maxbytes
! 344: <DD>A limit on the length acceptable as input (0 infinite)
! 345: <DT>maxsecs
! 346: <DD>A limit on the time user will wait (0 for infinity)
2.1 timbl 347: </DL>
348:
349: <PRE>
2.31 ! frystyk 350: extern void HTSetConversion PARAMS((HTList * conversions,
! 351: CONST char * rep_in,
! 352: CONST char * rep_out,
! 353: HTConverter * converter,
! 354: float quality,
! 355: float secs,
! 356: float secs_per_byte));
! 357: </PRE>
! 358:
! 359: <H3>Set up Default Presenters and Converters</H3>
! 360:
! 361: A default set of <CODE>converters</CODE> and <CODE>presenters</CODE>
! 362: is defined in <A HREF="HTInit.c">HTInit.c</A> (or <A
! 363: HREF="../../Daemon/Inplementation/HTSUtils.c">HTSInit.c</A> in the
! 364: server). <P>
! 365:
! 366: <B>NOTE: </B> No automatic initialization is done in the Library, so
! 367: the application must do this itself.
! 368:
! 369: <PRE>
! 370: extern void HTFormatInit PARAMS((HTList * conversions));
! 371: </PRE>
! 372:
! 373: This function also exists in a version where no
! 374: <CODE>presenters</CODE> are initialized. This is intended for Non
! 375: Interactive Mode, e.g., in the Line Mode Browser.
! 376:
! 377: <PRE>
! 378: extern void HTFormatInitNIM PARAMS((HTList * conversions));
! 379: </PRE>
! 380:
! 381: <H3>Remove presentations and conversions</H3>
2.1 timbl 382:
2.31 ! frystyk 383: This function deletes the list of <CODE>converters</CODE> and
! 384: <CODE>presenters</CODE>.
2.1 timbl 385:
2.31 ! frystyk 386: <PRE>
! 387: extern void HTFormatDelete PARAMS((HTRequest * request));
! 388: </PRE>
! 389:
! 390:
! 391: <A NAME="Rank"><H2>Ranking of Accepted Formats</H2></A>
! 392:
! 393: This function is not currently used.
! 394:
! 395: <PRE>
! 396: PUBLIC BOOL HTRank PARAMS((HTList * possibilities,
! 397: HTList * accepted_content_types,
! 398: HTList * accepted_content_languages,
! 399: HTList * accepted_content_encodings));
2.1 timbl 400: </PRE>
2.31 ! frystyk 401:
! 402: <H2><A NAME="z3">HTStreamStack</A></H2>
! 403:
! 404: This is the routine which actually sets up the conversion. It
! 405: currently checks only for direct conversions, but multi-stage
! 406: conversions are foreseen. It takes a stream into which the output
! 407: should be sent in the final format, builds the conversion stack, and
! 408: returns a stream into which the data in the input format should be
! 409: fed.<P>
! 410:
2.23 luotonen 411: If <CODE>guess</CODE> is true and input format is
2.31 ! frystyk 412: <CODE>www/unknown</CODE>, try to guess the format by looking at the
! 413: first few bytes of the stream. <P>
2.1 timbl 414:
2.31 ! frystyk 415: <PRE>
! 416: PUBLIC HTStream * HTStreamStack PARAMS((HTFormat rep_in,
! 417: HTFormat rep_out,
! 418: HTStream * output_stream,
! 419: HTRequest * request,
! 420: BOOL guess));
2.1 timbl 421: </PRE>
2.31 ! frystyk 422:
! 423: <H3>Find the cost of a filter stack</H3>
! 424:
! 425: Must return the cost of the same stack which HTStreamStack would set
2.1 timbl 426: up.
427:
2.31 ! frystyk 428: <PRE>
2.1 timbl 429: #define NO_VALUE_FOUND -1e20 /* returned if none found */
430:
2.31 ! frystyk 431: extern float HTStackValue PARAMS((HTList * conversions,
! 432: HTFormat format_in,
! 433: HTFormat format_out,
! 434: float initial_value,
! 435: long int length));
2.1 timbl 436: </PRE>
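The sketch below shows one plausible way a cost could be derived from the <CODE>quality</CODE>, <CODE>secs</CODE> and <CODE>secs_per_byte</CODE> fields for a direct conversion: the initial value is scaled by the quality, and the estimated time is subtracted as a fraction of a time budget. The formula, the <CODE>max_secs</CODE> parameter, the use of plain strings instead of atoms, and the names <CODE>SketchPresentation</CODE> and <CODE>sketch_stack_value</CODE> are all assumptions made for illustration, not the definition of <CODE>HTStackValue</CODE>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NO_VALUE_FOUND -1e20    /* returned if none found */

/* Hypothetical, simplified stand-in for HTPresentation. */
typedef struct {
    const char * rep;            /* input representation */
    const char * rep_out;        /* resulting representation */
    float        quality;        /* between 0 (bad) and 1 (good) */
    float        secs;           /* fixed time cost */
    float        secs_per_byte;  /* time cost per input byte */
} SketchPresentation;

/* Assumed cost model: value = initial_value * quality, reduced by the
   estimated time as a fraction of max_secs (0 disables the time term). */
static float sketch_stack_value(const SketchPresentation * list, size_t count,
                                const char * format_in, const char * format_out,
                                float initial_value, long length, float max_secs)
{
    size_t i;
    assert(list != NULL || count == 0);
    for (i = 0; i < count; i++) {
        const SketchPresentation * p = &list[i];
        if (strcmp(p->rep, format_in) == 0 &&
            strcmp(p->rep_out, format_out) == 0) {
            float value = initial_value * p->quality;
            if (max_secs > 0.0f)
                value -= ((float) length * p->secs_per_byte + p->secs) / max_secs;
            return value;
        }
    }
    return (float) NO_VALUE_FOUND;   /* no direct conversion registered */
}
```

Only direct conversions are searched here, in line with the description of HTStreamStack above.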
2.31 ! frystyk 437:
! 438: <HR>
! 439:
! 440: <B><IMG
! 441: ALIGN=middle SRC="http://info.cern.ch/hypertext/WWW/Icons/32x32/caution.gif"> ATTENTION <IMG ALIGN=middle SRC="http://info.cern.ch/hypertext/WWW/Icons/32x32/caution.gif"> <P>
! 442:
! 443: THE REST OF THE FUNCTIONS DEFINED IN THIS MODULE ARE GOING TO BE
! 444: OBSOLETE SO DO NOT USE THEM - THEY ARE NOT REENTRANT.</B>
! 445:
! 446: <HR>
! 447:
2.1 timbl 448: <H2><A
2.10 timbl 449: NAME="z1">HTCopy: Copy a socket to a stream</A></H2>This is used by the protocol engines
2.6 secret 450: to send data down a stream, typically
2.22 timbl 451: one which has been generated by HTStreamStack.
452: Returns the number of bytes transferred.
2.19 luotonen 453: <PRE>extern int HTCopy PARAMS((
2.1 timbl 454: int file_number,
455: HTStream* sink));
456:
457:
2.6 secret 458: </PRE>
459: <H2><A
2.10 timbl 460: NAME="c6">HTFileCopy: Copy a file to a stream</A></H2>This is used by the protocol engines
2.6 secret 461: to send data down a stream, typically
2.7 timbl 462: one which has been generated by HTStreamStack.
463: It is currently called by <A
2.12 timbl 464: NAME="z9" HREF="#c7">HTParseFile</A>
2.6 secret 465: <PRE>extern void HTFileCopy PARAMS((
466: FILE* fp,
467: HTStream* sink));
468:
469:
2.7 timbl 470: </PRE>
471: <H2><A
2.10 timbl 472: NAME="c2">HTCopyNoCR: Copy a socket to a stream,
2.7 timbl 473: stripping CR characters.</A></H2>It is slower than <A
2.12 timbl 474: NAME="z2" HREF="#z1">HTCopy</A> .
2.1 timbl 475: <PRE>
476: extern void HTCopyNoCR PARAMS((
477: int file_number,
478: HTStream* sink));
479:
2.16 luotonen 480:
481: </PRE>
2.1 timbl 482: <H2>HTParseSocket: Parse a socket given
483: its format</H2>This routine is called by protocol
484: modules to load an object. It uses <A
2.12 timbl 485: NAME="z4" HREF="#z3">
2.1 timbl 486: HTStreamStack</A> and the copy routines
487: above. Returns HT_LOADED if successful,
488: <0 if not.
489: <PRE>extern int HTParseSocket PARAMS((
490: HTFormat format_in,
491: int file_number,
2.13 timbl 492: HTRequest * request));
2.6 secret 493:
494: </PRE>
495: <H2><A
2.10 timbl 496: NAME="c1">HTParseFile: Parse a File through
2.7 timbl 497: a file pointer</A></H2>This routine is called by protocol
498: modules to load an object. It uses <A
2.12 timbl 499: NAME="z4" HREF="#z3"> HTStreamStack</A>
2.7 timbl 500: and <A
2.12 timbl 501: NAME="c7" HREF="#c6">HTFileCopy</A> . Returns HT_LOADED
2.7 timbl 502: if successful, <0 if not.
2.6 secret 503: <PRE>extern int HTParseFile PARAMS((
504: HTFormat format_in,
505: FILE *fp,
2.13 timbl 506: HTRequest * request));
2.8 timbl 507:
508: </PRE>
2.21 frystyk 509:
2.31 ! frystyk 510:
! 511: <H3>Get next character from buffer</H3>
! 512:
! 513: <PRE>extern int HTInputSocket_getCharacter PARAMS((HTInputSocket* isoc));
2.21 frystyk 514: </PRE>
515:
2.31 ! frystyk 516: <H3>Read block from input socket</H3>Reads up to *len characters and returns a
! 517: buffer (do not free it) containing *len
! 518: characters (*len may have changed).
! 519: The buffer is not NUL-terminated.
! 520: <PRE>extern char * HTInputSocket_getBlock PARAMS((HTInputSocket * isoc,
! 521: int * len));
! 522:
! 523: PUBLIC char * HTInputSocket_getLine PARAMS((HTInputSocket * isoc));
! 524: PUBLIC char * HTInputSocket_getUnfoldedLine PARAMS((HTInputSocket * isoc));
! 525: PUBLIC char * HTInputSocket_getStatusLine PARAMS((HTInputSocket * isoc));
! 526: PUBLIC BOOL HTInputSocket_seemsBinary PARAMS((HTInputSocket * isoc));
! 527:
2.1 timbl 528: #endif
529:
2.31 ! frystyk 530: </PRE>
! 531:
! 532:
! 533: End of definition module
! 534:
! 535: </BODY>
2.10 timbl 536: </HTML>