Annotation of libwww/Library/src/HTFormat.html, revision 2.10

2.10    ! timbl       1: <HTML>
        !             2: <HEAD>
2.1       timbl       3: <TITLE>HTFormat: The format manager in the WWW Library</TITLE>
2.10    ! timbl       4: <NEXTID N="z11">
        !             5: </HEAD>
2.1       timbl       6: <BODY>
                      7: <H1>Manage different document formats</H1>Here we describe the functions of
                      8: the HTFormat module which handles
                      9: conversion between different data
                     10: representations.  (In MIME parlance,
                     11: a representation is known as a content-type.
2.2       timbl      12: In WWW  the term "format" is often
2.1       timbl      13: used as it is shorter).<P>
                     14: This module is implemented by <A
2.10    ! timbl      15: NAME="z0" HREF="HTFormat.c">HTFormat.c</A>
2.7       timbl      16: . This hypertext document is used
                     17: to generate the <A
2.10    ! timbl      18: NAME="z8" HREF="HTFormat.h">HTFormat.h</A> include
2.9       timbl      19: file.  Part of the <A
2.10    ! timbl      20: NAME="z10" HREF="Overview.html">WWW library</A> .
2.1       timbl      21: <H2>Preamble</H2>
                     22: <PRE>#ifndef HTFORMAT_H
                     23: #define HTFORMAT_H
                     24: 
                     25: #include "HTUtils.h"
                     26: #include <A
2.10    ! timbl      27: NAME="z7" HREF="HTStream.html">"HTStream.h"</A>
2.1       timbl      28: #include "HTAtom.h"
2.2       timbl      29: #include "HTList.h"
2.1       timbl      30: 
                     31: #ifdef SHORT_NAMES
                     32: #define HTOutputSource HTOuSour
                     33: #define HTOutputBinary HTOuBina
                     34: #endif
                     35: 
                     36: </PRE>
                     37: <H2>The HTFormat type</H2>We use the HTAtom object for holding
                     38: representations. This allows faster
                     39: manipulation (comparison and copying)
                     40: than if we stayed with strings.
                     41: <PRE>typedef HTAtom * HTFormat;
                     42:                        
                     43: </PRE>These macros (which used to be constants)
                     44: define some basic internally referenced
2.2       timbl      45: representations.  The www/xxx ones
2.1       timbl      46: are of course not MIME standard.<P>
                     47: www/source  is an output format which
                     48: leaves the input untouched. It is
                     49: useful for diagnostics, and for users
                     50: who want to see the original, whatever
                     51: it is.
                     52: <PRE>                  /* Internal ones */
                     53: #define WWW_SOURCE HTAtom_for("www/source")    /* Whatever it was originally*/
                     54: 
                     55: </PRE>www/present represents the user's
                     56: perception of the document.  If you
                     57: convert to www/present, you present
                     58: the material to the user. 
                     59: <PRE>#define WWW_PRESENT HTAtom_for("www/present")     /* The user's perception */
                     60: 
                     61: </PRE>The message/rfc822 format means a
                     62: MIME message or a plain text message
                     63: with no MIME header. This is what
                     64: is returned by an HTTP server.
                     65: <PRE>#define WWW_MIME HTAtom_for("www/mime")           /* A MIME message */
2.10    ! timbl      66: 
2.1       timbl      67: </PRE>www/print is like www/present except
                     68: it represents a printed copy.
                     69: <PRE>#define WWW_PRINT HTAtom_for("www/print") /* A printed copy */
                     70: 
2.10    ! timbl      71: </PRE>www/unknown is a really unknown type.
        !            72:  Some default action is appropriate.
        !            73: <PRE>#define WWW_UNKNOWN     HTAtom_for("www/unknown")
        !            74: 
        !            75: </PRE>These are regular MIME types:
        !            76: <PRE>#define WWW_PLAINTEXT     HTAtom_for("text/plain")
2.1       timbl      77: #define WWW_POSTSCRIPT         HTAtom_for("application/postscript")
                     78: #define WWW_RICHTEXT   HTAtom_for("application/rtf")
2.10    ! timbl      79: #define WWW_AUDIO       HTAtom_for("audio/basic")
2.1       timbl      80: #define WWW_HTML       HTAtom_for("text/html")
                     81: #define WWW_BINARY     HTAtom_for("application/binary")
2.7       timbl      82: 
2.1       timbl      83: </PRE>We must include the following file
                     84: after defining HTFormat, to which
2.10    ! timbl      85: it makes reference.
        !            86: <H2>The HTEncoding type</H2>
        !            87: <PRE>typedef HTAtom* HTEncoding;
        !            88: 
        !            89: </PRE>The following are values for the
        !            90: MIME content-transfer-encoding:
        !            91: <PRE>#define WWW_ENC_7BIT              HTAtom_for("7bit")
        !            92: #define WWW_ENC_8BIT           HTAtom_for("8bit")
        !            93: #define WWW_ENC_BINARY         HTAtom_for("binary")
        !            94: 
        !            95: </PRE>We also add
        !            96: <PRE>#define WWW_ENC_COMPRESS  HTAtom_for("compress")
        !            97: 
        !            98: #include "HTAnchor.h"
2.1       timbl      99: 
                    100: </PRE>
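The benefit of holding formats and encodings as atoms can be illustrated with a minimal interning sketch. The names demo_atom, demo_atom_for and demo_same_format are hypothetical stand-ins and are not the HTAtom implementation; the sketch only assumes that an atom table returns one unique pointer per distinct name, so that comparing two formats becomes a pointer comparison instead of a string comparison.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for HTAtom: one node per distinct name. */
typedef struct demo_atom {
    struct demo_atom *next;
    char *name;
} demo_atom;

static demo_atom *demo_atoms = NULL;    /* linked list of all atoms */

/* Return the unique atom for `name`, creating it on first use.
   Returns NULL only if memory allocation fails. */
demo_atom *demo_atom_for(const char *name)
{
    demo_atom *a;
    size_t len;

    assert(name != NULL);
    for (a = demo_atoms; a != NULL; a = a->next)
        if (strcmp(a->name, name) == 0)
            return a;                   /* already interned */

    a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    len = strlen(name) + 1;
    a->name = malloc(len);
    if (a->name == NULL) {
        free(a);
        return NULL;
    }
    memcpy(a->name, name, len);
    a->next = demo_atoms;
    demo_atoms = a;
    return a;
}

/* With interned names, format equality is a pointer comparison. */
int demo_same_format(const demo_atom *a, const demo_atom *b)
{
    return a == b;
}
```

In this sketch the string comparison is paid once, at lookup time; every later comparison or copy of a format handles a single pointer.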
                    101: <H2>The HTPresentation and HTConverter
                    102: types</H2>This HTPresentation structure represents
                    103: a possible conversion algorithm from
                    104: one format to another.  It includes
                    105: a pointer to a conversion routine.
                    106: The conversion routine returns a
                    107: stream to which data should be fed.
                    108: See also <A
2.10    ! timbl     109: NAME="z5" HREF="#z3">HTStreamStack</A> which scans
2.1       timbl     110: the list of registered converters
                    111: and calls one. See the <A
2.10    ! timbl     112: NAME="z6" HREF="HTInit.html">initialisation
2.1       timbl     113: module</A> for a list of conversion routines.
                    114: <PRE>typedef struct _HTPresentation HTPresentation;
                    115: 
2.2       timbl     116: typedef HTStream * HTConverter PARAMS((
2.1       timbl     117:        HTPresentation *        pres,
                    118:        HTParentAnchor *        anchor,
                    119:        HTStream *              sink));
                    120:        
                    121: struct _HTPresentation {
                    122:        HTAtom* rep;            /* representation name atomized */
                    123:        HTAtom* rep_out;        /* resulting representation */
2.2       timbl     124:        HTConverter *converter; /* The routine to gen the stream stack */
2.1       timbl     125:        char *  command;        /* MIME-format string */
                    126:        float   quality;        /* Between 0 (bad) and 1 (good) */
                    127:        float   secs;
                    128:        float   secs_per_byte;
                    129: };
                    130: 
                    131: </PRE>The list of presentations is kept
                    132: by this module.  It is also scanned
                    133: by modules which want to know the
                    134: set of formats supported, for example.
                    135: <PRE>extern HTList * HTPresentations;
                    136: 
                    137: </PRE>
                    138: <H2>HTSetPresentation: Register a system
                    139: command to present a format</H2>
2.8       timbl     140: <H3>On entry,</H3>
2.1       timbl     141: <DL>
                    142: <DT>rep
                    143: <DD> is the MIME-style format name
                    144: <DT>command
                    145: <DD> is the MAILCAP-style command
                    146: template
                    147: <DT>quality
                    148: <DD> A degradation factor 0..1
                    149: <DT>maxbytes
                    150: <DD> A limit on the length acceptable
                    151: as input (0 infinite)
                    152: <DT>maxsecs
                    153: <DD> A limit on the time user
                    154: will wait (0 for infinity)
                    155: </DL>
                    156: 
                    157: <PRE>extern void HTSetPresentation PARAMS((
                    158:        CONST char * representation,
                    159:        CONST char * command,
                    160:        float   quality,
                    161:        float   secs, 
                    162:        float   secs_per_byte
                    163: ));
                    164: 
                    165: 
                    166: </PRE>
                    167: <H2>HTSetConversion:   Register a conversion
                    168: routine</H2>
2.8       timbl     169: <H3>On entry,</H3>
2.1       timbl     170: <DL>
                    171: <DT>rep_in
                    172: <DD> is the content-type input
                    173: <DT>rep_out
                    174: <DD> is the resulting content-type
                    175: <DT>converter
                    176: <DD> is the routine to make
                    177: the stream to do it
                    178: </DL>
                    179: 
                    180: <PRE>
                    181: extern void HTSetConversion PARAMS((
                    182:        CONST char *    rep_in,
                    183:        CONST char *    rep_out,
2.2       timbl     184:        HTConverter *   converter,
2.1       timbl     185:        float           quality,
                    186:        float           secs, 
                    187:        float           secs_per_byte
                    188: ));
                    189: 
                    190: 
                    191: </PRE>
                    192: <H2><A
2.10    ! timbl     193: NAME="z3">HTStreamStack:   Create a stack of
2.1       timbl     194: streams</A></H2>This is the routine which actually
                    195: sets up the conversion. It currently
                    196: checks only for direct conversions,
2.8       timbl     197: but multi-stage conversions are foreseen.
2.2       timbl     198: It takes a stream into which the
2.1       timbl     199: output should be sent in the final
                    200: format, builds the conversion stack,
                    201: and returns a stream into which the
                    202: data in the input format should be
                    203: fed.  The anchor is passed because
                    204: hypertext objects load information
                    205: into the anchor object which represents
                    206: them.
                    207: <PRE>extern HTStream * HTStreamStack PARAMS((
                    208:        HTFormat                format_in,
                    209:        HTFormat                format_out,
                    210:        HTStream*               stream_out,
                    211:        HTParentAnchor*         anchor));
                    212: 
                    213: </PRE>
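The registration and direct-lookup steps described above can be sketched as follows. This is an illustration under stated assumptions, not the code in HTFormat.c: all demo_ names are hypothetical, formats are plain strings instead of atoms, the registry is a fixed-size table instead of an HTList, and returning the sink unchanged when the two formats are equal is an assumed behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_CONVERTERS 16

/* Hypothetical stream: only a label and a link to the next stage. */
typedef struct demo_stream demo_stream;
struct demo_stream {
    const char  *label;
    demo_stream *next;
};

/* Like HTConverter, a function type: given the sink, return the
   stream into which input data should be fed. */
typedef demo_stream *demo_converter(demo_stream *sink);

typedef struct {
    const char     *rep_in;     /* input format name */
    const char     *rep_out;    /* resulting format name */
    demo_converter *converter;
    float           quality;    /* between 0 (bad) and 1 (good) */
} demo_presentation;

static demo_presentation demo_table[DEMO_MAX_CONVERTERS];
static size_t demo_count = 0;

/* Register a converter. Returns 0 on success, -1 if the table is full. */
int demo_set_conversion(const char *rep_in, const char *rep_out,
                        demo_converter *converter, float quality)
{
    demo_presentation *p;

    assert(rep_in != NULL && rep_out != NULL && converter != NULL);
    if (demo_count == DEMO_MAX_CONVERTERS)
        return -1;
    p = &demo_table[demo_count++];
    p->rep_in = rep_in;
    p->rep_out = rep_out;
    p->converter = converter;
    p->quality = quality;
    return 0;
}

/* Direct conversions only: scan the table for an exact (in, out) match
   and let the converter build the stream in front of `sink`.
   Returns NULL if no registered converter matches. */
demo_stream *demo_stream_stack(const char *format_in, const char *format_out,
                               demo_stream *sink)
{
    size_t i;

    assert(format_in != NULL && format_out != NULL);
    if (strcmp(format_in, format_out) == 0)
        return sink;            /* nothing to convert (assumed behaviour) */
    for (i = 0; i < demo_count; i++) {
        const demo_presentation *p = &demo_table[i];
        if (strcmp(p->rep_in, format_in) == 0 &&
            strcmp(p->rep_out, format_out) == 0)
            return p->converter(sink);
    }
    return NULL;
}

/* Example converter: puts one fixed stage in front of the sink. */
static demo_stream demo_plain_stage = { "text/plain to www/present", NULL };

demo_stream *demo_plain_converter(demo_stream *sink)
{
    demo_plain_stage.next = sink;
    return &demo_plain_stage;
}
```

A caller would register demo_plain_converter once, then ask demo_stream_stack for a stream and feed it data in the input format; the returned stage forwards to the sink it was given.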
                    214: <H2>HTStackValue: Find the cost of a
                    215: filter stack</H2>Must return the cost of the same
                    216: stack which HTStreamStack would set
                    217: up.
2.8       timbl     218: <H3>On entry,</H3>
2.1       timbl     219: <DL>
                    220: <DT>format_in
                    221: <DD> The format of the data to
                    222: be converted
                    223: <DT>format_out
                    224: <DD> The format required
                    225: <DT>initial_value
                    226: <DD> The intrinsic "value"
                    227: of the data before conversion on
                    228: a scale from 0 to 1
                    229: <DT>length
                    230: <DD> The number of bytes expected
                    231: in the input format
                    232: </DL>
                    233: 
                    234: <PRE>extern float HTStackValue PARAMS((
                    235:        HTFormat                format_in,
                    236:        HTFormat                rep_out,
                    237:        float                   initial_value,
                    238:        long int                length));
                    239: 
                    240: #define NO_VALUE_FOUND -1e20           /* returned if none found */
                    241: 
                    242: </PRE>
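The header does not state how the cost is computed, so the following is only one plausible model, given as a sketch. It assumes that the value of a single conversion is the initial value scaled by the converter's quality, reduced by the estimated conversion time expressed as a fraction of the time the user is willing to wait. The names demo_cost, demo_stack_value and the max_secs parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The three cost fields carried by a presentation (hypothetical copy). */
typedef struct {
    float quality;          /* between 0 (bad) and 1 (good) */
    float secs;             /* fixed start-up time */
    float secs_per_byte;    /* time per byte of input */
} demo_cost;

/* Assumed cost model, not taken from the library:
     value = initial_value * quality
             - (secs + secs_per_byte * length) / max_secs
   A max_secs of 0 means that time is ignored. */
float demo_stack_value(const demo_cost *c, float initial_value,
                       long length, float max_secs)
{
    float value;

    assert(c != NULL);
    value = initial_value * c->quality;
    if (max_secs > 0.0f)
        value -= (c->secs + c->secs_per_byte * (float)length) / max_secs;
    return value;
}
```

Under this model a slow but high-quality converter can lose to a faster, lower-quality one when the input is long, which is the kind of trade-off the quality, secs and secs_per_byte fields appear to be designed for.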
                    243: <H2><A
2.10    ! timbl     244: NAME="z1">HTCopy:  Copy a socket to a stream</A></H2>This is used by the protocol engines
2.6       secret    245: to send data down a stream, typically
2.1       timbl     246: one which has been generated by HTStreamStack.
                    247: <PRE>extern void HTCopy PARAMS((
                    248:        int                     file_number,
                    249:        HTStream*               sink));
                    250: 
                    251:        
2.6       secret    252: </PRE>
                    253: <H2><A
2.10    ! timbl     254: NAME="c6">HTFileCopy:  Copy a file to a stream</A></H2>This is used by the protocol engines
2.6       secret    255: to send data down a stream, typically
2.7       timbl     256: one which has been generated by HTStreamStack.
                    257: It is currently called by <A
2.10    ! timbl     258: NAME="z9" HREF="#c7">HTParseFile</A>
2.6       secret    259: <PRE>extern void HTFileCopy PARAMS((
                    260:        FILE*                   fp,
                    261:        HTStream*               sink));
                    262: 
                    263:        
2.7       timbl     264: </PRE>
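Copying a file to a stream amounts to a read loop that hands each block to the sink. The sketch below assumes a sink modelled as a callback plus a context pointer; demo_put_block, demo_file_copy and demo_count_bytes are hypothetical names, and the real HTStream interface is not shown in this header.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sink interface: receives one block of data at a time. */
typedef void demo_put_block(void *ctx, const char *buf, size_t len);

/* Read `fp` to end of file (or error) and pass every block to `put`.
   Returns the total number of bytes copied. */
size_t demo_file_copy(FILE *fp, demo_put_block *put, void *ctx)
{
    char buf[4096];
    size_t n, total = 0;

    assert(fp != NULL && put != NULL);
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        put(ctx, buf, n);
        total += n;
    }
    return total;
}

/* Example sink: counts the bytes it receives. */
void demo_count_bytes(void *ctx, const char *buf, size_t len)
{
    (void)buf;                      /* contents are not needed here */
    *(size_t *)ctx += len;
}
```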
                    265: <H2><A
2.10    ! timbl     266: NAME="c2">HTCopyNoCR: Copy a socket to a stream,
2.7       timbl     267: stripping CR characters.</A></H2>It is slower than <A
2.10    ! timbl     268: NAME="z2" HREF="#z1">HTCopy</A> .
2.1       timbl     269: <PRE>
                    270: extern void HTCopyNoCR PARAMS((
                    271:        int                     file_number,
                    272:        HTStream*               sink));
                    273: 
                    274: 
                    275: </PRE>
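Stripping CR characters can be sketched on an in-memory buffer. The function demo_copy_no_cr is a hypothetical illustration, not the library routine, which reads from a socket. It also suggests one likely reason for the extra cost: every byte has to be examined individually, whereas a plain copy can pass whole blocks along.

```c
#include <assert.h>
#include <stddef.h>

/* Copy `n` bytes from `in` to `out`, dropping every carriage return.
   `out` must have room for `n` bytes.
   Returns the number of bytes written. */
size_t demo_copy_no_cr(const char *in, size_t n, char *out)
{
    size_t i, written = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++)
        if (in[i] != '\r')
            out[written++] = in[i];
    return written;
}
```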
                    276: <H2>Clear input buffer and set file number</H2>This routine and the one below provide
                    277: simple character input from sockets.
                    278: (They are left over from the older
                    279: architecture and may not be used very
                    280: much.)  The existence of a common
                    281: routine and buffer saves memory space
                    282: in small implementations.
                    283: <PRE>extern void HTInitInput PARAMS((int file_number));
                    284: 
                    285: </PRE>
                    286: <H2>Get next character from buffer</H2>
                    287: <PRE>extern char HTGetChararcter NOPARAMS;
                    288: 
                    289: 
                    290: </PRE>
                    291: <H2>HTParseSocket: Parse a socket given
                    292: its format</H2>This routine is called by protocol
                    293: modules to load an object.  It uses <A
2.10    ! timbl     294: NAME="z4" HREF="#z3">
2.1       timbl     295: HTStreamStack</A> and the copy routines
                    296: above.  Returns HT_LOADED if successful,
                    297: &lt;0 if not.
                    298: <PRE>extern int HTParseSocket PARAMS((
                    299:        HTFormat        format_in,
                    300:        HTFormat        format_out,
                    301:        HTParentAnchor  *anchor,
                    302:        int             file_number,
2.6       secret    303:        HTStream*       sink));
                    304: 
                    305: </PRE>
                    306: <H2><A
2.10    ! timbl     307: NAME="c1">HTParseFile: Parse a File through
2.7       timbl     308: a file pointer</A></H2>This routine is called by protocol
                    309: modules to load an object. It uses <A
2.10    ! timbl     310: NAME="z4" HREF="#z3"> HTStreamStack</A>
2.7       timbl     311: and <A
2.10    ! timbl     312: NAME="c7" HREF="#c6">HTFileCopy</A> .  Returns HT_LOADED
2.7       timbl     313: if successful, &lt;0 if not.
2.6       secret    314: <PRE>extern int HTParseFile PARAMS((
                    315:        HTFormat        format_in,
                    316:        HTFormat        format_out,
                    317:        HTParentAnchor  *anchor,
                    318:        FILE            *fp,
2.1       timbl     319:        HTStream*       sink));
2.8       timbl     320: 
                    321: </PRE>
                    322: <H2>HTFormatInit: Set up default presentations
                    323: and conversions</H2>These are defined in HTInit.c or
                    324: HTSInit.c if these have been replaced.
                    325: If you don't call this routine, and
                    326: you don't define any presentations,
                    327: then this routine will automatically
                    328: be called the first time a conversion
                    329: is needed. However, if you explicitly
                    330: add some conversions (e.g. using HTLoadRules)
                    331: then you may want also to explicitly
                    332: call this to get the defaults as
                    333: well.
                    334: <PRE>extern void HTFormatInit NOPARAMS;
2.1       timbl     335: 
                    336: </PRE>
                    337: <H2>Epilogue</H2>
                    338: <PRE>extern BOOL HTOutputSource;       /* Flag: shortcut parser */
                    339: #endif
                    340: 
2.7       timbl     341: </PRE>end</A></BODY>
2.10    ! timbl     342: </HTML>
