Annotation of libwww/Library/src/HTFormat.html, revision 2.9

2.1       timbl       1: <HEADER>
                      2: <TITLE>HTFormat: The format manager in the WWW Library</TITLE>
2.9     ! timbl       3: <NEXTID N="11">
2.1       timbl       4: </HEADER>
                      5: <BODY>
                      6: <H1>Manage different document formats</H1>Here we describe the functions of
                      7: the HTFormat module which handles
                      8: conversion between different data
                      9: representations.  (In MIME parlance,
                     10: a representation is known as a content-type.
2.2       timbl      11: In WWW  the term "format" is often
2.1       timbl      12: used as it is shorter).<P>
                     13: This module is implemented by <A
2.7       timbl      14: NAME=z0 HREF="HTFormat.c">HTFormat.c</A>
                     15: . This hypertext document is used
                     16: to generate the <A
                     17: NAME=z8 HREF="HTFormat.h">HTFormat.h</A> include
2.9     ! timbl      18: file.  Part of the <A
        !            19: NAME=z10 HREF="Overview.html">WWW library</A>.
2.1       timbl      20: <H2>Preamble</H2>
                     21: <PRE>#ifndef HTFORMAT_H
                     22: #define HTFORMAT_H
                     23: 
                     24: #include "HTUtils.h"
                     25: #include <A
                     26: NAME=z7 HREF="HTStream.html">"HTStream.h"</A>
                     27: #include "HTAtom.h"
2.2       timbl      28: #include "HTList.h"
2.1       timbl      29: 
                     30: #ifdef SHORT_NAMES
                     31: #define HTOutputSource HTOuSour
                     32: #define HTOutputBinary HTOuBina
                     33: #endif
                     34: 
                     35: </PRE>
                     36: <H2>The HTFormat type</H2>We use the HTAtom object for holding
                     37: representations. This allows faster
                     38: manipulation (comparison and copying)
                     39: than if we stayed with strings.
                     40: <PRE>typedef HTAtom * HTFormat;
                     41:                        
                     42: </PRE>These macros (which used to be constants)
                     43: define some basic internally referenced
2.2       timbl      44: representations.  The www/xxx ones
2.1       timbl      45: are of course not MIME standard.<P>
                     46: www/source  is an output format which
                     47: leaves the input untouched. It is
                     48: useful for diagnostics, and for users
                     49: who want to see the original, whatever
                     50: it is.
                     51: <PRE>                  /* Internal ones */
                     52: #define WWW_SOURCE HTAtom_for("www/source")    /* Whatever it was originally*/
                     53: 
                     54: </PRE>www/present represents the user's
                     55: perception of the document.  If you
                     56: convert to www/present, you present
                     57: the material to the user. 
                     58: <PRE>#define WWW_PRESENT HTAtom_for("www/present")     /* The user's perception */
                     59: 
                     60: </PRE>The message/rfc822 format means a
                     61: MIME message or a plain text message
                     62: with no MIME header. This is what
                     63: is returned by an HTTP server.
                     64: <PRE>#define WWW_MIME HTAtom_for("www/mime")           /* A MIME message */
                     65: </PRE>www/print is like www/present except
                     66: it represents a printed copy.
                     67: <PRE>#define WWW_PRINT HTAtom_for("www/print") /* A printed copy */
                     68: 
                     69: #define WWW_PLAINTEXT  HTAtom_for("text/plain")
                     70: #define WWW_POSTSCRIPT         HTAtom_for("application/postscript")
                     71: #define WWW_RICHTEXT   HTAtom_for("application/rtf")
                     72: #define WWW_HTML       HTAtom_for("text/html")
                     73: #define WWW_BINARY     HTAtom_for("application/binary")
2.7       timbl      74: 
2.1       timbl      75: </PRE>We must include the following file
                     76: after defining HTFormat, to which
2.9     ! timbl      77: it makes reference.<P>
        !            78: The HTEncoding type<P>
        !            79: typedef HTAtom* HTEncoding;<P>
        !            80: The following are values for the
        !            81: MIME types:<P>
        !            82: #define WWW_ENC_7BIT<P>
        !            83: #define WWW_ENC_8BIT<P>
        !            84: #define WWW_ENC_BINARY<P>
        !            85: We also add
2.1       timbl      86: <PRE>#include "HTAnchor.h"
                     87: 
                     88: </PRE>
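The reason an atom makes comparison fast can be shown with a minimal sketch. It assumes an atom table interns each distinct name once, so that two formats are equal exactly when their pointers are equal. The names DemoAtom, DemoFormat, demo_atom_for and demo_format_equal are hypothetical stand-ins for illustration; they are not the HTAtom implementation, which this document does not show.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for HTAtom: one node per distinct name. */
typedef struct DemoAtom {
    struct DemoAtom *next;
    char *name;
} DemoAtom;

typedef DemoAtom *DemoFormat;        /* mirrors "typedef HTAtom * HTFormat" */

/* Assumption: a plain linked list; a real atom table would likely hash. */
static DemoAtom *demo_atoms = NULL;

/* Return the unique atom for a name, creating it on first use. */
DemoFormat demo_atom_for(const char *name)
{
    DemoAtom *a;
    size_t len;

    assert(name != NULL);
    for (a = demo_atoms; a != NULL; a = a->next)
        if (strcmp(a->name, name) == 0)
            return a;                /* already interned */

    a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    len = strlen(name) + 1;
    a->name = malloc(len);
    if (a->name == NULL) {
        free(a);
        return NULL;
    }
    memcpy(a->name, name, len);
    a->next = demo_atoms;
    demo_atoms = a;
    return a;
}

/* Comparing two formats is a pointer comparison, not a strcmp. */
int demo_format_equal(DemoFormat a, DemoFormat b)
{
    return a == b;
}
```

Copying a format is likewise a pointer assignment; the string itself is stored once in the table.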
                     89: <H2>The HTPresentation and HTConverter
                     90: types</H2>This HTPresentation structure represents
                     91: a possible conversion algorithm from
                     92: one format to another.  It includes
                     93: a pointer to a conversion routine.
                     94: The conversion routine returns a
                     95: stream to which data should be fed.
                     96: See also <A
                     97: NAME=z5 HREF="#z3">HTStreamStack</A> which scans
                     98: the list of registered converters
                     99: and calls one. See the <A
                    100: NAME=z6 HREF="HTInit.html">initialisation
                    101: module</A> for a list of conversion routines.
                    102: <PRE>typedef struct _HTPresentation HTPresentation;
                    103: 
2.2       timbl     104: typedef HTStream * HTConverter PARAMS((
2.1       timbl     105:        HTPresentation *        pres,
                    106:        HTParentAnchor *        anchor,
                    107:        HTStream *              sink));
                    108:        
                    109: struct _HTPresentation {
                    110:        HTAtom* rep;            /* representation name atomized */
                    111:        HTAtom* rep_out;        /* resulting representation */
2.2       timbl     112:        HTConverter *converter; /* The routine to gen the stream stack */
2.1       timbl     113:        char *  command;        /* MIME-format string */
                    114:        float   quality;        /* Between 0 (bad) and 1 (good) */
                    115:        float   secs;
                    116:        float   secs_per_byte;
                    117: };
                    118: 
                    119: </PRE>The list of presentations is kept
                    120: by this module.  It is also scanned
                    121: by modules which want to know the
                    122: set of formats supported, for example.
                    123: <PRE>extern HTList * HTPresentations;
                    124: 
                    125: </PRE>
                    126: <H2>HTSetPresentation: Register a system
                    127: command to present a format</H2>
2.8       timbl     128: <H3>On entry,</H3>
2.1       timbl     129: <DL>
                    130: <DT>rep
                    131: <DD> is the MIME - style format name
                    132: <DT>command
                    133: <DD> is the MAILCAP - style command
                    134: template
                    135: <DT>quality
                    136: <DD> A degradation factor 0..1
                    137: <DT>maxbytes
                    138: <DD> A limit on the length acceptable
                    139: as input (0 infinite)
                    140: <DT>maxsecs
                    141: <DD> A limit on the time user
                    142: will wait (0 for infinity)
                    143: </DL>
                    144: 
                    145: <PRE>extern void HTSetPresentation PARAMS((
                    146:        CONST char * representation,
                    147:        CONST char * command,
                    148:        float   quality,
                    149:        float   secs, 
                    150:        float   secs_per_byte
                    151: ));
                    152: 
                    153: 
                    154: </PRE>
                    155: <H2>HTSetConversion:   Register a conversion
                    156: routine</H2>
2.8       timbl     157: <H3>On entry,</H3>
2.1       timbl     158: <DL>
                    159: <DT>rep_in
                    160: <DD> is the content-type input
                    161: <DT>rep_out
                    162: <DD> is the resulting content-type
                    163: <DT>converter
                    164: <DD> is the routine to make
                    165: the stream to do it
                    166: </DL>
                    167: 
                    168: <PRE>
                    169: extern void HTSetConversion PARAMS((
                    170:        CONST char *    rep_in,
                    171:        CONST char *    rep_out,
2.2       timbl     172:        HTConverter *   converter,
2.1       timbl     173:        float           quality,
                    174:        float           secs, 
                    175:        float           secs_per_byte
                    176: ));
                    177: 
                    178: 
                    179: </PRE>
                    180: <H2><A
                    181: NAME=z3>HTStreamStack:   Create a stack of
                    182: streams</A></H2>This is the routine which actually
                    183: sets up the conversion. It currently
                    184: checks only for direct conversions,
2.8       timbl     185: but multi-stage conversions are foreseen.
2.2       timbl     186: It takes a stream into which the
2.1       timbl     187: output should be sent in the final
                    188: format, builds the conversion stack,
                    189: and returns a stream into which the
                    190: data in the input format should be
                    191: fed.  The anchor is passed because
                    192: hypertext objects load information
                    193: into the anchor object which represents
                    194: them.
                    195: <PRE>extern HTStream * HTStreamStack PARAMS((
                    196:        HTFormat                format_in,
                    197:        HTFormat                format_out,
                    198:        HTStream*               stream_out,
                    199:        HTParentAnchor*         anchor));
                    200: 
                    201: </PRE>
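The direct-conversion lookup described above can be sketched as follows. This is a toy model under stated assumptions, not the library's code: DemoStream, DemoPresentation, DemoConverter and demo_stream_stack are hypothetical names, formats are compared as plain strings rather than atoms, the anchor argument is omitted, and the caller supplies storage for the converting stream. It shows only the shape of the idea: scan the registered converters for one matching the input and output formats, and let it build a stream that feeds the given sink.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for HTStream: a put_char method plus a sink. */
typedef struct DemoStream DemoStream;
struct DemoStream {
    void (*put_char)(DemoStream *me, char c);
    DemoStream *sink;       /* next stream in the stack, or NULL */
    char buf[64];           /* used only by the collecting stream */
    size_t len;
};

/* Terminal stream: collects characters into its own buffer. */
static void collect_put(DemoStream *me, char c)
{
    if (me->len + 1 < sizeof me->buf) {
        me->buf[me->len++] = c;
        me->buf[me->len] = '\0';
    }
}

void demo_collector_init(DemoStream *s)
{
    s->put_char = collect_put;
    s->sink = NULL;
    s->len = 0;
    s->buf[0] = '\0';
}

/* Converting stream: upper-cases each character and passes it on. */
static void upper_put(DemoStream *me, char c)
{
    me->sink->put_char(me->sink, (char)toupper((unsigned char)c));
}

typedef DemoStream *DemoConverter(DemoStream *storage, DemoStream *sink);

static DemoStream *make_upper(DemoStream *storage, DemoStream *sink)
{
    storage->put_char = upper_put;
    storage->sink = sink;
    storage->len = 0;
    return storage;
}

typedef struct {
    const char *rep_in;
    const char *rep_out;
    DemoConverter *converter;
} DemoPresentation;

/* Assumption: a fixed table stands in for the registered list. */
static const DemoPresentation demo_presentations[] = {
    { "demo/lower", "demo/upper", make_upper }
};

/* Look for a direct conversion only; return NULL if none is registered. */
DemoStream *demo_stream_stack(const char *in, const char *out,
                              DemoStream *storage, DemoStream *sink)
{
    size_t i;
    size_t n = sizeof demo_presentations / sizeof demo_presentations[0];

    assert(storage != NULL && sink != NULL);
    for (i = 0; i < n; i++) {
        const DemoPresentation *p = &demo_presentations[i];
        if (strcmp(p->rep_in, in) == 0 && strcmp(p->rep_out, out) == 0)
            return p->converter(storage, sink);
    }
    return NULL;
}
```

The caller writes input-format data to the returned stream, and converted data arrives at the sink it passed in.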
                    202: <H2>HTStackValue: Find the cost of a
                    203: filter stack</H2>Must return the cost of the same
                    204: stack which HTStreamStack would set
                    205: up.
2.8       timbl     206: <H3>On entry,</H3>
2.1       timbl     207: <DL>
                    208: <DT>format_in
                    209: <DD> The format of the data to
                    210: be converted
                    211: <DT>format_out
                    212: <DD> The format required
                    213: <DT>initial_value
                    214: <DD> The intrinsic "value"
                    215: of the data before conversion on
                    216: a scale from 0 to 1
                    217: <DT>length
                    218: <DD> The number of bytes expected
                    219: in the input format
                    220: </DL>
                    221: 
                    222: <PRE>extern float HTStackValue PARAMS((
                    223:        HTFormat                format_in,
                    224:        HTFormat                rep_out,
                    225:        float                   initial_value,
                    226:        long int                length));
                    227: 
                    228: #define NO_VALUE_FOUND -1e20           /* returned if none found */
                    229: 
                    230: </PRE>
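This document does not state how the cost is computed, so the following is only one plausible cost model, given as a sketch. It assumes the value is the initial value scaled by the converter's quality, reduced by the estimated conversion time as a fraction of a time budget. The names DemoCost, demo_stack_value, DEMO_NO_VALUE_FOUND and the max_secs parameter are all hypothetical, and formats are compared as strings for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NO_VALUE_FOUND -1e20f   /* mirrors NO_VALUE_FOUND */

/* Hypothetical cost record; fields echo those of HTPresentation. */
typedef struct {
    const char *rep_in;
    const char *rep_out;
    float quality;          /* between 0 (bad) and 1 (good) */
    float secs;             /* fixed start-up time */
    float secs_per_byte;    /* time per input byte */
} DemoCost;

/*
 * Assumed model, not confirmed by the source:
 *   value = initial_value * quality
 *           - (length * secs_per_byte + secs) / max_secs
 * A max_secs of 0 or less disables the time penalty.
 */
float demo_stack_value(const DemoCost *table, size_t n,
                       const char *in, const char *out,
                       float initial_value, long length, float max_secs)
{
    size_t i;

    assert(table != NULL || n == 0);
    for (i = 0; i < n; i++) {
        const DemoCost *c = &table[i];
        if (strcmp(c->rep_in, in) == 0 && strcmp(c->rep_out, out) == 0) {
            float value = initial_value * c->quality;
            if (max_secs > 0.0f)
                value -= ((float)length * c->secs_per_byte + c->secs)
                         / max_secs;
            return value;
        }
    }
    return DEMO_NO_VALUE_FOUND;      /* no direct conversion registered */
}
```

Whatever model is used, it should scan the same entries in the same order as the stack builder, so that the value reported is for the stack that would actually be set up.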
                    231: <H2><A
                    232: NAME=z1>HTCopy:  Copy a socket to a stream</A></H2>This is used by the protocol engines
2.6       secret    233: to send data down a stream, typically
2.1       timbl     234: one which has been generated by HTStreamStack.
                    235: <PRE>extern void HTCopy PARAMS((
                    236:        int                     file_number,
                    237:        HTStream*               sink));
                    238: 
                    239:        
2.6       secret    240: </PRE>
                    241: <H2><A
                    242: NAME=c6>HTFileCopy:  Copy a file to a stream</A></H2>This is used by the protocol engines
                    243: to send data down a stream, typically
2.7       timbl     244: one which has been generated by HTStreamStack.
                    245: It is currently called by <A
                    246: NAME=z9 HREF="#c7">HTParseFile</A>
2.6       secret    247: <PRE>extern void HTFileCopy PARAMS((
                    248:        FILE*                   fp,
                    249:        HTStream*               sink));
                    250: 
                    251:        
2.7       timbl     252: </PRE>
                    253: <H2><A
                    254: NAME=c2>HTCopyNoCR: Copy a socket to a stream,
                    255: stripping CR characters.</A></H2>It is slower than <A
2.1       timbl     256: NAME=z2 HREF="#z1">HTCopy</A> .
                    257: <PRE>
                    258: extern void HTCopyNoCR PARAMS((
                    259:        int                     file_number,
                    260:        HTStream*               sink));
                    261: 
                    262: 
                    263: </PRE>
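A likely reason for the speed difference is that stripping CR requires examining every character, whereas a plain copy can hand whole blocks to the sink. The sketch below illustrates the per-character filter on an in-memory buffer. The name demo_copy_no_cr is hypothetical, and the real routine reads from a socket and writes to an HTStream rather than to a second buffer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy n bytes from src to dst, dropping every carriage return.
 * dst must have room for at least n bytes.
 * Returns the number of bytes written.
 */
size_t demo_copy_no_cr(const char *src, size_t n, char *dst)
{
    size_t i;
    size_t written = 0;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++) {
        if (src[i] != '\r')          /* one test per input character */
            dst[written++] = src[i];
    }
    return written;
}
```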
                    264: <H2>Clear input buffer and set file number</H2>This routine and the one below provide
                    265: simple character input from sockets.
                    266: (They are left over from the older
                    267: architecture and may not be used very
                    268: much.)  The existence of a common
                    269: routine and buffer saves memory space
                    270: in small implementations.
                    271: <PRE>extern void HTInitInput PARAMS((int file_number));
                    272: 
                    273: </PRE>
                    274: <H2>Get next character from buffer</H2>
                    275: <PRE>extern char HTGetChararcter NOPARAMS;
                    276: 
                    277: 
                    278: </PRE>
                    279: <H2>HTParseSocket: Parse a socket given
                    280: its format</H2>This routine is called by protocol
                    281: modules to load an object.  It uses <A
                    282: NAME=z4 HREF="#z3">
                    283: HTStreamStack</A> and the copy routines
                    284: above.  Returns HT_LOADED if successful,
                    285: &lt;0 if not.
                    286: <PRE>extern int HTParseSocket PARAMS((
                    287:        HTFormat        format_in,
                    288:        HTFormat        format_out,
                    289:        HTParentAnchor  *anchor,
                    290:        int             file_number,
2.6       secret    291:        HTStream*       sink));
                    292: 
                    293: </PRE>
                    294: <H2><A
2.7       timbl     295: NAME=c1>HTParseFile: Parse a File through
                    296: a file pointer</A></H2>This routine is called by protocol
                    297: modules to load an object. It uses <A
                    298: NAME=z4 HREF="#z3"> HTStreamStack</A>
                    299: and <A
                    300: NAME=c7 HREF="#c6">HTFileCopy</A> .  Returns HT_LOADED
                    301: if successful, &lt;0 if not.
2.6       secret    302: <PRE>extern int HTParseFile PARAMS((
                    303:        HTFormat        format_in,
                    304:        HTFormat        format_out,
                    305:        HTParentAnchor  *anchor,
                    306:        FILE            *fp,
2.1       timbl     307:        HTStream*       sink));
2.8       timbl     308: 
                    309: </PRE>
                    310: <H2>HTFormatInit: Set up default presentations
                    311: and conversions</H2>These are defined in HTInit.c or
                    312: HTSInit.c if these have been replaced.
                    313: If you don't call this routine, and
                    314: you don't define any presentations,
                    315: then this routine will automatically
                    316: be called the first time a conversion
                    317: is needed. However, if you explicitly
                    318: add some conversions (eg using HTLoadRules)
                    319: then you may want also to explicitly
                    320: call this to get the defaults as
                    321: well.
                    322: <PRE>extern void HTFormatInit NOPARAMS;
2.1       timbl     323: 
                    324: </PRE>
                    325: <H2>Epilogue</H2>
                    326: <PRE>extern BOOL HTOutputSource;       /* Flag: shortcut parser */
                    327: #endif
                    328: 
2.7       timbl     329: </PRE>end</A></BODY>
