Annotation of libwww/Library/src/HTML.html, revision 2.26
2.7 timbl 1: <HTML>
2: <HEAD>
2.26 ! frystyk 3: <TITLE>HTML to rich text Converter</TITLE>
! 4: <!-- Changed by: Henrik Frystyk Nielsen, 14-Aug-1995 -->
2.7 timbl 5: </HEAD>
2.6 timbl 6: <BODY>
2.20 frystyk 7:
8: <H1>The HTML to styled text object converter</H1>
9:
10: <PRE>
11: /*
2.24 frystyk 12: ** (c) COPYRIGHT MIT 1995.
2.20 frystyk 13: ** Please first read the full copyright statement in the file COPYRIGH.
14: */
15: </PRE>
16:
17: This interprets the <A
2.23 frystyk 18: HREF="http://www.w3.org/hypertext/WWW/MarkUp/MarkUp.html">HTML</A>
2.20 frystyk 19: semantics and some HTMLPlus.<P>
20:
21: This module is implemented by <A HREF="HTML.c">HTML.c</A>, and it is
22: a part of the <A
2.25 frystyk 23: HREF="http://www.w3.org/hypertext/WWW/Library/">
24: W3C Reference Library</A>.
2.20 frystyk 25:
26: <PRE>
27: #ifndef HTML_H
2.1 timbl 28: #define HTML_H
29:
2.13 luotonen 30: #include "HTFormat.h"
2.1 timbl 31: #include "HTAnchor.h"
2.8 timbl 32: #include "HTMLPDTD.h"
2.1 timbl 33:
2.8 timbl 34: #define DTD HTMLP_dtd
2.1 timbl 35:
36: extern CONST HTStructuredClass HTMLPresentation;
2.17 frystyk 37: </PRE>
38:
39: <H2>HTML_new: A structured stream to parse HTML</H2>
2.1 timbl 40:
2.17 frystyk 41: When this routine is called, the request structure may contain a <A
42: NAME="z4" HREF="HTAccess.html#z6">childAnchor</A> value. In that case
43: it is the responsibility of this module to select the anchor.<P>
2.12 timbl 44:
2.9 timbl 45: <PRE>extern HTStructured* HTML_new PARAMS((HTRequest * request,
2.8 timbl 46: void * param,
47: HTFormat input_format,
48: HTFormat output_format,
49: HTStream * output_stream));
50:
2.10 luotonen 51: </PRE>
2.1 timbl 52:
2.17 frystyk 53: <H3>Reopen</H3>
54:
55: Reopening an existing HTML object allows it to be retained (for
56: example by the styled text object) after the structured stream has
57: been closed. To be actually deleted, the HTML object must be closed
58: one more time than it has been reopened.
59:
60: <PRE>
61: extern void HTML_reopen PARAMS((HTStructured * me));
2.10 luotonen 62: </PRE>
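The close/reopen rule above can be read as a simple retain count. The following is a minimal sketch of that reading, not the libwww implementation: `DemoObject`, `demo_new`, `demo_reopen`, `demo_close` and the `freed_flag` field are all hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the HTML object; NOT the libwww HTStructured. */
typedef struct DemoObject {
    int extra_refs;   /* reopen calls not yet balanced by a close */
    int *freed_flag;  /* demo only: set to 1 when the object is really deleted */
} DemoObject;

/* Create an object that a single close would delete. */
DemoObject *demo_new(int *freed_flag)
{
    DemoObject *me;
    assert(freed_flag != NULL);
    me = malloc(sizeof *me);
    if (me == NULL) return NULL;
    me->extra_refs = 0;
    me->freed_flag = freed_flag;
    *freed_flag = 0;
    return me;
}

/* Each reopen means one additional close is needed before deletion. */
void demo_reopen(DemoObject *me)
{
    assert(me != NULL);
    me->extra_refs++;
}

/* Returns 1 if the object was deleted, 0 if it is still retained. */
int demo_close(DemoObject *me)
{
    assert(me != NULL);
    if (me->extra_refs > 0) {
        me->extra_refs--;
        return 0;
    }
    *me->freed_flag = 1;
    free(me);
    return 1;
}
```

With one reopen, the first `demo_close` only drops the extra reference and the second one deletes the object, which matches "closed one more time than it has been reopened".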
2.17 frystyk 63:
2.10 luotonen 64: <H2>Converters</H2>
2.8 timbl 65:
2.17 frystyk 66: These are the converters implemented in this module:
2.8 timbl 67:
2.17 frystyk 68: <PRE>
2.22 frystyk 69: #ifndef pyramid
2.17 frystyk 70: extern HTConverter HTMLToPlain, HTMLToC, HTMLPresent, HTMLToTeX;
71: #endif
2.8 timbl 72: </PRE>
2.17 frystyk 73:
2.8 timbl 74: <H2>Selecting internal character set
75: representations</H2>
76: <PRE>typedef enum _HTMLCharacterSet {
2.1 timbl 77: HTML_ISO_LATIN1,
78: HTML_NEXT_CHARS,
79: HTML_PC_CP950
80: } HTMLCharacterSet;
81:
82: extern void HTMLUseCharacterSet PARAMS((HTMLCharacterSet i));
83:
2.6 timbl 84: </PRE>
85: <H2>Record error message as a hypertext
86: object</H2>The error message should be marked
87: as an error so that it can be reloaded
88: later. This implementation just throws
89: up an error message and leaves the
90: document unloaded.
2.10 luotonen 91: <H3>On entry,</H3>
92: <DL>
93: <DT>sink
2.11 timbl 94: <DD> is a stream to the output device
2.10 luotonen 95: if any
96: <DT>number
2.11 timbl 97: <DD> is the HTTP error number
2.10 luotonen 98: <DT>message
2.11 timbl 99: <DD> is the human readable message.
2.10 luotonen 100: </DL>
101:
102: <H3>On exit,</H3>a return code such as HT_LOADED if the object
103: exists, otherwise a negative value
2.19 frystyk 104: <PRE>extern int HTLoadError PARAMS((
2.14 luotonen 105: HTRequest * req,
2.1 timbl 106: int number,
107: CONST char * message));
108:
2.6 timbl 109:
2.16 timbl 110: </PRE>
2.22 frystyk 111:
112: <H2>White Space Treatment</H2>
113:
114: There are a small number of different ways of treating white space in
115: SGML when mapping from a text object to HTML. It seems these have to
116: be programmed explicitly.
117:
118: <PRE>
2.16 timbl 119: /*
120:              In text object  \n\n           \n     tab    \n\n\t
121:              --------------  -------------  -----  -----  -------
122:              in Address,
123:              Blockquote,
2.22 frystyk 124: Normal,         <P>            <BR>   -               NORMAL
125:              H1-6:           close+open     <BR>   -               HEADING
126:              Glossary        <DT>           <DT>   <DD>   <P>      GLOSSARY
2.16 timbl 127:   List,
2.22 frystyk 128: Menu            <LI>           <LI>   -      <P>      LIST
129:              Dir             <LI>           <LI>   <LI>            DIR
2.16 timbl 130:   Pre etc         \n\n           \n     \t              PRE
2.7 timbl 131:
2.16 timbl 132: */
133:
134: typedef enum _white_space_treatment {
135: WS_NORMAL,
136: WS_HEADING,
137: WS_GLOSSARY,
138: WS_LIST,
139: WS_DIR,
140: WS_PRE
141: } white_space_treatment;
142:
143: </pre>
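The table in the comment above can be expressed as a lookup keyed by treatment and white-space sequence. The sketch below is one possible reading, under stated assumptions: the `ws_sequence` enum, `ws_table` and `ws_markup` are hypothetical names, cells are filled left to right from the table, a blank cell is represented as `NULL`, and the `-` and `close+open` cells are kept as literal text because the source does not say how they are acted on. For `WS_PRE` the white space is assumed to pass through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same enumeration as declared in this header. */
typedef enum _white_space_treatment {
    WS_NORMAL, WS_HEADING, WS_GLOSSARY, WS_LIST, WS_DIR, WS_PRE
} white_space_treatment;

/* Hypothetical: the four white-space sequences in the table header. */
typedef enum {
    SEQ_NL_NL,      /* \n\n   */
    SEQ_NL,         /* \n     */
    SEQ_TAB,        /* tab    */
    SEQ_NL_NL_TAB,  /* \n\n\t */
    SEQ_COUNT
} ws_sequence;

/* One row per treatment; NULL marks a cell the table leaves blank. */
static const char *const ws_table[WS_PRE + 1][SEQ_COUNT] = {
    /* WS_NORMAL   */ { "<P>",        "<BR>", "-",    NULL  },
    /* WS_HEADING  */ { "close+open", "<BR>", "-",    NULL  },
    /* WS_GLOSSARY */ { "<DT>",       "<DT>", "<DD>", "<P>" },
    /* WS_LIST     */ { "<LI>",       "<LI>", "-",    "<P>" },
    /* WS_DIR      */ { "<LI>",       "<LI>", "<LI>", NULL  },
    /* WS_PRE      */ { "\n\n",       "\n",   "\t",   NULL  }
};

/* Look up the table cell for a treatment and a white-space sequence. */
const char *ws_markup(white_space_treatment wst, ws_sequence seq)
{
    assert((int)wst >= 0 && (int)wst <= (int)WS_PRE);
    assert((int)seq >= 0 && (int)seq < (int)SEQ_COUNT);
    return ws_table[wst][seq];
}
```

For example, `ws_markup(WS_GLOSSARY, SEQ_TAB)` yields the `<DD>` cell, while `ws_markup(WS_DIR, SEQ_NL_NL_TAB)` yields `NULL` because that cell is empty in the table.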
2.22 frystyk 144:
2.16 timbl 145: <h2>Nesting State</h2>
146: These elements form a tree with an item for each nesting state: that
147: is, each unique combination of nested elements which has a
148: specific style.
149: <pre>
150: typedef struct _HTNesting {
151: void * style; /* HTStyle *: Platform dependent */
152: white_space_treatment wst;
153: struct _HTNesting * parent;
154: int element_number;
155: int item_number; /* only for ordered lists */
156: int list_level; /* how deep nested */
157: HTList * children;
158: BOOL paragraph_break;
159: int magic;
160: BOOL object_gens_HTML; /* we don't generate HTML */
161: } HTNesting;
162:
163:
164: </pre>
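To illustrate "one item per unique combination of nested elements", here is a minimal find-or-create sketch. It is an assumption about how such a tree could be maintained, not the behaviour of `HTNestElement`: `DemoNesting`, `demo_nest_element`, the sibling-link fields (standing in for `HTList * children`) and the `depth` field are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-in for HTNesting: tree links only. */
typedef struct DemoNesting {
    struct DemoNesting *parent;
    struct DemoNesting *first_child;   /* stands in for HTList * children */
    struct DemoNesting *next_sibling;
    int element_number;
    int depth;                         /* 0 for the root */
} DemoNesting;

/* Return the child of `parent` for `element_number`, creating it on first
   use, so that each unique path of nested elements maps to exactly one node.
   Returns NULL only if allocation fails. */
DemoNesting *demo_nest_element(DemoNesting *parent, int element_number)
{
    DemoNesting *child;
    assert(parent != NULL);

    /* Reuse an existing node for this element under this parent. */
    for (child = parent->first_child; child != NULL; child = child->next_sibling)
        if (child->element_number == element_number)
            return child;

    /* Otherwise create it and link it in at the head of the child list. */
    child = calloc(1, sizeof *child);
    if (child == NULL) return NULL;
    child->parent = parent;
    child->element_number = element_number;
    child->depth = parent->depth + 1;
    child->next_sibling = parent->first_child;
    parent->first_child = child;
    return child;
}
```

Asking twice for the same element under the same parent returns the same node, while the same element under a different parent is a different nesting state.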
165: <H2>Nesting functions</H2>
166: These functions were new with HTML2.c. They allow the tree
167: of SGML nesting states to be manipulated, and SGML regenerated from the
168: style sequence.
169: <PRE>
170:
171: extern void HTRegenInit NOPARAMS;
172:
173: extern void HTRegenCharacter PARAMS((
174: char c,
175: HTNesting * nesting,
176: HTStructured * target));
177:
178: extern void HTNestingChange PARAMS((
179: HTStructured* s,
180: HTNesting* old,
2.18 frystyk 181: HTNesting * newnest,
2.16 timbl 182: HTChildAnchor * info,
183: CONST char * aName));
184:
185: extern HTNesting * HTMLCommonality PARAMS((
186: HTNesting * s1,
187: HTNesting * s2));
188:
189: extern HTNesting * HTNestElement PARAMS((HTNesting * p, int ele));
190: extern /* HTStyle * */ void * HTStyleForNesting PARAMS((HTNesting * n));
191:
192: extern HTNesting* HTMLAncestor PARAMS((HTNesting * old, int depth));
2.18 frystyk 193:
194: extern HTNesting* CopyBranch PARAMS((HTNesting * old, HTNesting * newnest,
195: int depth));
196:
2.16 timbl 197: extern HTNesting * HTInsertLevel PARAMS((HTNesting * old,
198: int element_number,
199: int level));
200: extern HTNesting * HTDeleteLevel PARAMS((HTNesting * old,
201: int level));
202: extern int HTMLElementNumber PARAMS((HTNesting * s));
203: extern int HTMLLevel PARAMS(( HTNesting * s));
204: extern HTNesting* HTMLAncestor PARAMS((HTNesting * old, int depth));
205:
206: #endif /* end HTML_H */
207:
208: </PRE>
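The declarations above do not spell out their semantics, so the following is only a sketch of how level, ancestor and commonality queries could work on a tree with parent pointers. All names (`DemoState`, `demo_level`, `demo_ancestor`, `demo_commonality`) are hypothetical, and two points are assumptions: "commonality" is taken to mean the deepest nesting state shared by both arguments, and the `level` argument counts from the root (root is level 0).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced nesting state: only the parent link matters here. */
typedef struct DemoState {
    struct DemoState *parent;
    int element_number;
} DemoState;

/* Number of parent links between `s` and the root (root is level 0). */
int demo_level(const DemoState *s)
{
    int level = 0;
    assert(s != NULL);
    while (s->parent != NULL) {
        s = s->parent;
        level++;
    }
    return level;
}

/* Ancestor of `s` at the given level, or NULL if no such ancestor exists. */
DemoState *demo_ancestor(DemoState *s, int level)
{
    int cur;
    assert(s != NULL);
    cur = demo_level(s);
    if (level < 0 || level > cur) return NULL;
    while (cur > level) {
        s = s->parent;
        cur--;
    }
    return s;
}

/* Deepest state that is an ancestor of (or equal to) both s1 and s2;
   NULL if the two states belong to different trees. */
DemoState *demo_commonality(DemoState *s1, DemoState *s2)
{
    int l1, l2;
    assert(s1 != NULL && s2 != NULL);
    l1 = demo_level(s1);
    l2 = demo_level(s2);
    /* Bring both states to the same level, then climb in lockstep. */
    while (l1 > l2) { s1 = s1->parent; l1--; }
    while (l2 > l1) { s2 = s2->parent; l2--; }
    while (s1 != s2) {
        s1 = s1->parent;
        s2 = s2->parent;
        if (s1 == NULL || s2 == NULL) return NULL;
    }
    return s1;
}
```

When the style changes from one nesting state to another, the common ancestor marks the point down to which elements would need to be closed before the new ones are opened; that use is an inference from the text above, not something this header states.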
209:
210: end</BODY>
2.7 timbl 211: </HTML>