Annotation of libwww/Library/src/HTML.html, revision 2.20.2.2
2.7 timbl 1: <HTML>
2: <HEAD>
3: <TITLE>HTML to rich text converter for libwww</TITLE>
4: </HEAD>
2.6 timbl 5: <BODY>
2.20 frystyk 6:
7: <H1>The HTML to styled text object converter</H1>
8:
9: <PRE>
10: /*
11: ** (c) COPYRIGHT CERN 1994.
12: ** Please first read the full copyright statement in the file COPYRIGH.
13: */
14: </PRE>
15:
16: This interprets the <A
17: HREF="http://info.cern.ch/hypertext/WWW/MarkUp/MarkUp.html">HTML</A>
18: semantics and some HTMLPlus.<P>
19:
20: This module is implemented by <A HREF="HTML.c">HTML.c</A>, and it is
21: a part of the <A
22: HREF="http://info.cern.ch/hypertext/WWW/Library/User/Guide/Guide.html">
23: Library of Common Code</A>.
24:
25: <PRE>
26: #ifndef HTML_H
2.1 timbl 27: #define HTML_H
28:
2.13 luotonen 29: #include "HTFormat.h"
2.1 timbl 30: #include "HTAnchor.h"
2.8 timbl 31: #include "HTMLPDTD.h"
2.1 timbl 32:
2.8 timbl 33: #define DTD HTMLP_dtd
34:
2.6 timbl 35: #ifdef SHORT_NAMES
36: #define HTMLPresentation HTMLPren
37: #define HTMLPresent HTMLPres
38: #endif
2.1 timbl 39:
40: extern CONST HTStructuredClass HTMLPresentation;
2.17 frystyk 41: </PRE>
42:
43: <H2>HTML_new: A structured stream to parse HTML</H2>
2.1 timbl 44:
2.17 frystyk 45: When this routine is called, the request structure may contain a <A
46: NAME="z4" HREF="HTAccess.html#z6">childAnchor</A> value. In that case
47: it is the responsibility of this module to select the anchor.<P>
2.12 timbl 48:
2.9 timbl 49: <PRE>extern HTStructured* HTML_new PARAMS((HTRequest * request,
2.8 timbl 50: void * param,
51: HTFormat input_format,
52: HTFormat output_format,
53: HTStream * output_stream));
54:
2.10 luotonen 55: </PRE>
2.1 timbl 56:
2.17 frystyk 57: <H3>Reopen</H3>
58:
59: Reopening an existing HTML object allows it to be retained (for
60: example by the styled text object) after the structured stream has
61: been closed. To be actually deleted, the HTML object must be closed
62: one more time than it has been reopened.
63:
64: <PRE>
65: extern void HTML_reopen PARAMS((HTStructured * me));
2.10 luotonen 66: </PRE>
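The reopen rule described above amounts to a reference count. The following is a minimal sketch of that idea only; DemoObject, demo_new, demo_reopen and demo_close are hypothetical names and are not part of libwww, whose real bookkeeping lives in HTML.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the HTML object; not the real HTStructured. */
typedef struct {
    int opens;   /* outstanding opens: 1 at creation, +1 per reopen */
} DemoObject;

/* Create an object with one outstanding open, as a constructor would. */
static DemoObject *demo_new(void)
{
    DemoObject *me = malloc(sizeof *me);
    if (me != NULL)
        me->opens = 1;
    return me;
}

/* Retain the object past the next close. */
static void demo_reopen(DemoObject *me)
{
    assert(me != NULL && me->opens > 0);
    me->opens++;
}

/* Release one open. Returns 1 if the object was freed, 0 if it is retained. */
static int demo_close(DemoObject *me)
{
    assert(me != NULL && me->opens > 0);
    if (--me->opens > 0)
        return 0;
    free(me);
    return 1;
}
```

With one reopen, the first demo_close leaves the object alive and the second one frees it, which matches "closed one more time than reopened".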
2.17 frystyk 67:
2.10 luotonen 68: <H2>Converters</H2>
2.8 timbl 69:
2.17 frystyk 70: These are the converters implemented in this module:
2.8 timbl 71:
2.17 frystyk 72: <PRE>
73: #ifndef pyramid
74: extern HTConverter HTMLToPlain, HTMLToC, HTMLPresent, HTMLToTeX;
75: #endif
2.8 timbl 76: </PRE>
2.17 frystyk 77:
2.8 timbl 78: <H2>Selecting internal character set
79: representations</H2>
80: <PRE>typedef enum _HTMLCharacterSet {
2.1 timbl 81: HTML_ISO_LATIN1,
82: HTML_NEXT_CHARS,
83: HTML_PC_CP950
84: } HTMLCharacterSet;
85:
86: extern void HTMLUseCharacterSet PARAMS((HTMLCharacterSet i));
87:
2.6 timbl 88: </PRE>
89: <H2>Record error message as a hypertext
90: object</H2>The error message should be marked
91: as an error so that it can be reloaded
92: later. This implementation just throws
93: up an error message and leaves the
94: document unloaded.
2.10 luotonen 95: <H3>On entry,</H3>
96: <DL>
97: <DT>sink
2.11 timbl 98: <DD> is a stream to the output device
2.10 luotonen 99: if any
100: <DT>number
2.11 timbl 101: <DD> is the HTTP error number
2.10 luotonen 102: <DT>message
2.11 timbl 103: <DD> is the human readable message.
2.10 luotonen 104: </DL>
105:
106: <H3>On exit,</H3>returns a code such as HT_LOADED if the object
107: exists, otherwise a negative value.
2.19 frystyk 108: <PRE>extern int HTLoadError PARAMS((
2.14 luotonen 109: HTRequest * req,
2.1 timbl 110: int number,
111: CONST char * message));
112:
2.6 timbl 113:
2.16 timbl 114: </PRE>
2.20.2.1 frystyk 115:
116: <H2>White Space Treatment</H2>
117:
118: There are only a few ways of treating white space in SGML when
119: mapping from a text object to HTML. Each of them has to be
120: programmed explicitly.
121:
122: <PRE>
2.16 timbl 123: /*
124:                  In text object  \n\n        \n    tab   \n\n\t
125:                  --------------  ----------  ----  ----  ------
126:                  in Address,
127:                  Blockquote,
2.20.2.1 frystyk 128: Normal,         <P>         <BR>  -             NORMAL
129:                  H1-6:           close+open  <BR>  -             HEADING
130:                  Glossary        <DT>        <DT>  <DD>  <P>     GLOSSARY
2.16 timbl 131:       List,
2.20.2.1 frystyk 132: Menu            <LI>        <LI>  -     <P>     LIST
133:                  Dir             <LI>        <LI>  <LI>          DIR
2.16 timbl 134:       Pre etc         \n\n        \n    \t            PRE
2.7 timbl 135:
2.16 timbl 136: */
137:
138: typedef enum _white_space_treatment {
139: WS_NORMAL,
140: WS_HEADING,
141: WS_GLOSSARY,
142: WS_LIST,
143: WS_DIR,
144: WS_PRE
145: } white_space_treatment;
146:
147: </pre>
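The table in the comment above can be read as a lookup from (treatment, white-space sequence) to markup. The sketch below encodes it that way. The ws_sequence enum, ws_table and ws_markup are hypothetical names, and the placement of the empty cells in the last column is an assumption drawn from the table layout, not something the header states. The enum is repeated here without its tag so that the block stands alone.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    WS_NORMAL, WS_HEADING, WS_GLOSSARY, WS_LIST, WS_DIR, WS_PRE
} white_space_treatment;

/* Hypothetical: the four white-space sequences named in the table header. */
typedef enum {
    SEQ_BLANK_LINE,      /* "\n\n"   */
    SEQ_NEWLINE,         /* "\n"     */
    SEQ_TAB,             /* tab      */
    SEQ_BLANK_LINE_TAB,  /* "\n\n\t" */
    SEQ_COUNT
} ws_sequence;

/* One row per treatment, one column per sequence. NULL stands for a "-"
   or an empty cell, i.e. no markup is listed for that combination.
   "close+open" is kept as the table's own note, not as literal markup. */
static const char *const ws_table[][SEQ_COUNT] = {
    /* WS_NORMAL   */ { "<P>",        "<BR>", NULL,   NULL  },
    /* WS_HEADING  */ { "close+open", "<BR>", NULL,   NULL  },
    /* WS_GLOSSARY */ { "<DT>",       "<DT>", "<DD>", "<P>" },
    /* WS_LIST     */ { "<LI>",       "<LI>", NULL,   "<P>" },
    /* WS_DIR      */ { "<LI>",       "<LI>", "<LI>", NULL  },
    /* WS_PRE      */ { "\n\n",       "\n",   "\t",   NULL  }
};

/* Return the table cell for a treatment and sequence, or NULL if none. */
static const char *ws_markup(white_space_treatment wst, ws_sequence seq)
{
    assert((int)wst >= 0 && (int)wst <= (int)WS_PRE);
    assert((int)seq >= 0 && (int)seq < (int)SEQ_COUNT);
    return ws_table[wst][seq];
}
```

For example, ws_markup(WS_GLOSSARY, SEQ_TAB) yields "<DD>", while ws_markup(WS_LIST, SEQ_TAB) yields NULL because the table shows "-" there.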
2.20.2.1 frystyk 148:
2.16 timbl 149: <h2>Nesting State</h2>
150: These elements form a tree with an item for each nesting state: that
151: is, each unique combination of nested elements which has a
152: specific style.
153: <pre>
154: typedef struct _HTNesting {
155: void * style; /* HTStyle *: Platform dependent */
156: white_space_treatment wst;
157: struct _HTNesting * parent;
158: int element_number;
159: int item_number; /* only for ordered lists */
160: int list_level; /* how deep nested */
161: HTList * children;
162: BOOL paragraph_break;
163: int magic;
164: BOOL object_gens_HTML; /* we don't generate HTML */
165: } HTNesting;
166:
167:
168: </pre>
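To illustrate the tree of nesting states described above, here is a reduced sketch in which each distinct path of nested elements maps to exactly one node. DemoNesting, demo_nest_element and demo_level are hypothetical names. The real HTNesting has more fields and keeps its children in an HTList, and whether the library's own functions behave exactly like this is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced node; children are a plain sibling list here. */
typedef struct DemoNesting {
    struct DemoNesting *parent;        /* NULL at the root */
    struct DemoNesting *first_child;   /* head of the child list */
    struct DemoNesting *next_sibling;  /* next child of the same parent */
    int element_number;                /* element that opened this state */
} DemoNesting;

/* Return the child of p for element ele, creating it on first use, so a
   given combination of nested elements always yields the same node.
   Returns NULL only if allocation fails. */
static DemoNesting *demo_nest_element(DemoNesting *p, int ele)
{
    DemoNesting *n;
    assert(p != NULL);
    for (n = p->first_child; n != NULL; n = n->next_sibling)
        if (n->element_number == ele)
            return n;
    n = calloc(1, sizeof *n);
    if (n == NULL)
        return NULL;
    n->parent = p;
    n->element_number = ele;
    n->next_sibling = p->first_child;
    p->first_child = n;
    return n;
}

/* Depth of a state: the number of parent links up to the root. */
static int demo_level(const DemoNesting *n)
{
    int depth = 0;
    assert(n != NULL);
    while (n->parent != NULL) {
        n = n->parent;
        depth++;
    }
    return depth;
}
```

Asking twice for the same element under the same parent returns the same node, which is what lets a style be attached once per nesting state.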
169: <H2>Nesting functions</H2>
170: These functions were new with HTML2.c. They allow the tree
171: of SGML nesting states to be manipulated, and SGML regenerated from the
172: style sequence.
173: <PRE>
174:
175: extern void HTRegenInit NOPARAMS;
176:
177: extern void HTRegenCharacter PARAMS((
178: char c,
179: HTNesting * nesting,
180: HTStructured * target));
181:
182: extern void HTNestingChange PARAMS((
183: HTStructured* s,
184: HTNesting* old,
2.18 frystyk 185: HTNesting * newnest,
2.16 timbl 186: HTChildAnchor * info,
187: CONST char * aName));
188:
189: extern HTNesting * HTMLCommonality PARAMS((
190: HTNesting * s1,
191: HTNesting * s2));
192:
193: extern HTNesting * HTNestElement PARAMS((HTNesting * p, int ele));
194: extern /* HTStyle * */ void * HTStyleForNesting PARAMS((HTNesting * n));
195:
196: extern HTNesting* HTMLAncestor PARAMS((HTNesting * old, int depth));
2.18 frystyk 197:
198: extern HTNesting* CopyBranch PARAMS((HTNesting * old, HTNesting * newnest,
199: int depth));
200:
2.16 timbl 201: extern HTNesting * HTInsertLevel PARAMS((HTNesting * old,
202: int element_number,
203: int level));
204: extern HTNesting * HTDeleteLevel PARAMS((HTNesting * old,
205: int level));
206: extern int HTMLElementNumber PARAMS((HTNesting * s));
207: extern int HTMLLevel PARAMS(( HTNesting * s));
208: extern HTNesting* HTMLAncestor PARAMS((HTNesting * old, int depth));
209:
210: #endif /* end HTML_H */
211:
212: </PRE>
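The header does not spell out what HTMLCommonality returns. One plausible reading, assumed here, is the deepest nesting state shared by two states, that is, their lowest common ancestor in the tree. The sketch below shows that computation on a reduced node; DemoState, demo_depth and demo_commonality are hypothetical names and not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced node: only the parent link matters here. */
typedef struct DemoState {
    struct DemoState *parent;   /* NULL at the root */
} DemoState;

/* Number of parent links between s and its root. */
static int demo_depth(const DemoState *s)
{
    int depth = 0;
    assert(s != NULL);
    while (s->parent != NULL) {
        s = s->parent;
        depth++;
    }
    return depth;
}

/* Deepest state that is an ancestor of (or equal to) both s1 and s2.
   Returns NULL if the two states belong to different trees. */
static DemoState *demo_commonality(DemoState *s1, DemoState *s2)
{
    int d1 = demo_depth(s1);
    int d2 = demo_depth(s2);

    /* Bring the deeper state up to the level of the shallower one. */
    while (d1 > d2) { s1 = s1->parent; d1--; }
    while (d2 > d1) { s2 = s2->parent; d2--; }

    /* Climb in lock step until the paths meet (or both run out). */
    while (s1 != s2) {
        s1 = s1->parent;
        s2 = s2->parent;
    }
    return s1;
}
```

When moving from one nesting state to another, only the elements below the common state would need to be closed and reopened, which is presumably why such a function is useful alongside HTNestingChange.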
213:
214: </BODY>
2.7 timbl 215: </HTML>