Annotation of libwww/Library/src/HTML.html, revision 2.22.2.1
2.7 timbl 1: <HTML>
2: <HEAD>
3: <TITLE>HTML to rich text converter for libwww</TITLE>
4: </HEAD>
2.6 timbl 5: <BODY>
2.20 frystyk 6:
7: <H1>The HTML to styled text object converter</H1>
8:
9: <PRE>
10: /*
11: ** (c) COPYRIGHT CERN 1994.
12: ** Please first read the full copyright statement in the file COPYRIGH.
13: */
14: </PRE>
15:
16: This interprets the <A
2.22.2.1! frystyk 17: HREF="http://www.w3.org/hypertext/WWW/MarkUp/MarkUp.html">HTML</A>
2.20 frystyk 18: semantics and some HTMLPlus.<P>
19:
20: This module is implemented by <A HREF="HTML.c">HTML.c</A>, and it is
21: a part of the <A
2.22.2.1! frystyk 22: HREF="http://www.w3.org/hypertext/WWW/Library/User/Guide/Guide.html">
2.20 frystyk 23: Library of Common Code</A>.
24:
25: <PRE>
26: #ifndef HTML_H
2.1 timbl 27: #define HTML_H
28:
2.13 luotonen 29: #include "HTFormat.h"
2.1 timbl 30: #include "HTAnchor.h"
2.8 timbl 31: #include "HTMLPDTD.h"
2.1 timbl 32:
2.8 timbl 33: #define DTD HTMLP_dtd
2.1 timbl 34:
35: extern CONST HTStructuredClass HTMLPresentation;
2.17 frystyk 36: </PRE>
37:
38: <H2>HTML_new: A structured stream to parse HTML</H2>
2.1 timbl 39:
2.17 frystyk 40: When this routine is called, the request structure may contain a <A
41: NAME="z4" HREF="HTAccess.html#z6">childAnchor</A> value. In that case
42: it is the responsibility of this module to select the anchor.<P>
2.12 timbl 43:
2.9 timbl 44: <PRE>extern HTStructured* HTML_new PARAMS((HTRequest * request,
2.8 timbl 45: void * param,
46: HTFormat input_format,
47: HTFormat output_format,
48: HTStream * output_stream));
49:
2.10 luotonen 50: </PRE>
2.1 timbl 51:
2.17 frystyk 52: <H3>Reopen</H3>
53:
54: Reopening an existing HTML object allows it to be retained (for
55: example by the styled text object) after the structured stream has
56: been closed. To be actually deleted, the HTML object must be closed
57: one more time than it has been reopened.
58:
59: <PRE>
60: extern void HTML_reopen PARAMS((HTStructured * me));
2.10 luotonen 61: </PRE>
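The reopen rule above reads like a reference count. The following is a minimal sketch of that idea only, not the code in HTML.c: the DemoObject type and the demo_* functions are hypothetical, and the sketch assumes the object starts with one open reference and is released when the count reaches zero.

```c
/* Hypothetical stand-in for the HTML object; NOT the real HTStructured. */
typedef struct {
    int open_count;  /* 1 after creation, plus 1 for every reopen */
    int released;    /* set to 1 where a real object would be freed */
} DemoObject;

/* A freshly created object counts as opened once. */
static void demo_init(DemoObject *obj)
{
    obj->open_count = 1;
    obj->released = 0;
}

/* Each reopen adds one reference that must later be closed. */
static void demo_reopen(DemoObject *obj)
{
    obj->open_count++;
}

/* Returns 1 if this close released the object, 0 if it is still retained. */
static int demo_close(DemoObject *obj)
{
    if (obj->open_count > 0 && --obj->open_count == 0) {
        obj->released = 1;
        return 1;
    }
    return 0;
}
```

With one reopen, the first demo_close leaves the object alive and the second releases it, which matches "closed one more time than it has been reopened".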
2.17 frystyk 62:
2.10 luotonen 63: <H2>Converters</H2>
2.8 timbl 64:
2.17 frystyk 65: These are the converters implemented in this module:
2.8 timbl 66:
2.17 frystyk 67: <PRE>
2.22 frystyk 68: #ifndef pyramid
2.17 frystyk 69: extern HTConverter HTMLToPlain, HTMLToC, HTMLPresent, HTMLToTeX;
70: #endif
2.8 timbl 71: </PRE>
2.17 frystyk 72:
2.8 timbl 73: <H2>Selecting internal character set
74: representations</H2>
75: <PRE>typedef enum _HTMLCharacterSet {
2.1 timbl 76: HTML_ISO_LATIN1,
77: HTML_NEXT_CHARS,
78: HTML_PC_CP950
79: } HTMLCharacterSet;
80:
81: extern void HTMLUseCharacterSet PARAMS((HTMLCharacterSet i));
82:
2.6 timbl 83: </PRE>
84: <H2>Record error message as a hypertext
85: object</H2>The error message should be marked
86: as an error so that it can be reloaded
87: later. This implementation just throws
88: up an error message and leaves the
89: document unloaded.
2.10 luotonen 90: <H3>On entry,</H3>
91: <DL>
92: <DT>sink
2.11 timbl 93: <DD> is a stream to the output device
2.10 luotonen 94: if any
95: <DT>number
2.11 timbl 96: <DD> is the HTTP error number
2.10 luotonen 97: <DT>message
2.11 timbl 98: <DD> is the human readable message.
2.10 luotonen 99: </DL>
100:
101: <H3>On exit,</H3>returns a code such as HT_LOADED if the object
102: exists, otherwise a negative value.
2.19 frystyk 103: <PRE>extern int HTLoadError PARAMS((
2.14 luotonen 104: HTRequest * req,
2.1 timbl 105: int number,
106: CONST char * message));
107:
2.6 timbl 108:
2.16 timbl 109: </PRE>
2.22 frystyk 110:
111: <H2>White Space Treatment</H2>
112:
113: There are a small number of ways to treat white space when mapping
114: from a text object to SGML (here, HTML). Each one apparently has to
115: be programmed explicitly.
116:
117: <PRE>
2.16 timbl 118: /*
119:              In text object  \n\n        \n    tab   \n\n\t
120:              --------------  ----------  ----  ----  ------
121:              in Address,
122:              Blockquote,
2.22 frystyk 123: Normal,         <P>         <BR>  -             NORMAL
124:              H1-6:           close+open  <BR>  -             HEADING
125:              Glossary        <DT>        <DT>  <DD>  <P>     GLOSSARY
2.16 timbl 126:   List,
2.22 frystyk 127: Menu            <LI>        <LI>  -     <P>     LIST
128:              Dir             <LI>        <LI>  <LI>          DIR
2.16 timbl 129:   Pre etc         \n\n        \n    \t            PRE
2.7 timbl 130:
2.16 timbl 131: */
132:
133: typedef enum _white_space_treatment {
134: WS_NORMAL,
135: WS_HEADING,
136: WS_GLOSSARY,
137: WS_LIST,
138: WS_DIR,
139: WS_PRE
140: } white_space_treatment;
141:
142: </pre>
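As an illustration of how the comment table might be turned into code, here is a small lookup sketch. It covers only the first two columns of the table (a paragraph break, \n\n, and a line break, \n) and stores the entries verbatim. The demo_break type and demo_markup_for_break function are hypothetical names, not part of libwww; the enum is repeated so the sketch compiles on its own.

```c
#include <stddef.h>

/* Repeated from the header above so that the sketch is self-contained. */
typedef enum _white_space_treatment {
    WS_NORMAL,
    WS_HEADING,
    WS_GLOSSARY,
    WS_LIST,
    WS_DIR,
    WS_PRE
} white_space_treatment;

/* Hypothetical: which white space sequence was seen in the text object. */
typedef enum {
    DEMO_PARAGRAPH_BREAK,  /* "\n\n" column of the table */
    DEMO_LINE_BREAK        /* "\n" column of the table */
} demo_break;

/*
** Returns the table entry for the given treatment and break, copied
** verbatim from the comment table ("close+open" is the table's own
** wording for headings, not a tag). Returns NULL for values out of range.
*/
static const char *demo_markup_for_break(white_space_treatment wst,
                                         demo_break brk)
{
    static const char *const table[6][2] = {
        /* WS_NORMAL   */ { "<P>",        "<BR>" },
        /* WS_HEADING  */ { "close+open", "<BR>" },
        /* WS_GLOSSARY */ { "<DT>",       "<DT>" },
        /* WS_LIST     */ { "<LI>",       "<LI>" },
        /* WS_DIR      */ { "<LI>",       "<LI>" },
        /* WS_PRE      */ { "\n\n",       "\n"   }
    };
    int row = (int) wst;
    int col = (int) brk;

    if (row < 0 || row > 5 || col < 0 || col > 1)
        return NULL;
    return table[row][col];
}
```

A table keeps the per-treatment differences in one place; the tab and \n\n\t columns could be added as further columns in the same way.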
2.22 frystyk 143:
2.16 timbl 144: <h2>Nesting State</h2>
145: These elements form a tree with an item for each nesting state: that
146: is, each unique combination of nested elements which has a
147: specific style.
148: <pre>
149: typedef struct _HTNesting {
150: void * style; /* HTStyle *: Platform dependent */
151: white_space_treatment wst;
152: struct _HTNesting * parent;
153: int element_number;
154: int item_number; /* only for ordered lists */
155: int list_level; /* how deep nested */
156: HTList * children;
157: BOOL paragraph_break;
158: int magic;
159: BOOL object_gens_HTML; /* we don't generate HTML */
160: } HTNesting;
161:
162:
163: </pre>
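To make "one item for each unique combination of nested elements" concrete, the sketch below builds such a tree with a find-or-create step. It is an assumption about how a tree like this could be maintained, not the behaviour of HTNestElement: DemoState is a cut-down, hypothetical version of HTNesting that uses sibling pointers instead of an HTList.

```c
#include <stdlib.h>

/* Hypothetical, reduced version of HTNesting for illustration only. */
typedef struct DemoState {
    struct DemoState *parent;        /* enclosing nesting state */
    struct DemoState *first_child;   /* head of the child list */
    struct DemoState *next_sibling;  /* next child of the same parent */
    int element_number;              /* element that opened this state */
} DemoState;

/*
** Returns the child of `parent` for `element_number`, creating it if it
** does not exist yet. Because an existing child is reused, every distinct
** path of nested elements maps to exactly one node. Returns NULL if
** memory runs out.
*/
static DemoState *demo_nest_element(DemoState *parent, int element_number)
{
    DemoState *child;

    for (child = parent->first_child; child; child = child->next_sibling) {
        if (child->element_number == element_number)
            return child;
    }

    child = (DemoState *) calloc(1, sizeof *child);
    if (!child)
        return NULL;
    child->parent = parent;
    child->element_number = element_number;
    child->next_sibling = parent->first_child;
    parent->first_child = child;
    return child;
}
```

Asking twice for the same element under the same parent yields the same node, while the same element under a different parent yields a different one. Freeing the tree is left out to keep the sketch short.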
164: <H2>Nesting functions</H2>
165: These functions were new with HTML2.c. They allow the tree
166: of SGML nesting states to be manipulated, and SGML regenerated from the
167: style sequence.
168: <PRE>
169:
170: extern void HTRegenInit NOPARAMS;
171:
172: extern void HTRegenCharacter PARAMS((
173: char c,
174: HTNesting * nesting,
175: HTStructured * target));
176:
177: extern void HTNestingChange PARAMS((
178: HTStructured* s,
179: HTNesting* old,
2.18 frystyk 180: HTNesting * newnest,
2.16 timbl 181: HTChildAnchor * info,
182: CONST char * aName));
183:
184: extern HTNesting * HTMLCommonality PARAMS((
185: HTNesting * s1,
186: HTNesting * s2));
187:
188: extern HTNesting * HTNestElement PARAMS((HTNesting * p, int ele));
189: extern /* HTStyle * */ void * HTStyleForNesting PARAMS((HTNesting * n));
190:
191: extern HTNesting* HTMLAncestor PARAMS((HTNesting * old, int depth));
2.18 frystyk 192:
193: extern HTNesting* CopyBranch PARAMS((HTNesting * old, HTNesting * newnest,
194: int depth));
195:
2.16 timbl 196: extern HTNesting * HTInsertLevel PARAMS((HTNesting * old,
197: int element_number,
198: int level));
199: extern HTNesting * HTDeleteLevel PARAMS((HTNesting * old,
200: int level));
201: extern int HTMLElementNumber PARAMS((HTNesting * s));
202: extern int HTMLLevel PARAMS(( HTNesting * s));
203: extern HTNesting* HTMLAncestor PARAMS((HTNesting * old, int depth));
204:
205: #endif /* end HTML_H */
206:
207: </PRE>
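When moving from one nesting state to another, it helps to know the deepest state the two have in common, since only the elements below it need closing and reopening. The sketch below finds that state by walking parent pointers. It is a guess at the kind of computation a function such as HTMLCommonality could perform, not its actual implementation; DemoNesting and the demo_* functions are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, reduced version of HTNesting: only the parent link. */
typedef struct DemoNesting {
    struct DemoNesting *parent;
    int element_number;
} DemoNesting;

/* Number of parent links between `n` and the root (0 for the root). */
static int demo_depth(const DemoNesting *n)
{
    int depth = 0;
    while (n && n->parent) {
        n = n->parent;
        depth++;
    }
    return depth;
}

/*
** Returns the deepest state that is an ancestor of (or equal to) both
** `a` and `b`, or NULL if they do not share a root.
*/
static DemoNesting *demo_common_ancestor(DemoNesting *a, DemoNesting *b)
{
    int da = demo_depth(a);
    int db = demo_depth(b);

    /* Bring the deeper state up to the level of the shallower one. */
    while (da > db) { a = a->parent; da--; }
    while (db > da) { b = b->parent; db--; }

    /* Climb in step until the two paths meet. */
    while (a && b && a != b) {
        a = a->parent;
        b = b->parent;
    }
    return (a == b) ? a : NULL;
}
```

For a list item nested in a list and a heading that are both inside the same body state, the result is the body state.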
208:
209: end</BODY>
2.7 timbl 210: </HTML>