<HTML>
<HEAD>
<TITLE>HTML to rich text Converter</TITLE>
<!-- Changed by: Henrik Frystyk Nielsen, 19-Nov-1995 -->
</HEAD>
<BODY>
<H1>The HTML to styled text object converter</H1>
<PRE>
/*
** (c) COPYRIGHT MIT 1995.
** Please first read the full copyright statement in the file COPYRIGH.
*/
</PRE>
This interprets the <A
HREF="http://www.w3.org/pub/WWW/MarkUp/MarkUp.html">HTML</A>
semantics and some HTMLPlus.<P>
This module is implemented by <A HREF="HTML.c">HTML.c</A>, and it is
a part of the <A
HREF="http://www.w3.org/pub/WWW/Library/">
W3C Reference Library</A>.
<PRE>
#ifndef HTML_H
#define HTML_H
#include "HTFormat.h"
#include "HTAnchor.h"
#include "HTMLPDTD.h"
</PRE>
<H2>Reopen an Existing HTML object</H2>
Reopening an existing HTML object allows it to be retained (for
example by the styled text object) after the structured stream has
been closed. To be actually deleted, the HTML object must be closed
once more times than it has been reopened.
<PRE>
extern void HTML_reopen (HTStructured * me);
</PRE>
<H2>Converters</H2>
These are the converters implemented in this module:
<PRE>
extern HTConverter HTMLToPlain, HTMLToC, HTMLPresent;
</PRE>
<H2>Selecting internal character set representations</H2>
<PRE>
typedef enum _HTMLCharacterSet {
HTML_ISO_LATIN1,
HTML_NEXT_CHARS,
HTML_PC_CP950
} HTMLCharacterSet;
extern void HTMLUseCharacterSet (HTMLCharacterSet i);
</PRE>
<H2>White Space Treatment</H2>
There is a small number of different ways of treating white space in
SGML, in mapping from a text object to HTML. These have to be
programmed it seems.
<PRE>
/*
In text object \n\n \n tab \n\n\t
-------------- ------------- ----- ----- -------
in Address,
Blockquote,
Normal, <P> <BR> - NORMAL
H1-6: close+open <BR> - HEADING
Glossary <DT> <DT> <DD> <P> GLOSSARY
List,
Menu <LI> <LI> - <P> LIST
Dir <LI> <LI> <LI> DIR
Pre etc \n\n \n \t PRE
*/
typedef enum _white_space_treatment {
WS_NORMAL,
WS_HEADING,
WS_GLOSSARY,
WS_LIST,
WS_DIR,
WS_PRE
} white_space_treatment;
</pre>
<h2>Nesting State</h2>
These elements form tree with an item for each nesting state: that
is, each unique combination of nested elements which has a
specific style.
<pre>
typedef struct _HTNesting {
void * style; /* HTStyle *: Platform dependent */
white_space_treatment wst;
struct _HTNesting * parent;
int element_number;
int item_number; /* only for ordered lists */
int list_level; /* how deep nested */
HTList * children;
BOOL paragraph_break;
int magic;
BOOL object_gens_HTML; /* we don't generate HTML */
} HTNesting;
</pre>
<H2>Nesting functions</H2>
These functions were new with HTML2.c. They allow the tree
of SGML nesting states to be manipulated, and SGML regenerated from the
style sequence.
<PRE>
extern void HTRegenInit (void);
extern void HTRegenCharacter (
char c,
HTNesting * nesting,
HTStructured * target);
extern void HTNestingChange (
HTStructured* s,
HTNesting* old,
HTNesting * newnest,
HTChildAnchor * info,
CONST char * aName);
extern HTNesting * HTMLCommonality (
HTNesting * s1,
HTNesting * s2);
extern HTNesting * HTNestElement (HTNesting * p, int ele);
extern /* HTStyle * */ void * HTStyleForNesting (HTNesting * n);
extern HTNesting* HTMLAncestor (HTNesting * old, int depth);
extern HTNesting* CopyBranch (HTNesting * old, HTNesting * newnest,
int depth);
extern HTNesting * HTInsertLevel (HTNesting * old,
int element_number,
int level);
extern HTNesting * HTDeleteLevel (HTNesting * old,
int level);
extern int HTMLElementNumber (HTNesting * s);
extern int HTMLLevel ( HTNesting * s);
extern HTNesting* HTMLAncestor (HTNesting * old, int depth);
#endif /* end HTML_H */
</PRE>
end</BODY>
</HTML>
Webmaster