Presentation Modules
The HTML parser offers three levels of API in order to make the
implementation as flexible as possible. Depending on which API the
application uses, the output is a stream, a structured stream, or a set of
callback functions, as indicated in the figure below:
The default HTML parser in libwww is very simple. You can look at the Amaya browser/editor for a complete structured
parser.
- SGML Stream Interface
- This interface provides the most basic API, consisting of the output
from a stream without any form of structure imposed on the data. The
internal SGML parser parses the data
sequence, identifies SGML markup tags, and passes the information on to
the HTML parser. However, if the
application has its own SGML parser and HTML parser, the internal
parsers can be disabled by removing the internal HTML converter,
HTMLPresent(),
which is used to present a graphic object on the
screen, from both the global and the local lists of converters and
presenters.
- HTML Structured Stream Interface
- If the application has its own HTML
parser that understands the structured output from the internal SGML
parser, then the second API can be used. The current HTML parser in
libwww is very basic and does not understand many of the newer features in
HTML 2 and 3.
- HText Call Back Interface
- The last API can be used when the application prefers to rely on the
internal HTML parser and only wants to provide a platform-dependent
definition of the callback functions, all of which are declared in the HText module.
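The layering described in the list above can be illustrated with a small, self-contained sketch. It is not libwww code: every name in it (StructuredSink, toy_sgml_parse, ToyText, toy_render) is hypothetical, and a trivial tag recognizer stands in for the real SGML parser. It only shows how a raw character stream can be turned into structured events, which a simple HTML layer then maps onto text callbacks supplied by the application.

```c
/* Illustrative sketch only. All names are hypothetical; this is NOT libwww's API. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Level 2: a "structured stream" receives markup events instead of raw bytes. */
typedef struct {
    void (*start_element)(void *ctx, const char *name);
    void (*end_element)(void *ctx, const char *name);
    void (*put_character)(void *ctx, char c);
    void *ctx;
} StructuredSink;

/* Level 1: a toy tag recognizer standing in for an SGML parser. It reads the
 * raw character sequence and forwards tags and character data to the sink. */
static void toy_sgml_parse(const char *data, const StructuredSink *sink)
{
    char name[32];

    assert(data != NULL && sink != NULL);
    while (*data) {
        if (*data == '<') {
            int closing = (data[1] == '/');
            const char *p = data + 1 + closing;
            size_t n = 0;

            while (*p && *p != '>' && n < sizeof name - 1)
                name[n++] = *p++;
            name[n] = '\0';
            while (*p && *p != '>')      /* skip anything that did not fit */
                p++;
            if (closing)
                sink->end_element(sink->ctx, name);
            else
                sink->start_element(sink->ctx, name);
            data = (*p == '>') ? p + 1 : p;
        } else {
            sink->put_character(sink->ctx, *data++);
        }
    }
}

/* Level 3: an HText-like object. Here the "platform" is just a text buffer. */
typedef struct {
    char out[256];
    size_t len;
} ToyText;

static void toy_text_append(ToyText *t, const char *s)
{
    size_t n = strlen(s);

    if (t->len + n < sizeof t->out) {   /* silently drop text that overflows */
        memcpy(t->out + t->len, s, n + 1);
        t->len += n;
    }
}

/* A minimal HTML layer: bold text is marked with '*', other tags are ignored. */
static void html_start(void *ctx, const char *name)
{
    if (strcmp(name, "B") == 0)
        toy_text_append((ToyText *)ctx, "*");
}

static void html_end(void *ctx, const char *name)
{
    if (strcmp(name, "B") == 0)
        toy_text_append((ToyText *)ctx, "*");
}

static void html_char(void *ctx, char c)
{
    char s[2] = { c, '\0' };

    toy_text_append((ToyText *)ctx, s);
}

/* Wire the three levels together: raw stream -> events -> text callbacks. */
void toy_render(const char *html, ToyText *text)
{
    StructuredSink sink = { html_start, html_end, html_char, text };

    text->len = 0;
    text->out[0] = '\0';
    toy_sgml_parse(html, &sink);
}
```

With this sketch, calling toy_render("<P>Hi <B>you</B></P>", &text) leaves the string "Hi *you*" in text.out. An application replacing only the lowest layer would supply its own ToyText-style callbacks; one replacing the HTML layer would supply its own StructuredSink.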
Due to the limited functionality of the internal HTML parsing module, many
applications have chosen to implement their own HTML parser.
Registering the HTML Parser
Henrik Frystyk Nielsen,
@(#) $Id: HTML.html,v 1.13 1999/08/05 12:21:12 kahan Exp $