TODO for the XML parser:
- Support for UTF-8 and UTF-16 encoding (Urgent !!!).
=> added some convertion routines provided by Martin Durst but I didn't
try to glue them in. I plan to keep everything internally as UTF-8
this is slightly more costly but more compact, and recent processors
efficiency is cache related. The key for good performances is keeping
the data set small, so will I.
- progressive parsing. The entity support is a first step toward
asbtraction of an input stream. A large part of the context is still
located on the stack, moving to a state machine and putting everyting
in the parsing context should provide an adequate solution.
=> Rather than progressive parsing, give more power to the SAX-like
interface. Currently the DOM-like representation is built but
it should be possible to define that only as a set of SAX callbacks
and remove the tree creation from the parser code.
- DOM support, instead of using a proprietary in memory
format for the document representation, the parser should
call a DOM API to actually build the resulting document.
Then the parser becomes independent of the in-memory
representation of the document. Even better using RPC's
the parser can actually build the document in another
program.
=> Work started, now the internal representation is by default
very near a direct DOM implementation. The DOM glue is implemented
as a separate module. See gnome-dom !
Done:
- C++ support : John Ehresman <jehresma@dsg.harvard.edu>
- Updated code to follow more recent specs, added compatibility flag
- Better error handling, use a dedicated, overridable error
handling function.
- Support for CDATA.
- Keep track of line numbers for better error reporting.
- Support for PI (SAX one).
- Support for Comments (bad, should be in ASAP, they are parsed
but not stored), should be configurable.
- Improve the support of entities on save (+SAX).
$Id: TODO,v 1.8 1998/11/16 02:36:31 daniel Exp $
Webmaster