This package contains Ælfred2, which includes an enhanced SAX2-compatible version of the Ælfred non-validating XML parser and a modular (and hence optional) DTD validating parser. Use them like any other SAX2 parsers.
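
As a rough illustration of what "like any other SAX2 parser" means in practice, the sketch below registers a content handler on an org.xml.sax.XMLReader and counts start-element events. It is an assumption-laden example, not code from this package: the class name SaxElementCounter and the method countElements are made up for illustration, and the reader is obtained from the JDK's default SAXParserFactory rather than from Ælfred2 itself. With Ælfred2 on the class path, only the line that obtains the XMLReader should need to change.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for this example only.
class SaxElementCounter {

    /** Parses the XML text and returns the number of start-element events seen. */
    static int countElements(String xml) {
        final int[] count = {0};
        try {
            // Assumption: the JDK's default SAX2 parser stands in for the
            // parser described in this document.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();

            // Standard SAX2 callback registration; nothing parser-specific.
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements("<config><item/><item/></config>"));
    }
}
```

Because the handler only increments a counter, no document tree is built in memory, which matches the streaming style of use that SAX is designed for.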

Some of the documentation below was modified from the original Ælfred README.txt file. All of it has been updated.

About Ælfred

Ælfred is a Java-based XML parser originally from Microstar Software Limited (no longer in existence) and more or less placed into the public domain.

Design Principles

In most Java applets and applications, XML should not be the central feature; instead, XML is the means to another end, such as loading configuration information, reading meta-data, or parsing transactions.

When an XML parser is only a single component of a much larger program, it cannot be large, slow, or resource-intensive. With Java applets, in particular, code size is a significant issue. The typical modem still does not operate at 56 Kbaud, and sometimes not even with data compression. Assuming an uncompressed 28.8 Kbaud modem, only about 3 KBytes can be downloaded in one second; compression often doubles that speed, but a V.90 modem may not provide another doubling. Similar size concerns apply when the parser is used with embedded processors.

Ælfred is designed for easy and efficient use over the Internet, based on the following principles:

  1. Ælfred must be as small as possible, so that it doesn't add too much to an applet's download time.
  2. Ælfred must use as few class files as possible, to minimize the number of HTTP connections necessary. (The use of JAR files has made this less of a concern.)
  3. Ælfred must be compatible with most or all Java implementations and platforms. (Write once, run anywhere.)
  4. Ælfred must use as little memory as possible, so that it does not take away resources from the rest of your program. (It doesn't force you to use DOM or a similar costly data structure API.)
  5. Ælfred must run as fast as possible, so that it does not slow down the rest of your program.
  6. Ælfred must produce correct output for well-formed and valid documents, but need not reject every document that is not valid or not well-formed. (In Ælfred2, correctness was a bigger concern than in the original version; and a validation option is available.)
  7. Ælfred must provide full internationalization from the first release. (Ælfred2 now automatically handles all encodings supported by the underlying JVM; previous versions handled only UTF-8, UTF-16, ASCII, and ISO-8859-1.)

As you can see from this list, Ælfred is designed for production use, but neither validation nor perfect conformance was a requirement. Good validating parsers exist, including one in this package, and you should use them as appropriate. (See conformance reviews available at http://www.xml.com)
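
The difference between the non-validating and validating modes can be sketched with the standard SAX2 feature flag http://xml.org/sax/features/validation. The example below is an illustration under assumptions, not this package's own API: ValidationToggle and countValidityErrors are hypothetical names, the JDK's default parser is used in place of Ælfred2, and how Ælfred2 exposes its optional validator is not shown here. The sample document is well-formed but violates its own DTD, so validity errors are expected only when the flag is on.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for this example only.
class ValidationToggle {

    // Standard SAX2 feature identifier for DTD validation.
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    /** Returns how many recoverable (validity) errors the parser reported. */
    static int countValidityErrors(String xml, boolean validate) {
        final int[] errors = {0};
        try {
            // Assumption: the JDK's default parser stands in for the
            // parser described in this document.
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(VALIDATION, validate);

            // Validity violations arrive through ErrorHandler.error();
            // well-formedness violations would go to fatalError() instead.
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors[0]++;
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return errors[0];
    }

    public static void main(String[] args) {
        // Well-formed, but <note> is declared EMPTY and <extra> is undeclared.
        String doc = "<!DOCTYPE note [<!ELEMENT note EMPTY>]><note><extra/></note>";
        System.out.println("non-validating: " + countValidityErrors(doc, false));
        System.out.println("validating:     " + countValidityErrors(doc, true));
    }
}
```

Applications that only need well-formed input can leave the flag off and avoid the extra cost; those that need validity checking turn it on and inspect the reported errors.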

One of the main goals of Ælfred2 was to significantly improve conformance, while not significantly affecting the other goals stated above. Since the primary use of this parser is with SAX, some classes could be removed, and so the overall size of Ælfred was actually reduced. Subsequent performance work produced a notable speedup (over twenty percent on larger files). That is, the tradeoffs between speed, size, and conformance were re-targeted towards conformance and support of newer APIs (SAX2), with a positive performance impact.

The role anticipated for this version of Ælfred is as a lightweight Open Source SAX parser that can be used in essentially every Java program where the handful of conformance violations (noted below) are acceptable. That certainly includes applets, and nowadays one must also mention embedded systems as being even more size-critical. At this writing, all parsers that are more conformant are significantly larger, even when counting the optional validation support in this version of Ælfred.

About the Name Ælfred

Ælfred the Great (AElfred in ASCII) was King of Wessex, and some say King of England, at the time of his death in 899 AD. Ælfred introduced a wide-spread literacy program in the hope that his people would learn to read English, at least, if Latin was too difficult for them. This Ælfred hopes to bring another sort of literacy to Java, using XML, at least, if full SGML is too difficult.

The initial Æ ligature ("AE") is also a reminder that XML is not limited to ASCII.

Character Encodings

The Ælfred parser currently builds in support for a handful of input encodings. Of course these include UTF-8 and UTF-16, which all XML parsers are required to support.

If you use any encoding other than UTF-8 or UTF-16 you should make sure to label your data appropriately:

<?xml version="1.0" encoding="ISO-8859-1"?>

Encodings accessed through java.io.InputStreamReader are now fully supported for both external labels (such as MIME types) and internal labels (the encoding declaration shown above).
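
The two kinds of label can be sketched as follows. This is an illustrative example under assumptions, not code from this package: EncodingLabels and textOf are hypothetical names, and the JDK's default SAX parser is used in place of Ælfred2. The internal label is the encoding declaration inside the document; the external label is passed through the standard InputSource.setEncoding call, which is where an application would typically put a charset taken from a MIME type.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for this example only.
class EncodingLabels {

    /**
     * Parses raw bytes and returns all character data found.
     * externalEncoding may be null, in which case the parser relies on
     * the document's own encoding declaration (or its defaults).
     */
    static String textOf(byte[] document, String externalEncoding) {
        final StringBuilder text = new StringBuilder();
        try {
            // Assumption: the JDK's default parser stands in for the
            // parser described in this document.
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    // Character data may arrive in several chunks.
                    text.append(ch, start, length);
                }
            });
            InputSource source = new InputSource(new ByteArrayInputStream(document));
            if (externalEncoding != null) {
                source.setEncoding(externalEncoding); // external label
            }
            reader.parse(source);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String name = "\u00C6lfred"; // starts with the AE ligature, U+00C6

        // Internal label: the encoding declaration names ISO-8859-1.
        byte[] internal = ("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<name>" + name + "</name>").getBytes(StandardCharsets.ISO_8859_1);

        // External label: no declaration, so the caller supplies the encoding.
        byte[] external = ("<name>" + name + "</name>")
                .getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(name.equals(textOf(internal, null)));
        System.out.println(name.equals(textOf(external, "ISO-8859-1")));
    }
}
```

Without either label, the second document would be read as UTF-8 by default, and the single byte used for Æ in ISO-8859-1 would not form a valid UTF-8 sequence, which is why labeling matters for anything other than UTF-8 or UTF-16.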