This package provides the core SAX1 and SAX2 APIs. The SAX1 APIs are deprecated to encourage integrating namespace awareness into the design of new applications and into the maintenance of existing infrastructure. For more information on SAX, see the SAX page.
One of the essential characteristics of SAX2 is that it added
feature flags that can be used to examine and perhaps modify
parser modes, in particular modes such as validation.
Since features are identified by (absolute) URIs, anyone
can define such features.
Currently defined standard feature URIs have the prefix
http://xml.org/sax/features/ before an identifier such as validation.
One could turn that feature on or off using setFeature.
Those standard identifiers are:
Feature ID | Default | Description |
---|---|---|
external-general-entities | unspecified | Reports whether this parser processes external general entities; always true if validating |
external-parameter-entities | unspecified | Reports whether this parser processes external parameter entities; always true if validating |
namespaces | true | true indicates namespace URIs and unprefixed local names for element and attribute names will be available |
namespace-prefixes | false | true indicates XML 1.0 names (with prefixes) and attributes (including xmlns* attributes) will be available |
string-interning | unspecified | true if all XML names provided will have been interned using java.lang.String.intern, supporting fast testing of equality/inequality with string constants |
validation | unspecified (false?) | controls whether the parser is validating its input |
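
As an illustration of the feature flags above, here is a minimal sketch that reads and sets features through XMLReader.getFeature and XMLReader.setFeature. It assumes the reader is obtained through JAXP (SAXParserFactory), which is only one possible way to get a parser; the class name FeatureFlagSketch and the helpers newReader, tryGetFeature, and trySetFeature are hypothetical names invented for this example. A parser may not recognize or support a given feature, so both helpers treat that as a normal outcome.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper class, not part of SAX.
final class FeatureFlagSketch {
    // Prefix shared by the standard feature URIs.
    static final String FEATURES = "http://xml.org/sax/features/";

    // Assumption: the parser comes from JAXP, configured as namespace-aware.
    static XMLReader newReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the feature state, or null if the parser cannot report it.
    static Boolean tryGetFeature(XMLReader reader, String id) {
        try {
            return reader.getFeature(FEATURES + id);
        } catch (SAXException e) {
            // SAXNotRecognizedException or SAXNotSupportedException
            return null;
        }
    }

    // Returns true if the parser accepted the requested state.
    static boolean trySetFeature(XMLReader reader, String id, boolean value) {
        try {
            reader.setFeature(FEATURES + id, value);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        String[] ids = {"namespaces", "namespace-prefixes", "validation", "string-interning"};
        for (String id : ids) {
            System.out.println(id + " = " + tryGetFeature(reader, id));
        }
        System.out.println("validation turned off: " + trySetFeature(reader, "validation", false));
    }
}
```

The values printed for features whose default is unspecified depend on the parser in use, so the sketch reports them rather than assuming them.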
For parser interface characteristics that aren't boolean
flags, a separate namespace for objects is defined. The
objects in this namespace are again identified by URI, and
the standard property URIs have the prefix
http://xml.org/sax/properties/ before an identifier such as
lexical-handler or dom-node.
One could manage those properties using setProperty().
Those identifiers are:
Property ID | Description |
---|---|
declaration-handler | Used to see most DTD declarations except those treated as lexical ("root element name is ...") or which are mandatory for all SAX parsers (DTDHandler). The Object must implement DeclHandler. |
dom-node | For "DOM Walker" style parsers, which ignore their parser.parse() parameters, this is used to specify the DOM (sub)tree being walked by the parser. |
lexical-handler | Used to see some syntax events that are essential in some applications: comments, CDATA delimiters, selected general entity inclusions, and the start and end of the DTD. The Object must implement LexicalHandler. |
xml-string | Readable only during a parser callback, this exposes a TBS chunk of characters responsible for the current event. |
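
To illustrate the lexical-handler property, the following sketch registers a handler with setProperty and collects the comments found in a document. It assumes a JAXP-provided parser that supports this property; the class LexicalHandlerSketch and the method collectComments are hypothetical names used only here. DefaultHandler2 from org.xml.sax.ext is used because it implements both ContentHandler and LexicalHandler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper class, not part of SAX.
final class LexicalHandlerSketch {
    static final String LEXICAL_HANDLER =
            "http://xml.org/sax/properties/lexical-handler";

    // Parses the given XML text and returns the text of each comment seen.
    static List<String> collectComments(String xml) {
        List<String> comments = new ArrayList<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                comments.add(new String(ch, start, length));
            }
        };
        try {
            // Assumption: the parser comes from JAXP and supports the property.
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // The property value must implement LexicalHandler.
            reader.setProperty(LEXICAL_HANDLER, handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return comments;
    }

    public static void main(String[] args) {
        System.out.println(collectComments("<a><!-- note --><b/></a>"));
    }
}
```

A parser that does not support the property throws from setProperty; the sketch simply rethrows, whereas a real application would probably fall back to running without comment events.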
Namespace support is the root of the incompatibilities between SAX1 and SAX2. Namespaces are not part of XML itself, but are felt to be sufficiently strategic that this layer has been integrated into the SAX2 API. Existing applications need to change two method signatures, and add two new handler methods, to handle SAX2 namespace features.
Names in the ContentHandler startElement and endElement callbacks must now, by default, conform to the XML + Namespaces specification. You will get two callbacks (startPrefixMapping and endPrefixMapping) describing the beginning and end of a scope for a given name prefix, and may also see the attributes which tell you that same thing. (As a rule, ignore those callbacks.)
If you explicitly enable the namespace-prefixes feature on a parser, you can see the xmlns* attributes, and the element and attribute names as found in the document (with prefixes). If your SAX1 application is currently namespace-aware, or needs to work with data that may use prefixed (or perhaps hierarchical) names, you must enable that feature in order to continue working correctly with a SAX2 parser, in addition to making the various method signature changes. (You can phase in use of the SAX2 namespace features incrementally, as you verify that it's safe for use with each existing application's data set or messaging partner.)
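
The namespace behavior described above can be sketched as follows: a handler records the prefix-mapping callbacks and the namespace URI and local name of each element, with the namespace-prefixes feature either off or on. The class NamespaceSketch, the method describe, and the urn:example namespace are hypothetical and exist only for this example; obtaining a namespace-aware parser through JAXP is likewise an assumption.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of SAX.
final class NamespaceSketch {
    static final String NAMESPACE_PREFIXES =
            "http://xml.org/sax/features/namespace-prefixes";

    // Returns one line per prefix mapping and per element start.
    static List<String> describe(String xml, boolean namespacePrefixes) {
        List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startPrefixMapping(String prefix, String uri) {
                seen.add("map " + prefix + "=" + uri);
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // atts includes xmlns* attributes only when
                // namespace-prefixes is enabled.
                seen.add("element {" + uri + "}" + localName
                        + " attrs=" + atts.getLength());
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: JAXP parser
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setFeature(NAMESPACE_PREFIXES, namespacePrefixes);
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<p:doc xmlns:p=\"urn:example\"><p:item/></p:doc>";
        System.out.println(describe(xml, false));
        System.out.println(describe(xml, true));
    }
}
```

With namespace-prefixes off, the xmlns:p declaration is reported only through startPrefixMapping; with it on, the declaration should also appear as an attribute of the root element, so the attribute count for that element differs between the two runs.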