java.lang.Object
  org.w3c.mwi.mobileok.basic.DecodedContent
    org.w3c.mwi.mobileok.basic.TextContent
      org.w3c.mwi.mobileok.basic.XhtmlContent
public class XhtmlContent
Represents an XHTML resource.
The class extends TextContent to add XHTML markup validation.
A tidied version of the resource, created by a Parser, is used when the resource is not valid XHTML. Tidying the resource may not always be possible.
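The strict-then-tidied parsing order described above can be sketched as follows. This is a minimal illustration using the standard JAXP API, not the class's actual implementation: `ParseFallbackSketch`, `Tidier`, `parseStrict` and `parseWithFallback` are hypothetical names, and the tidying step is left abstract because the Parser used by the class is not documented here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ParseFallbackSketch {

    /** Hypothetical stand-in for a tidying parser; returns null when tidying fails. */
    interface Tidier {
        Document tidy(String body);
    }

    /** Strict parse: returns null when the body is not well-formed XML. */
    static Document parseStrict(String body)
            throws ParserConfigurationException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // DefaultHandler rethrows fatal errors and keeps them off stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            return builder.parse(new InputSource(new StringReader(body)));
        } catch (SAXException e) {
            return null;
        }
    }

    /** Tries the strict parse first, then falls back to the tidied version. */
    static Document parseWithFallback(String body, Tidier tidier)
            throws ParserConfigurationException, IOException {
        Document dom = parseStrict(body);
        return (dom != null) ? dom : tidier.tidy(body);
    }

    public static void main(String[] args) throws Exception {
        Tidier neverTidies = body -> null; // toy tidier that always gives up
        System.out.println(parseWithFallback("<html><body/></html>", neverTidies) != null); // true
        System.out.println(parseWithFallback("<html><body></html>", neverTidies) != null);  // false
    }
}
```

A caller would treat a `null` result as "content could not be parsed", which matches the contract documented for getDOM().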
Nested Class Summary | |
---|---|
private static class |
XhtmlContent.CommonCatalogResolver
|
private static class |
XhtmlContent.MinimizeHandler
|
private class |
XhtmlContent.XHTMLBasicCatalogResolver
|
private static class |
XhtmlContent.XHTMLCatalogResolver
|
private class |
XhtmlContent.XHTMLMPCatalogResolver
|
Field Summary | |
---|---|
private java.net.URI |
baseUri
Base URI of the HTML document. |
private static java.util.regex.Pattern |
commentPattern
Regular expression to extract comments. |
private static java.util.regex.Pattern |
doctypePattern
Regular expression to extract a generic DOCTYPE declaration. |
private org.w3c.dom.Document |
dom
Resource content represented as a DOM tree. |
private int |
extraneousChars
Total number of ignorable whitespace and comment characters. |
private boolean |
hasXmlDeclaration
True when the document contains an XML declaration. |
private boolean |
hasXmlNamespace
True when the document defines the XHTML namespace. |
private static java.util.regex.Pattern |
htmlDoctypePattern
Regular expression to extract an HTML DOCTYPE declaration. |
private static java.util.regex.Pattern |
htmlRootPattern
Regular expression to extract the html root element. |
private java.util.List<ValidationLineAndColumnMessage> |
markupErrorMessageList
List of markup validation errors when the document is validated against its declared DTD. |
private XHTMLValidationStatus |
markupValidationStatus
Final validation status when the document is validated against its declared DTD. |
private java.util.List<ValidationLineAndColumnMessage> |
mobileErrorMessageList
List of markup validation errors when the document is validated against the XHTML Basic 1.1 DTD (or XHTML MP 1.2 DTD). |
private XHTMLValidationStatus |
mobileValidationStatus
Final validation status when the document is validated against the XHTML Basic 1.1 DTD (or XHTML MP 1.2 DTD). |
private java.lang.String |
publicDoctype
Public ID of the DOCTYPE declaration. |
private int |
rootElementLine
Index of the line that contains the html element declaration,
0 when the element is not defined. |
private java.lang.String |
systemDoctype
System ID of the DOCTYPE declaration. |
private int |
totalChars
Total number of characters. |
private static java.util.regex.Pattern |
xmlDeclaration
Regular expression to extract the XML declaration line. |
private static java.util.regex.Pattern |
xmlNamespace
Regular expression to check that the XHTML namespace is defined on the html element. |
Constructor Summary | |
---|---|
XhtmlContent(java.net.URI uri,
java.util.List<RetrievalElement> retrieved)
Creates a class instance bound to a URI. |
Method Summary | |
---|---|
static int |
countExtraneousChars(char[] ch,
int start,
int length)
Returns the number of redundant whitespace characters in the given character array. |
private static int |
findRootElementLine(java.lang.String body)
Computes the index of the line that contains the html
declaration in the resource's content. |
java.net.URI |
getBaseUri()
Returns the base URI of the document. |
org.w3c.dom.Document |
getDOM()
Resource content represented as a DOM tree. |
int |
getExtraneousChars()
Returns the total number of ignorable whitespace and comment characters. |
XHTMLValidationStatus |
getMarkupValid()
Returns the final result of the markup validation against the document's declared DTD. |
java.util.List<ValidationLineAndColumnMessage> |
getMobileErrorMessageList()
Returns the list of markup validation errors against the XHTML Basic 1.1 or XHTML MP 1.2 DTD. |
XHTMLValidationStatus |
getMobileValid()
Returns the markup validation status against the XHTML Basic 1.1 or XHTML MP 1.2 DTD. |
java.lang.String |
getPublicDoctype()
Returns the public ID of the resource's DOCTYPE. |
int |
getRootElementLine()
Returns the index of the line in the resource content that contains the html declaration. |
org.w3c.dom.Document |
getSaxonDocument()
Returns a Saxon-compliant representation of the given DOM document to take advantage of Saxon's features, and in particular the possibility to keep line numbers even when the document is represented as a DOM tree. |
java.lang.String |
getSystemDoctype()
Returns the system ID of the resource's DOCTYPE. |
int |
getTotalChars()
Returns the total number of characters in the resource's content. |
java.util.List<ValidationLineAndColumnMessage> |
getXHTMLErrorMessageList()
Returns the list of markup validation errors against the document's declared DTD. |
boolean |
hasXmlDeclaration()
Returns true when the resource contains an XML declaration. |
boolean |
hasXmlNamespace()
Returns true when the resource references the XHTML namespace in the html declaration. |
private static org.w3c.dom.Document |
parseDOM(java.lang.String body)
Parses the given string as XML and returns the corresponding DOM Document. |
private static org.w3c.dom.Document |
parseTidiedDOM(java.lang.String body)
Parses the given string as possibly not entirely valid XML and returns the corresponding DOM Document. |
private boolean |
setHasXmlDeclaration()
Parses the beginning of the document in search of an XML declaration. |
private boolean |
setHasXmlNamespace()
Parses the beginning of the document in search of an XHTML namespace declaration. |
org.w3c.dom.Node |
toMokiNode(org.w3c.dom.Document document,
org.w3c.dom.Node parent)
Serializes the content to its moki representation as a DOM
node. |
private XHTMLValidationStatus |
validateMarkup()
Validates the document against its declared DOCTYPE, when known. |
private XHTMLValidationStatus |
validateMobile()
Validates the document against mobileOK recommended mobile DTDs. |
Methods inherited from class org.w3c.mwi.mobileok.basic.TextContent |
---|
getBody, getUTF8ErrorMessageList, isValid |
Methods inherited from class org.w3c.mwi.mobileok.basic.DecodedContent |
---|
addByteErrorMessages, addLineAndColumnMessages |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
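private static final java.util.regex.Pattern xmlDeclaration
Regular expression to extract the XML declaration line.
private static final java.util.regex.Pattern xmlNamespace
Regular expression to check that the XHTML namespace is defined on the html element.
private static final java.util.regex.Pattern commentPattern
Regular expression to extract comments.
private static final java.util.regex.Pattern htmlDoctypePattern
Regular expression to extract an HTML DOCTYPE declaration.
private static final java.util.regex.Pattern htmlRootPattern
Regular expression to extract the html root element.
private static final java.util.regex.Pattern doctypePattern
Regular expression to extract a generic DOCTYPE declaration.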
private org.w3c.dom.Document dom
Resource content represented as a DOM tree.
The DOM tree may represent a tidied version of the content. If null, it means the content could not be tidied.
private int rootElementLine
Index of the line that contains the html element declaration, 0 when the element is not defined.
private java.net.URI baseUri
Base URI of the HTML document.
Equals the URI of the resource unless a base element redefines it in the head section.
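private java.util.List<ValidationLineAndColumnMessage> mobileErrorMessageList
List of markup validation errors when the document is validated against the XHTML Basic 1.1 DTD (or XHTML MP 1.2 DTD).
private java.util.List<ValidationLineAndColumnMessage> markupErrorMessageList
List of markup validation errors when the document is validated against its declared DTD.
private XHTMLValidationStatus markupValidationStatus
Final validation status when the document is validated against its declared DTD.
private XHTMLValidationStatus mobileValidationStatus
Final validation status when the document is validated against the XHTML Basic 1.1 DTD (or XHTML MP 1.2 DTD).
private boolean hasXmlDeclaration
True when the document contains an XML declaration.
private boolean hasXmlNamespace
True when the document defines the XHTML namespace.
private java.lang.String publicDoctype
Public ID of the DOCTYPE declaration.
private java.lang.String systemDoctype
System ID of the DOCTYPE declaration.
private int extraneousChars
Total number of ignorable whitespace and comment characters.
private int totalChars
Total number of characters.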
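To illustrate how a DOCTYPE pattern can feed the publicDoctype and systemDoctype fields, here is a minimal sketch. The regular expression is a hypothetical simplification written for this example (double-quoted literals only, no internal subset); it is not the pattern the class actually uses, and `DoctypeSketch` and `extractDoctypeIds` are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DoctypeSketch {

    // Hypothetical pattern: root name, then an optional PUBLIC or SYSTEM identifier part.
    static final Pattern DOCTYPE = Pattern.compile(
            "<!DOCTYPE\\s+(\\w+)"
            + "(?:\\s+PUBLIC\\s+\"([^\"]*)\"\\s+\"([^\"]*)\""
            + "|\\s+SYSTEM\\s+\"([^\"]*)\")?\\s*>",
            Pattern.CASE_INSENSITIVE);

    /** Returns {publicId, systemId}; an entry is null when that ID is not declared. */
    static String[] extractDoctypeIds(String body) {
        Matcher m = DOCTYPE.matcher(body);
        if (!m.find()) {
            return new String[] { null, null };
        }
        String publicId = m.group(2);
        // The system ID sits in group 3 for PUBLIC declarations, group 4 for SYSTEM ones.
        String systemId = (m.group(3) != null) ? m.group(3) : m.group(4);
        return new String[] { publicId, systemId };
    }

    public static void main(String[] args) {
        String body = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML Basic 1.1//EN\" "
                + "\"http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd\">\n<html/>";
        String[] ids = extractDoctypeIds(body);
        System.out.println(ids[0]); // -//W3C//DTD XHTML Basic 1.1//EN
        System.out.println(ids[1]); // http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd
    }
}
```

Returning null for a missing ID mirrors the "null if not defined" contract of getPublicDoctype() and getSystemDoctype().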
Constructor Detail |
---|
public XhtmlContent(java.net.URI uri, java.util.List<RetrievalElement> retrieved) throws TestException
Creates a class instance bound to a URI.
The content is validated while it is instantiated.
Parameters:
uri - absolute URI of the resource.
retrieved - the retrieved representation of the resource.
Throws:
TestException - an unexpected error occurred.
Method Detail |
---|
private static org.w3c.dom.Document parseDOM(java.lang.String body) throws TestException
Parses the given string as XML and returns the corresponding DOM Document.
Parameters:
body - XML string to parse.
Returns:
null when the string is not valid XML.
Throws:
TestException - an unexpected error occurred.
private static org.w3c.dom.Document parseTidiedDOM(java.lang.String body) throws TestException
Parses the given string as possibly not entirely valid XML and returns the corresponding DOM Document.
Attempts are made to tidy up the string to make it valid. Tidying is not guaranteed to succeed, so parsing may not be possible.
Parameters:
body - invalid XML string to parse.
Returns:
null when the string could not be tidied.
Throws:
TestException - an unexpected error occurred.
public org.w3c.dom.Document getDOM()
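Resource content represented as a DOM tree.
If null, it means the content could not be parsed (e.g. because it is malformed).
public int getRootElementLine()
Returns the index of the line in the resource content that contains the html declaration.
Returns:
index of the html line, 0 if not found.
public java.util.List<ValidationLineAndColumnMessage> getXHTMLErrorMessageList()
Returns the list of markup validation errors against the document's declared DTD.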
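public java.util.List<ValidationLineAndColumnMessage> getMobileErrorMessageList()
Returns the list of markup validation errors against the XHTML Basic 1.1 or XHTML MP 1.2 DTD.
public XHTMLValidationStatus getMarkupValid()
Returns the final result of the markup validation against the document's declared DTD.
public XHTMLValidationStatus getMobileValid()
Returns the markup validation status against the XHTML Basic 1.1 or XHTML MP 1.2 DTD.
public java.lang.String getPublicDoctype()
Returns the public ID of the resource's DOCTYPE.
Returns:
the public DOCTYPE ID, null if not defined.
public java.lang.String getSystemDoctype()
Returns the system ID of the resource's DOCTYPE.
Returns:
the system DOCTYPE ID, null if not defined.
public boolean hasXmlDeclaration()
Returns true when the resource contains an XML declaration.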
public org.w3c.dom.Node toMokiNode(org.w3c.dom.Document document, org.w3c.dom.Node parent)
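Serializes the content to its moki representation as a DOM node.
Overrides:
toMokiNode in class TextContent
Parameters:
document - DOM document the created node should belong to.
parent - DOM node to which the representation should be appended.
public int getTotalChars()
Returns the total number of characters in the resource's content.
public int getExtraneousChars()
Returns the total number of ignorable whitespace and comment characters.
public boolean hasXmlNamespace()
Returns true when the resource references the XHTML namespace in the html declaration.
Returns:
true when the html element contains the namespace, false otherwise.
public java.net.URI getBaseUri()
Returns the base URI of the document. It equals the URI of the resource unless a base element redefines it in the head section.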
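The base URI rule (resource URI unless a base element in the head section redefines it) could be implemented along the following lines. This is a sketch under assumptions, not the class's code: `BaseUriSketch`, `resolveBaseUri` and `parse` are hypothetical names, and falling back to the resource URI on an empty or malformed href is a choice made for this example.

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class BaseUriSketch {

    /** Returns the resource URI, or the URI declared by the first usable head/base element. */
    static URI resolveBaseUri(URI resourceUri, Document dom) {
        NodeList bases = dom.getElementsByTagName("base");
        for (int i = 0; i < bases.getLength(); i++) {
            Element base = (Element) bases.item(i);
            Node parent = base.getParentNode();
            boolean inHead = parent != null && "head".equals(parent.getNodeName());
            String href = base.getAttribute("href"); // empty string when absent
            if (inHead && !href.isEmpty()) {
                try {
                    // A relative href is resolved against the resource URI.
                    return resourceUri.resolve(href);
                } catch (IllegalArgumentException e) {
                    return resourceUri; // malformed href: keep the resource URI
                }
            }
        }
        return resourceUri;
    }

    /** Small helper for the example: parses a well-formed XML string. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        URI resource = URI.create("http://example.com/a/page.html");
        Document withBase = parse(
                "<html><head><base href=\"http://example.org/dir/\"/></head><body/></html>");
        Document withoutBase = parse("<html><head/><body/></html>");
        System.out.println(resolveBaseUri(resource, withBase));    // http://example.org/dir/
        System.out.println(resolveBaseUri(resource, withoutBase)); // http://example.com/a/page.html
    }
}
```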
private XHTMLValidationStatus validateMarkup() throws TestException
Validates the document against its declared DOCTYPE, when known.
The list of well-known DTDs is managed in a catalog resolver (see ExtendedCatalogResolver).
XHTMLValidationStatus.NOT_VALIDATED is returned when the DTD is unknown or not defined.
Validation errors are kept in this instance for later retrieval through a call to getXHTMLErrorMessageList().
Returns:
XHTMLValidationStatus.NOT_VALIDATED when the DTD is unknown or not defined.
Throws:
TestException - an unexpected error occurred.
private XHTMLValidationStatus validateMobile() throws TestException
Validates the document against mobileOK recommended mobile DTDs.
The method attempts to validate the document against the XHTML Basic 1.1 DTD. If the validation fails, it then tries to validate the document against the XHTML MP 1.2 DTD.
Validation errors are kept in this instance for later retrieval through a call to getMobileErrorMessageList().
The errors in that list correspond to the validation against the XHTML Basic 1.1 DTD.
NB: apart from the start attribute of the ol element, XHTML Basic 1.1 is a superset of XHTML MP 1.2, so the validation against XHTML MP 1.2 is performed solely to account for that exception.
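The fallback order described above can be sketched as follows. Everything here is hypothetical scaffolding written for illustration: `DtdValidator` and `Result` are not part of the library, the real method returns an XHTMLValidationStatus rather than a boolean, and keeping the XHTML Basic 1.1 errors even when the XHTML MP 1.2 validation succeeds is an assumption drawn from the description.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MobileValidationSketch {

    /** Hypothetical validator: returns the validation errors, empty when valid. */
    interface DtdValidator {
        List<String> validate(String body);
    }

    /** Hypothetical result holder pairing a verdict with the reported errors. */
    static final class Result {
        final boolean valid;
        final List<String> errors;

        Result(boolean valid, List<String> errors) {
            this.valid = valid;
            this.errors = errors;
        }
    }

    /** Validates against XHTML Basic 1.1 first, then XHTML MP 1.2 if that fails. */
    static Result validateMobile(String body, DtdValidator basic11, DtdValidator mp12) {
        List<String> basicErrors = basic11.validate(body);
        if (basicErrors.isEmpty()) {
            return new Result(true, basicErrors);
        }
        // Second chance: the document may still be valid XHTML MP 1.2.
        boolean validAsMp = mp12.validate(body).isEmpty();
        // The reported errors are always those of the XHTML Basic 1.1 run.
        return new Result(validAsMp, basicErrors);
    }

    public static void main(String[] args) {
        // Toy validators: the first always reports one error, the second never does.
        DtdValidator basic = body -> Arrays.asList("toy error from the first DTD");
        DtdValidator mp = body -> Collections.<String>emptyList();
        Result result = validateMobile("<html/>", basic, mp);
        System.out.println(result.valid);         // true
        System.out.println(result.errors.size()); // 1
    }
}
```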
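Throws:
TestException - an unexpected error occurred.
private static int findRootElementLine(java.lang.String body)
Computes the index of the line that contains the html declaration in the resource's content.
Parameters:
body - content of the resource to parse.
Returns:
index of the line that contains the html declaration, 0 if not found.
public org.w3c.dom.Document getSaxonDocument() throws TestException
Returns a Saxon-compliant representation of the given DOM document to take advantage of Saxon's features, and in particular the possibility to keep line numbers even when the document is represented as a DOM tree.
Throws:
TestException - an unexpected error occurred while building the document.
private boolean setHasXmlDeclaration() throws java.io.IOException
Parses the beginning of the document in search of an XML declaration.
Returns:
true when the document contains an XML declaration, false otherwise.
Throws:
java.io.IOException - the document body could not be read.
private boolean setHasXmlNamespace() throws java.io.IOException
Parses the beginning of the document in search of an XHTML namespace declaration.
Returns:
true when the document contains an XHTML namespace declaration, false otherwise.
Throws:
java.io.IOException - the document body could not be read.
public static int countExtraneousChars(char[] ch, int start, int length)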
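Returns the number of redundant whitespace characters in the given character array.
Parameters:
ch - the character array to parse.
start - position where the search should begin (must be greater than or equal to 0).
length - number of characters to search, so that the search ends at start+length (start+length must be less than or equal to the length of the character array).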
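The documentation does not define which whitespace counts as redundant. The sketch below assumes one plausible reading: within the searched range, every whitespace character that directly follows another whitespace character is redundant. `ExtraneousCharsSketch` and `countRedundantWhitespace` are hypothetical names, and the actual countExtraneousChars may apply a different rule.

```java
public class ExtraneousCharsSketch {

    /**
     * Counts whitespace characters that directly follow another whitespace
     * character in ch[start .. start+length-1].
     */
    static int countRedundantWhitespace(char[] ch, int start, int length) {
        if (start < 0 || length < 0 || start > ch.length - length) {
            throw new IllegalArgumentException("range exceeds the array bounds");
        }
        int count = 0;
        boolean previousWasWhitespace = false;
        for (int i = start; i < start + length; i++) {
            boolean isWhitespace = Character.isWhitespace(ch[i]);
            if (isWhitespace && previousWasWhitespace) {
                count++; // second or later character of a whitespace run
            }
            previousWasWhitespace = isWhitespace;
        }
        return count;
    }

    public static void main(String[] args) {
        char[] text = "a  b\n\n c".toCharArray();
        // Runs: "  " contributes 1, "\n\n " contributes 2.
        System.out.println(countRedundantWhitespace(text, 0, text.length)); // 3
    }
}
```

The (ch, start, length) signature follows the same convention as SAX character callbacks, so such a counter can be called directly from a content handler.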