public class

DOMNormalizer

extends Object
implements XMLDocumentHandler
java.lang.Object
   ↳ org.apache.xerces.dom.DOMNormalizer

Class Overview

This class implements the normalizeDocument method. It acts as if the document were going through a save and load cycle, putting the document in a "normal" form. The actual result depends on which features are set, since they govern the operations that actually take place. See setNormalizationFeature for details. Notably, this method normalizes Text nodes, makes the document "namespace well-formed" according to the namespace normalization algorithm by adding missing namespace declaration attributes and adding or changing namespace prefixes, updates the replacement tree of EntityReference nodes, normalizes attribute values, etc. Mutation events, when supported, are generated to reflect the changes occurring on the document. See Namespace normalization for details on how namespace declaration attributes and prefixes are normalized.

NOTE: There is initial support for DOM revalidation with XML Schema as a grammar. The tree might not be validated correctly if EntityReference nodes or CDATA sections are present in the tree. The PSVI information is not exposed, and normalized data (including element default content) is not available.

@xerces.experimental
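As a rough illustration of the save-and-load effect described above, the sketch below drives normalization through the standard DOM Level 3 entry point, Document.normalizeDocument(), rather than calling DOMNormalizer directly. It assumes the JAXP implementation bundled with the JDK; the class name NormalizeDocumentSketch, the namespace URI, and the element names are made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NormalizeDocumentSketch {

    /** Builds a small tree by hand, then normalizes it through the standard DOM API. */
    static Document buildAndNormalize() {
        Document doc = newDocument();

        // Namespaced element created without an explicit xmlns:ex declaration.
        Element root = doc.createElementNS("http://example.org/ns", "ex:root");
        doc.appendChild(root);

        // Two adjacent Text nodes, which a parser would never produce on load.
        root.appendChild(doc.createTextNode("Hello, "));
        root.appendChild(doc.createTextNode("world"));

        doc.normalizeDocument();
        return doc;
    }

    /** Hypothetical helper: creates an empty, namespace-aware document. */
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = buildAndNormalize().getDocumentElement();
        System.out.println("children: " + root.getChildNodes().getLength());
        System.out.println("text: " + root.getTextContent());
        System.out.println("xmlns:ex = " + root.getAttribute("xmlns:ex"));
    }
}
```

With the default configuration, the adjacent Text nodes should be merged and the missing namespace declaration attribute added to the root element.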

Summary

Nested Classes
class DOMNormalizer.XMLAttributesProxy  
Constants
boolean DEBUG Debug namespace fix up algorithm
boolean DEBUG_EVENTS Debug document handler events
boolean DEBUG_ND Debug normalize document
String PREFIX Prefix added by the namespace fixup algorithm; generated prefixes follow the pattern "NS" + index
Fields
public static final XMLString EMPTY_STRING Empty string to pass to the validator.
public static final RuntimeException abort If the user stops the process, this exception will be thrown.
protected final DOMNormalizer.XMLAttributesProxy fAttrProxy
protected final Vector fAttributeList list of attributes
protected DOMConfigurationImpl fConfiguration
protected Node fCurrentNode for setting the PSVI
protected CoreDocumentImpl fDocument
protected DOMErrorHandler fErrorHandler error handler.
protected final NamespaceContext fLocalNSBinder Stores all namespace bindings on the current element
protected final DOMLocatorImpl fLocator DOM Locator - for namespace fixup algorithm
protected final NamespaceContext fNamespaceContext The namespace context of this document: stores namespaces in scope
protected boolean fNamespaceValidation
protected boolean fPSVI
protected final QName fQName
protected SymbolTable fSymbolTable symbol table
protected RevalidationHandler fValidationHandler Validation handler represents validator instance.
Public Constructors
DOMNormalizer()
Public Methods
void characters(XMLString text, Augmentations augs)
Character content.
void comment(XMLString text, Augmentations augs)
A comment.
void doctypeDecl(String rootElement, String publicId, String systemId, Augmentations augs)
Notifies of the presence of the DOCTYPE line in the document.
void emptyElement(QName element, XMLAttributes attributes, Augmentations augs)
An empty element.
void endCDATA(Augmentations augs)
The end of a CDATA section.
void endDocument(Augmentations augs)
The end of the document.
void endElement(QName element, Augmentations augs)
The end of an element.
void endGeneralEntity(String name, Augmentations augs)
This method notifies the end of a general entity.
XMLDocumentSource getDocumentSource()
Returns the document source.
void ignorableWhitespace(XMLString text, Augmentations augs)
Ignorable whitespace.
final static void isAttrValueWF(DOMErrorHandler errorHandler, DOMErrorImpl error, DOMLocatorImpl locator, NamedNodeMap attributes, Attr a, String value, boolean xml11Version)
NON-DOM: check if attribute value is well-formed
final static void isCDataWF(DOMErrorHandler errorHandler, DOMErrorImpl error, DOMLocatorImpl locator, String datavalue, boolean isXML11Version)
Check if CDATA section is well-formed
final static void isCommentWF(DOMErrorHandler errorHandler, DOMErrorImpl error, DOMLocatorImpl locator, String datavalue, boolean isXML11Version)
NON-DOM: check if value of the comment is well-formed
final static void isXMLCharWF(DOMErrorHandler errorHandler, DOMErrorImpl error, DOMLocatorImpl locator, String datavalue, boolean isXML11Version)
NON-DOM: check for valid XML characters as per the XML version
void processingInstruction(String target, XMLString data, Augmentations augs)
A processing instruction.
final static void reportDOMError(DOMErrorHandler errorHandler, DOMErrorImpl error, DOMLocatorImpl locator, String message, short severity, String type)
Reports a DOM error to the user handler.
void setDocumentSource(XMLDocumentSource source)
Sets the document source.
void startCDATA(Augmentations augs)
The start of a CDATA section.
void startDocument(XMLLocator locator, String encoding, NamespaceContext namespaceContext, Augmentations augs)
The start of the document.
void startElement(QName element, XMLAttributes attributes, Augmentations augs)
The start of an element.
void startGeneralEntity(String name, XMLResourceIdentifier identifier, String encoding, Augmentations augs)
This method notifies the start of a general entity.
void textDecl(String version, String encoding, Augmentations augs)
Notifies of the presence of a TextDecl line in an entity.
void xmlDecl(String version, String encoding, String standalone, Augmentations augs)
Notifies of the presence of an XMLDecl line in the document.
Protected Methods
final void addNamespaceDecl(String prefix, String uri, ElementImpl element)
Adds a namespace attribute or replaces the value of existing namespace attribute with the given prefix and value for URI.
final void expandEntityRef(Node parent, Node reference)
final void namespaceFixUp(ElementImpl element, AttributeMap attributes)
void normalizeDocument(CoreDocumentImpl document, DOMConfigurationImpl config)
Normalizes document.
Node normalizeNode(Node node)
This method acts as if the document was going through a save and load cycle, putting the document in a "normal" form.
final void updateQName(Node node, QName qname)
Inherited Methods
From class java.lang.Object
From interface org.apache.xerces.xni.XMLDocumentHandler

Constants

protected static final boolean DEBUG

Debug namespace fix up algorithm

Constant Value: false

protected static final boolean DEBUG_EVENTS

Debug document handler events

Constant Value: false

protected static final boolean DEBUG_ND

Debug normalize document

Constant Value: false

protected static final String PREFIX

Prefix added by the namespace fixup algorithm; generated prefixes follow the pattern "NS" + index.

Constant Value: "NS"
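To see where such a generated prefix can show up, the sketch below gives an element a namespaced attribute with no prefix and then normalizes the document through the standard DOM API. It assumes the JDK's bundled DOM implementation; the class name, namespace URI, and attribute name are hypothetical. The exact index that is chosen depends on which prefixes are already bound, so the code only inspects the result rather than assuming a specific value.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class GeneratedPrefixSketch {

    static final String ATTR_NS = "http://example.org/attrs"; // made-up namespace

    /** Returns the root element after namespace fixup has run on it. */
    static Element normalizeWithUnprefixedAttribute() {
        Document doc = newDocument();
        Element root = doc.createElementNS(null, "root");
        doc.appendChild(root);

        // A namespaced attribute cannot rely on the default namespace,
        // so fixup has to bind some prefix to ATTR_NS.
        root.setAttributeNS(ATTR_NS, "kind", "demo");

        doc.normalizeDocument();
        return root;
    }

    /** Hypothetical helper: creates an empty, namespace-aware document. */
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = normalizeWithUnprefixedAttribute();
        Attr attr = root.getAttributeNodeNS(ATTR_NS, "kind");
        System.out.println("attribute name after fixup: " + attr.getName());
    }
}
```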

Fields

public static final XMLString EMPTY_STRING

Empty string to pass to the validator.

public static final RuntimeException abort

If the user stops the process, this exception will be thrown.

protected final DOMNormalizer.XMLAttributesProxy fAttrProxy

protected final Vector fAttributeList

list of attributes

protected DOMConfigurationImpl fConfiguration

protected Node fCurrentNode

for setting the PSVI

protected CoreDocumentImpl fDocument

protected DOMErrorHandler fErrorHandler

Error handler; may be null.

protected final NamespaceContext fLocalNSBinder

Stores all namespace bindings on the current element

protected final DOMLocatorImpl fLocator

DOM Locator - for namespace fixup algorithm

protected final NamespaceContext fNamespaceContext

The namespace context of this document: stores namespaces in scope

protected boolean fNamespaceValidation

protected boolean fPSVI

protected final QName fQName

protected SymbolTable fSymbolTable

symbol table

protected RevalidationHandler fValidationHandler

Validation handler represents validator instance.

Public Constructors

public DOMNormalizer ()

Public Methods

public void characters (XMLString text, Augmentations augs)

Character content.

Parameters
text The content.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void comment (XMLString text, Augmentations augs)

A comment.

Parameters
text The text in the comment.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by application to signal an error.

public void doctypeDecl (String rootElement, String publicId, String systemId, Augmentations augs)

Notifies of the presence of the DOCTYPE line in the document.

Parameters
rootElement The name of the root element.
publicId The public identifier if an external DTD or null if the external DTD is specified using SYSTEM.
systemId The system identifier if an external DTD, null otherwise.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void emptyElement (QName element, XMLAttributes attributes, Augmentations augs)

An empty element.

Parameters
element The name of the element.
attributes The element attributes.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void endCDATA (Augmentations augs)

The end of a CDATA section.

Parameters
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void endDocument (Augmentations augs)

The end of the document.

Parameters
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void endElement (QName element, Augmentations augs)

The end of an element.

Parameters
element The name of the element.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void endGeneralEntity (String name, Augmentations augs)

This method notifies the end of a general entity.

Note: This method is not called for entity references appearing as part of attribute values.

Parameters
name The name of the entity.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public XMLDocumentSource getDocumentSource ()

Returns the document source.

public void ignorableWhitespace (XMLString text, Augmentations augs)

Ignorable whitespace. For this method to be called, the document source must have some way of determining that text containing only whitespace characters should be considered ignorable. For example, the validator can determine whether a run of whitespace characters in the document is ignorable based on the element content model.

Parameters
text The ignorable whitespace.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public static final void isAttrValueWF (DOMErrorHandler errorHandler, DOMErrorImpl error, DOMLocatorImpl locator, NamedNodeMap attributes, Attr a, String value, boolean xml11Version)

NON-DOM: check if attribute value is well-formed

public static final void isCDataWF (DOMErrorHandler errorHandler, DOMErrorImpl error, DOMLocatorImpl locator, String datavalue, boolean isXML11Version)

Check if CDATA section is well-formed

Parameters
isXML11Version true if the document is XML 1.1

public static final void isCommentWF (DOMErrorHandler errorHandler, DOMErrorImpl error, DOMLocatorImpl locator, String datavalue, boolean isXML11Version)

NON-DOM: check if value of the comment is well-formed

Parameters
isXML11Version true if the document is XML 1.1

public static final void isXMLCharWF (DOMErrorHandler errorHandler, DOMErrorImpl error, DOMLocatorImpl locator, String datavalue, boolean isXML11Version)

NON-DOM: check for valid XML characters as per the XML version

Parameters
isXML11Version true if the document is XML 1.1
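
The version flag matters because XML 1.0 and XML 1.1 allow different character ranges. The sketch below is not the Xerces implementation; it is a standalone illustration that checks a string against the Char productions of the two specifications. The class and method names are hypothetical.

```java
public class XmlCharSketch {

    /** Tests a code point against the Char production of XML 1.0 or XML 1.1. */
    static boolean isXmlChar(int cp, boolean xml11) {
        boolean upperRanges = (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
        if (xml11) {
            // XML 1.1: [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
            return (cp >= 0x1 && cp <= 0xD7FF) || upperRanges;
        }
        // XML 1.0: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF) || upperRanges;
    }

    /** Returns the index of the first disallowed character, or -1 if there is none. */
    static int firstInvalidIndex(String data, boolean xml11) {
        for (int i = 0; i < data.length(); ) {
            // An unpaired surrogate is returned as-is and fails the range check.
            int cp = data.codePointAt(i);
            if (!isXmlChar(cp, xml11)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        String withControlChar = "a" + (char) 1 + "b";
        System.out.println(firstInvalidIndex(withControlChar, false));
        System.out.println(firstInvalidIndex(withControlChar, true));
    }
}
```

Under these productions, the control character U+0001 is rejected for XML 1.0 but accepted for XML 1.1, while U+0000 is rejected for both.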

public void processingInstruction (String target, XMLString data, Augmentations augs)

A processing instruction. Processing instructions consist of a target name and, optionally, text data. The data is only meaningful to the application.

Typically, a processing instruction's data will contain a series of pseudo-attributes. These pseudo-attributes follow the form of element attributes but are not parsed or presented to the application as anything other than text. The application is responsible for parsing the data.

Parameters
target The target.
data The data or null if none specified.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public static final void reportDOMError (DOMErrorHandler errorHandler, DOMErrorImpl error, DOMLocatorImpl locator, String message, short severity, String type)

Reports a DOM error to the user handler. If the error is fatal, processing is always aborted.
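
From the application side, the user handler is a DOMErrorHandler registered under the standard "error-handler" configuration parameter. The sketch below is a minimal illustration, assuming the JDK's bundled DOM implementation: it plants a comment containing "--", which is expected to be reported as a well-formedness problem during normalization, and records the type of each reported error. The class name and sample content are hypothetical. The error type is copied inside the callback because an implementation may reuse the DOMError object.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.DOMErrorHandler;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class ErrorHandlerSketch {

    /** Normalizes a document holding a malformed comment and returns the reported error types. */
    static List<String> collectErrorTypes() {
        Document doc = newDocument();
        Element root = doc.createElementNS(null, "root");
        doc.appendChild(root);

        // createComment does not validate its argument, so "--" slips into the tree.
        root.appendChild(doc.createComment("bad -- comment"));

        List<String> types = new ArrayList<>();
        // Returning true asks the implementation to continue after a non-fatal error.
        DOMErrorHandler handler = error -> {
            types.add(error.getType());
            return true;
        };

        DOMConfiguration config = doc.getDomConfig();
        config.setParameter("error-handler", handler);
        doc.normalizeDocument();
        return types;
    }

    /** Hypothetical helper: creates an empty, namespace-aware document. */
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reported error types: " + collectErrorTypes());
    }
}
```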

public void setDocumentSource (XMLDocumentSource source)

Sets the document source.

public void startCDATA (Augmentations augs)

The start of a CDATA section.

Parameters
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void startDocument (XMLLocator locator, String encoding, NamespaceContext namespaceContext, Augmentations augs)

The start of the document.

Parameters
locator The document locator, or null if the document location cannot be reported during the parsing of this document. However, it is strongly recommended that a locator be supplied that can at least report the system identifier of the document.
encoding The auto-detected IANA encoding name of the entity stream. This value will be null in those situations where the entity encoding is not auto-detected (e.g. internal entities or a document entity that is parsed from a java.io.Reader).
namespaceContext The namespace context in effect at the start of this document. This object represents the current context. Implementors of this class are responsible for copying the namespace bindings from the current context (and its parent contexts) if that information is important.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void startElement (QName element, XMLAttributes attributes, Augmentations augs)

The start of an element.

Parameters
element The name of the element.
attributes The element attributes.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void startGeneralEntity (String name, XMLResourceIdentifier identifier, String encoding, Augmentations augs)

This method notifies the start of a general entity.

Note: This method is not called for entity references appearing as part of attribute values.

Parameters
name The name of the general entity.
identifier The resource identifier.
encoding The auto-detected IANA encoding name of the entity stream. This value will be null in those situations where the entity encoding is not auto-detected (e.g. internal entities or a document entity that is parsed from a java.io.Reader).
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void textDecl (String version, String encoding, Augmentations augs)

Notifies of the presence of a TextDecl line in an entity. If present, this method will be called immediately following the startGeneralEntity call.

Note: This method will never be called for the document entity; it is only called for external general entities referenced in document content.

Note: This method is not called for entity references appearing as part of attribute values.

Parameters
version The XML version, or null if not specified.
encoding The IANA encoding name of the entity.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

public void xmlDecl (String version, String encoding, String standalone, Augmentations augs)

Notifies of the presence of an XMLDecl line in the document. If present, this method will be called immediately following the startDocument call.

Parameters
version The XML version.
encoding The IANA encoding name of the document, or null if not specified.
standalone The standalone value, or null if not specified.
augs Additional information that may include infoset augmentations
Throws
XNIException Thrown by handler to signal an error.

Protected Methods

protected final void addNamespaceDecl (String prefix, String uri, ElementImpl element)

Adds a namespace declaration attribute for the given prefix and URI, or replaces the value of an existing namespace attribute with that prefix. If the prefix is empty, the default namespace declaration is added or updated.

Throws
IOException
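
The behaviour described for addNamespaceDecl can be approximated with the public DOM API alone. The sketch below is not the Xerces code: it is a hypothetical helper with the same name that takes a plain Element instead of ElementImpl and relies on setAttributeNS to add or overwrite the declaration. The class name and the URN values are made up.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NamespaceDeclSketch {

    /** Adds or replaces xmlns:prefix="uri"; an empty prefix targets the default namespace. */
    static void addNamespaceDecl(String prefix, String uri, Element element) {
        String qualifiedName = prefix.isEmpty()
                ? XMLConstants.XMLNS_ATTRIBUTE                  // "xmlns"
                : XMLConstants.XMLNS_ATTRIBUTE + ":" + prefix;  // "xmlns:prefix"
        // setAttributeNS overwrites the value if the attribute already exists.
        element.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, qualifiedName, uri);
    }

    /** Declares a prefix, redeclares it with another URI, then sets the default namespace. */
    static Element demo() {
        Element root;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            root = doc.createElementNS(null, "root");
            doc.appendChild(root);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        addNamespaceDecl("p", "urn:one", root);
        addNamespaceDecl("p", "urn:two", root);   // replaces the earlier value
        addNamespaceDecl("", "urn:default", root); // default namespace declaration
        return root;
    }

    public static void main(String[] args) {
        Element root = demo();
        System.out.println("xmlns:p = " + root.getAttribute("xmlns:p"));
        System.out.println("xmlns = " + root.getAttribute("xmlns"));
    }
}
```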

protected final void expandEntityRef (Node parent, Node reference)

protected final void namespaceFixUp (ElementImpl element, AttributeMap attributes)

protected void normalizeDocument (CoreDocumentImpl document, DOMConfigurationImpl config)

Normalizes document. Note: reset() must be called before this method.

protected Node normalizeNode (Node node)

This method acts as if the document were going through a save and load cycle, putting the document in a "normal" form. The actual result depends on which features are set, since they govern the operations that actually take place. See setNormalizationFeature for details. Notably, this method normalizes Text nodes, makes the document "namespace well-formed" according to the namespace normalization algorithm by adding missing namespace declaration attributes and adding or changing namespace prefixes, updates the replacement tree of EntityReference nodes, normalizes attribute values, etc.

Parameters
node The node to normalize. The method yields a modified node or null; if a node is returned, normalization must be repeated starting from the returned node.
Returns
  • the normalized Node
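
To illustrate how the configured features change the outcome, the sketch below switches off the standard DOM Level 3 "comments" and "cdata-sections" parameters before normalizing. It goes through Document.normalizeDocument() rather than the protected normalizeNode method and assumes the JDK's bundled DOM implementation; the class name and sample content are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class NormalizationFeaturesSketch {

    /** Normalizes text + comment + CDATA children with comments and CDATA sections disabled. */
    static Element normalizeWithoutCommentsAndCData() {
        Element root;
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        root = doc.createElementNS(null, "root");
        doc.appendChild(root);
        root.appendChild(doc.createTextNode("a"));
        root.appendChild(doc.createComment("dropped"));
        root.appendChild(doc.createCDATASection("b"));

        DOMConfiguration config = doc.getDomConfig();
        config.setParameter("comments", Boolean.FALSE);       // discard Comment nodes
        config.setParameter("cdata-sections", Boolean.FALSE); // turn CDATA sections into Text

        doc.normalizeDocument();
        return root;
    }

    /** Reports whether the element has a direct child of the given node type. */
    static boolean hasChildOfType(Element parent, short nodeType) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == nodeType) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Element root = normalizeWithoutCommentsAndCData();
        System.out.println("children: " + root.getChildNodes().getLength());
        System.out.println("text: " + root.getTextContent());
    }
}
```

With both parameters left at their default value of true, the Comment and CDATASection nodes would be kept as they are.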

protected final void updateQName (Node node, QName qname)