public class HTMLSerializer extends BaseMarkupSerializer

java.lang.Object
   ↳ org.apache.xml.serialize.BaseMarkupSerializer
     ↳ org.apache.xml.serialize.HTMLSerializer
Known Direct Subclasses

This class is deprecated.
This class was deprecated in Xerces 2.6.2. It is recommended that new applications use JAXP's Transformation API for XML (TrAX) for serializing HTML. See the Xerces documentation for more information.
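
A minimal sketch of the recommended TrAX route, using only the JAXP classes that ship with the JDK. The class name TraxHtmlExample and its helper methods are hypothetical names chosen for illustration. The output properties shown are one reasonable configuration, not a required one.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class TraxHtmlExample {

    /** Serializes a DOM document as HTML through an identity transformer. */
    static String toHtml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.INDENT, "no");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("HTML serialization failed", e);
        }
    }

    /** Builds a small sample document (hypothetical content for the demo). */
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element p = doc.createElement("p");
            p.setTextContent("Fish & chips");
            body.appendChild(p);
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(sampleDocument()));
    }
}
```

Writing to a StringWriter keeps the demo self-contained. A StreamResult wrapping an OutputStream or a file would be used in the same way.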

Class Overview

Implements an HTML/XHTML serializer supporting both DOM and SAX pretty serializing. HTML/XHTML mode is determined in the constructor. For usage instructions see Serializer.

If an output stream is used, the encoding is taken from the output format (defaults to UTF-8). If a writer is used, make sure the writer uses the same encoding (if applicable) as specified in the output format.

The serializer supports both DOM and SAX. DOM serializing is done by calling serialize(Document) and SAX serializing is done by firing SAX events and using the serializer as a document handler.

If an I/O exception occurs while serializing, the serializer will not throw the exception immediately. It throws the exception only at the end of serializing (either at the end of DOM serializing or in SAX's endDocument()).
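
The deferred-exception behaviour described above can be illustrated with a small standalone sketch. DeferredIOExample, DeferredPrinter and failingWriter are hypothetical names. This is not the serializer's actual code, only one way such a pattern might be written.

```java
import java.io.IOException;
import java.io.Writer;

public class DeferredIOExample {

    /** Hypothetical printer that records the first I/O failure instead of throwing it. */
    static final class DeferredPrinter {
        private final Writer out;
        private IOException pending; // first failure seen, if any

        DeferredPrinter(Writer out) {
            this.out = out;
        }

        /** Writes text; on failure the exception is stored and later writes are skipped. */
        void print(String text) {
            if (pending != null) {
                return;
            }
            try {
                out.write(text);
            } catch (IOException e) {
                pending = e;
            }
        }

        /** Called once at the end of serializing; rethrows a stored failure. */
        void finish() throws IOException {
            if (pending != null) {
                throw pending;
            }
            out.flush();
        }
    }

    /** A writer that always fails, used only to exercise the error path. */
    static Writer failingWriter() {
        return new Writer() {
            @Override
            public void write(char[] buf, int off, int len) throws IOException {
                throw new IOException("disk full");
            }

            @Override
            public void flush() {
            }

            @Override
            public void close() {
            }
        };
    }

    public static void main(String[] args) {
        DeferredPrinter printer = new DeferredPrinter(failingWriter());
        printer.print("<p>");  // fails internally, but nothing is thrown here
        printer.print("text"); // skipped because a failure is already pending
        try {
            printer.finish();  // the stored exception surfaces only now
        } catch (IOException e) {
            System.out.println("reported at end: " + e.getMessage());
        }
    }
}
```

The practical consequence for callers is that SAX event methods completing without error does not prove the output was written. The result is only known once serializing has finished.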

For elements that are not specified as whitespace preserving, the serializer may break long text lines at space boundaries, indent lines, and serialize elements on separate lines. Line terminators are treated as spaces, and spaces at the beginning of a line are stripped.

XHTML differs slightly from HTML:

  • Element/attribute names are lower case and case matters
  • Attributes must specify value, even if empty string
  • Empty elements must have '/' in empty tag
  • Contents of SCRIPT and STYLE elements serialized as CDATA
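
The empty-tag rule in the list above can be observed with the JDK's own JAXP transformer, used here as a stand-in for this deprecated class. The sketch serializes the same DOM with the "html" and "xml" output methods. The "xml" method only approximates XHTML output, and EmptyTagComparison and its helpers are hypothetical names.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class EmptyTagComparison {

    /** Serializes the document with the given JAXP output method ("html" or "xml"). */
    static String serialize(Document doc, String method) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, method);
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.INDENT, "no");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    /** Builds a paragraph containing an empty br element (demo content only). */
    static Document lineBreakDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element p = doc.createElement("p");
            p.appendChild(doc.createTextNode("one"));
            p.appendChild(doc.createElement("br"));
            p.appendChild(doc.createTextNode("two"));
            doc.appendChild(p);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser available", e);
        }
    }

    public static void main(String[] args) {
        Document doc = lineBreakDocument();
        System.out.println("html: " + serialize(doc, "html")); // br written without a slash
        System.out.println("xml:  " + serialize(doc, "xml"));  // br written as a self-closing tag
    }
}
```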

See Also

Summary

Constants
String XHTMLNamespace
Inherited Fields
From class org.apache.xml.serialize.BaseMarkupSerializer
Public Constructors
HTMLSerializer()
Constructs a new serializer.
HTMLSerializer(OutputFormat format)
Constructs a new serializer.
HTMLSerializer(Writer writer, OutputFormat format)
Constructs a new serializer that writes to the specified writer using the specified output format.
HTMLSerializer(OutputStream output, OutputFormat format)
Constructs a new serializer that writes to the specified output stream using the specified output format.
Protected Constructors
HTMLSerializer(boolean xhtml, OutputFormat format)
Constructs a new HTML/XHTML serializer depending on the value of xhtml.
Public Methods
void characters(char[] chars, int start, int length)
void endElement(String tagName)
void endElement(String namespaceURI, String localName, String rawName)
void endElementIO(String namespaceURI, String localName, String rawName)
void setOutputFormat(OutputFormat format)
Specifies an output format for this serializer.
void setXHTMLNamespace(String newNamespace)
void startElement(String tagName, AttributeList attrs)
void startElement(String namespaceURI, String localName, String rawName, Attributes attrs)
Protected Methods
void characters(String text)
Called to print the text contents in the prevailing element format.
String escapeURI(String uri)
String getEntityRef(int ch)
Returns the suitable entity reference for this character value, or null if no such entity exists.
void serializeElement(Element elem)
Called to serialize a DOM element.
void startDocument(String rootTagName)
Called to serialize the document's DOCTYPE by the root element.
Inherited Methods
From class org.apache.xml.serialize.BaseMarkupSerializer
From class java.lang.Object
From interface org.apache.xml.serialize.DOMSerializer
From interface org.apache.xml.serialize.Serializer
From interface org.xml.sax.ContentHandler
From interface org.xml.sax.DTDHandler
From interface org.xml.sax.DocumentHandler
From interface org.xml.sax.ext.DeclHandler
From interface org.xml.sax.ext.LexicalHandler

Constants

public static final String XHTMLNamespace

Constant Value: "http://www.w3.org/1999/xhtml"

Public Constructors

public HTMLSerializer ()

Constructs a new serializer. The serializer cannot be used without calling setOutputCharStream(Writer) or setOutputByteStream(OutputStream) first.

public HTMLSerializer (OutputFormat format)

Constructs a new serializer. The serializer cannot be used without calling setOutputCharStream(Writer) or setOutputByteStream(OutputStream) first.

public HTMLSerializer (Writer writer, OutputFormat format)

Constructs a new serializer that writes to the specified writer using the specified output format. If format is null, will use a default output format.

Parameters
writer The writer to use
format The output format to use, null for the default

public HTMLSerializer (OutputStream output, OutputFormat format)

Constructs a new serializer that writes to the specified output stream using the specified output format. If format is null, will use a default output format.

Parameters
output The output stream to use
format The output format to use, null for the default

Protected Constructors

protected HTMLSerializer (boolean xhtml, OutputFormat format)

Constructs a new HTML/XHTML serializer depending on the value of xhtml. The serializer cannot be used without calling setOutputCharStream(Writer) or setOutputByteStream(OutputStream) first.

Parameters
xhtml True if XHTML serializing

Public Methods

public void characters (char[] chars, int start, int length)

Throws
SAXException

public void endElement (String tagName)

Throws
SAXException

public void endElement (String namespaceURI, String localName, String rawName)

Throws
SAXException

public void endElementIO (String namespaceURI, String localName, String rawName)

Throws
IOException

public void setOutputFormat (OutputFormat format)

Specifies an output format for this serializer. If the serializer has already been associated with an output format, it will switch to the new format. This method should not be called while the serializer is in the process of serializing a document.

Parameters
format The output format to use

public void setXHTMLNamespace (String newNamespace)

public void startElement (String tagName, AttributeList attrs)

Throws
SAXException

public void startElement (String namespaceURI, String localName, String rawName, Attributes attrs)

Throws
SAXException

Protected Methods

protected void characters (String text)

Called to print the text contents in the prevailing element format. Since this method is capable of printing text as CDATA, it is used for that purpose as well. White space handling is determined by the current element state. In addition, the output format can dictate whether the text is printed as CDATA or unescaped.

Parameters
text The text to print
Throws
IOException

protected String escapeURI (String uri)

protected String getEntityRef (int ch)

Returns the suitable entity reference for this character value, or null if no such entity exists. Calling this method with '&' will return "&amp;".

Parameters
ch Character value
Returns
  • Character entity name, or null
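
A minimal sketch of the kind of lookup this method describes: a character value maps to an entity name, or to null when no entity exists. The class EntityRefSketch, its tiny table and the escape helper are all hypothetical. The real serializer's table and return format are not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

public class EntityRefSketch {

    // Hypothetical, deliberately tiny table; a real HTML entity table is far larger.
    private static final Map<Integer, String> ENTITIES = new HashMap<>();

    static {
        ENTITIES.put((int) '&', "amp");
        ENTITIES.put((int) '<', "lt");
        ENTITIES.put((int) '>', "gt");
        ENTITIES.put((int) '"', "quot");
        ENTITIES.put(0xA0, "nbsp");
    }

    /** Returns the entity name for the character value, or null if none is known. */
    static String entityName(int ch) {
        return ENTITIES.get(ch);
    }

    /** Replaces every character that has a known entity with an entity reference. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int ch = text.codePointAt(i);
            String name = entityName(ch);
            if (name != null) {
                out.append('&').append(name).append(';');
            } else {
                out.appendCodePoint(ch);
            }
            i += Character.charCount(ch);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a < b & c"));
    }
}
```

Returning null for unknown characters lets the caller decide between writing the character directly and falling back to a numeric character reference.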

protected void serializeElement (Element elem)

Called to serialize a DOM element. Equivalent to calling startElement(String, String, String, Attributes), endElement(String) and serializing everything in between, but better optimized.

Parameters
elem The element to serialize
Throws
IOException

protected void startDocument (String rootTagName)

Called to serialize the document's DOCTYPE by the root element. The document type declaration must name the root element, but the root element is only known when that element is serialized, and not at the start of the document.

This method will check if it has not been called before (_started), will serialize the document type declaration, and will serialize all pre-root comments and PIs that were accumulated in the document (see serializePreRoot()). Pre-root will be serialized even if this is not the first root element of the document.

Throws
IOException