public class XMLChar extends Object

java.lang.Object
   ↳ org.apache.xerces.util.XMLChar

Class Overview

This class defines the basic XML character properties. The data in this class can be used to verify that a character is a valid XML character, or to check whether it is a space, name start, or name character.

A series of convenience methods is supplied to ease the burden on the developer. Because inlining the checks can improve per-character performance, the tables of character properties are public. Using the character as an index into the CHARS array and applying the appropriate mask flag (e.g. MASK_VALID) yields the same results as calling the convenience methods. There is one exception: see the comments for the isValid method for details.
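
The table-plus-mask lookup described above can be sketched as follows. This is a self-contained stand-in, not the Xerces table: the class name MaskLookupSketch, its 128-entry TABLE, and both helper methods are hypothetical, and only the ASCII range is filled in. The two mask values are the ones listed under Constants.

```java
// Hypothetical stand-in for the public CHARS table, limited to ASCII.
final class MaskLookupSketch {
    static final int MASK_VALID = 0x01; // same value as documented for MASK_VALID
    static final int MASK_SPACE = 0x02; // same value as documented for MASK_SPACE

    // One flag byte per character; several properties share a byte as bit flags.
    static final byte[] TABLE = new byte[128];

    static {
        // XML 1.0 production [2], restricted to ASCII: tab, LF, CR, 0x20..0x7F.
        TABLE[0x09] |= MASK_VALID;
        TABLE[0x0A] |= MASK_VALID;
        TABLE[0x0D] |= MASK_VALID;
        for (int c = 0x20; c < TABLE.length; c++) {
            TABLE[c] |= MASK_VALID;
        }
        // XML 1.0 production [3]: space, tab, LF, CR.
        TABLE[0x20] |= MASK_SPACE;
        TABLE[0x09] |= MASK_SPACE;
        TABLE[0x0A] |= MASK_SPACE;
        TABLE[0x0D] |= MASK_SPACE;
    }

    // Convenience-method style: bounds check plus mask test.
    static boolean isSpace(int c) {
        return c >= 0 && c < TABLE.length && (TABLE[c] & MASK_SPACE) != 0;
    }

    // Inlined style: the same mask test written directly in the scanning loop.
    static int countSpaces(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < TABLE.length && (TABLE[c] & MASK_SPACE) != 0) {
                n++;
            }
        }
        return n;
    }
}
```

Both styles read the same flag byte, so they agree for every character the table covers; the inlined form only saves the method call.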

Summary

Constants
int MASK_CONTENT Content character mask.
int MASK_NAME Name character mask.
int MASK_NAME_START Name start character mask.
int MASK_NCNAME NCName character mask.
int MASK_NCNAME_START NCName start character mask.
int MASK_PUBID Pubid character mask.
int MASK_SPACE Space character mask.
int MASK_VALID Valid character mask.
Public Constructors
XMLChar()
Public Methods
static char highSurrogate(int c)
Returns the high surrogate of a supplemental character.
static boolean isContent(int c)
Returns true if the specified character can be considered content.
static boolean isHighSurrogate(int c)
Returns whether the given character is a high surrogate.
static boolean isInvalid(int c)
Returns true if the specified character is invalid.
static boolean isLowSurrogate(int c)
Returns whether the given character is a low surrogate.
static boolean isMarkup(int c)
Returns true if the specified character can be considered markup.
static boolean isNCName(int c)
Returns true if the specified character is a valid NCName character as defined by production [5] in Namespaces in XML recommendation.
static boolean isNCNameStart(int c)
Returns true if the specified character is a valid NCName start character as defined by production [4] in Namespaces in XML recommendation.
static boolean isName(int c)
Returns true if the specified character is a valid name character as defined by production [4] in the XML 1.0 specification.
static boolean isNameStart(int c)
Returns true if the specified character is a valid name start character as defined by production [5] in the XML 1.0 specification.
static boolean isPubid(int c)
Returns true if the specified character is a valid Pubid character as defined by production [13] in the XML 1.0 specification.
static boolean isSpace(int c)
Returns true if the specified character is a space character as defined by production [3] in the XML 1.0 specification.
static boolean isSupplemental(int c)
Returns true if the specified character is a supplemental character.
static boolean isValid(int c)
Returns true if the specified character is valid.
static boolean isValidIANAEncoding(String ianaEncoding)
Returns true if the encoding name is a valid IANA encoding.
static boolean isValidJavaEncoding(String javaEncoding)
Returns true if the encoding name is a valid Java encoding.
static boolean isValidNCName(String ncName)
Checks whether a string is a valid NCName according to production [4] of the XML Namespaces 1.0 Recommendation.
static boolean isValidName(String name)
Checks whether a string is a valid Name according to production [5] of the XML 1.0 Recommendation.
static boolean isValidNmtoken(String nmtoken)
Checks whether a string is a valid Nmtoken according to production [7] of the XML 1.0 Recommendation.
static char lowSurrogate(int c)
Returns the low surrogate of a supplemental character.
static int supplemental(char h, char l)
Returns the supplemental character corresponding to the given surrogates.
static String trim(String value)
Trims space characters as defined by production [3] in the XML 1.0 specification from both ends of the given string.
Inherited Methods
From class java.lang.Object

Constants

public static final int MASK_CONTENT

Content character mask. Special characters are those that can be considered the start of markup, such as '<' and '&'. The various newline characters are considered special as well. All other valid XML characters can be considered content.

This is an optimization for the inner loop of character scanning.

Constant Value: 32 (0x00000020)
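
The inner-loop use mentioned above can be sketched as a scan that runs forward until it meets a special character. This is an illustration only: the class ContentScanSketch and its methods are hypothetical, the set of special characters is limited to the ones named in the description ('<', '&', and the newline characters), and the validity check that the real mask also implies is left out.

```java
// Hypothetical sketch of a content-scanning inner loop.
final class ContentScanSketch {
    // Assumed set of special characters, taken from the description above.
    static boolean isSpecial(char c) {
        return c == '<' || c == '&' || c == '\n' || c == '\r';
    }

    // Returns the index of the first special character at or after 'start',
    // or text.length() if the rest of the text is plain content.
    static int skipContent(String text, int start) {
        int i = start;
        while (i < text.length() && !isSpecial(text.charAt(i))) {
            i++;
        }
        return i;
    }
}
```

A scanner built this way handles a whole run of content with one cheap test per character and only falls into slower code when a special character turns up.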

public static final int MASK_NAME

Name character mask.

Constant Value: 8 (0x00000008)

public static final int MASK_NAME_START

Name start character mask.

Constant Value: 4 (0x00000004)

public static final int MASK_NCNAME

NCName character mask.

Constant Value: 128 (0x00000080)

public static final int MASK_NCNAME_START

NCName start character mask.

Constant Value: 64 (0x00000040)

public static final int MASK_PUBID

Pubid character mask.

Constant Value: 16 (0x00000010)

public static final int MASK_SPACE

Space character mask.

Constant Value: 2 (0x00000002)

public static final int MASK_VALID

Valid character mask.

Constant Value: 1 (0x00000001)

Public Constructors

public XMLChar ()

Public Methods

public static char highSurrogate (int c)

Returns the high surrogate of a supplemental character.

Parameters
c The supplemental character to "split".

public static boolean isContent (int c)

Returns true if the specified character can be considered content.

Parameters
c The character to check.

public static boolean isHighSurrogate (int c)

Returns whether the given character is a high surrogate.

Parameters
c The character to check.

public static boolean isInvalid (int c)

Returns true if the specified character is invalid.

Parameters
c The character to check.

public static boolean isLowSurrogate (int c)

Returns whether the given character is a low surrogate.

Parameters
c The character to check.

public static boolean isMarkup (int c)

Returns true if the specified character can be considered markup. Markup characters include '<', '&', and '%'.

Parameters
c The character to check.

public static boolean isNCName (int c)

Returns true if the specified character is a valid NCName character as defined by production [5] in Namespaces in XML recommendation.

Parameters
c The character to check.

public static boolean isNCNameStart (int c)

Returns true if the specified character is a valid NCName start character as defined by production [4] in Namespaces in XML recommendation.

Parameters
c The character to check.

public static boolean isName (int c)

Returns true if the specified character is a valid name character as defined by production [4] in the XML 1.0 specification.

Parameters
c The character to check.

public static boolean isNameStart (int c)

Returns true if the specified character is a valid name start character as defined by production [5] in the XML 1.0 specification.

Parameters
c The character to check.

public static boolean isPubid (int c)

Returns true if the specified character is a valid Pubid character as defined by production [13] in the XML 1.0 specification.

Parameters
c The character to check.

public static boolean isSpace (int c)

Returns true if the specified character is a space character as defined by production [3] in the XML 1.0 specification.

Parameters
c The character to check.
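
Production [3] of XML 1.0 admits exactly four space characters: space (0x20), tab (0x9), line feed (0xA), and carriage return (0xD). A minimal sketch of that test, using a hypothetical class XmlSpaceSketch rather than the table lookup:

```java
// Hypothetical sketch of the XML 1.0 production [3] space test.
final class XmlSpaceSketch {
    static boolean isSpace(int c) {
        return c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D;
    }
}
```

This set is narrower than Java's own notion of whitespace: form feed (0x0C), for example, satisfies Character.isWhitespace but is not an XML space character.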

public static boolean isSupplemental (int c)

Returns true if the specified character is a supplemental character.

Parameters
c The character to check.
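
The surrogate and supplemental predicates on this page correspond to fixed UTF-16 ranges. The sketch below spells those ranges out; the class SurrogateRangesSketch is hypothetical and only mirrors the documented method names.

```java
// Hypothetical sketch of the UTF-16 range tests.
final class SurrogateRangesSketch {
    // High (leading) surrogates occupy 0xD800..0xDBFF.
    static boolean isHighSurrogate(int c) {
        return 0xD800 <= c && c <= 0xDBFF;
    }

    // Low (trailing) surrogates occupy 0xDC00..0xDFFF.
    static boolean isLowSurrogate(int c) {
        return 0xDC00 <= c && c <= 0xDFFF;
    }

    // Supplemental characters are the code points 0x10000..0x10FFFF.
    static boolean isSupplemental(int c) {
        return 0x10000 <= c && c <= 0x10FFFF;
    }
}
```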

public static boolean isValid (int c)

Returns true if the specified character is valid. This method also checks the surrogate character range from 0x10000 to 0x10FFFF.

A program that applies the mask directly to the CHARS array is itself responsible for checking the surrogate character range.

Parameters
c The character to check.
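
The extra range check can be sketched as follows. The class ValidCharSketch is hypothetical; isValidBmp stands in for the inlined table lookup and is written out from production [2] of XML 1.0 instead of reading the real table.

```java
// Hypothetical sketch: a 16-bit check plus the separate supplemental range.
final class ValidCharSketch {
    // Stand-in for the inlined table test on a character below 0x10000,
    // following XML 1.0 production [2].
    static boolean isValidBmp(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD);
    }

    // Full check: the range 0x10000..0x10FFFF has to be tested separately.
    static boolean isValid(int c) {
        return (c < 0x10000 && isValidBmp(c))
                || (c >= 0x10000 && c <= 0x10FFFF);
    }
}
```

Code that inlines only the first test rejects every character above 0xFFFF, or indexes past the table, which is why the second clause is needed.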

public static boolean isValidIANAEncoding (String ianaEncoding)

Returns true if the encoding name is a valid IANA encoding. This method does not verify that there is a decoder available for this encoding, only that the characters are valid for an IANA encoding name.

Parameters
ianaEncoding The IANA encoding name.
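
A character-level check of this kind might look like the sketch below. It assumes the rule is the EncName production [81] of XML 1.0, that is, an ASCII letter followed by ASCII letters, digits, '.', '_', or '-'; this page does not state which rule the method applies, and the class EncodingNameSketch is hypothetical.

```java
// Hypothetical sketch, assuming the XML 1.0 EncName production [81].
final class EncodingNameSketch {
    static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    static boolean isValidEncodingName(String name) {
        if (name == null || name.isEmpty() || !isAsciiLetter(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean ok = isAsciiLetter(c) || (c >= '0' && c <= '9')
                    || c == '.' || c == '_' || c == '-';
            if (!ok) {
                return false;
            }
        }
        return true;
    }
}
```

As with the documented method, a name that passes this test is only well formed; whether a decoder exists for it is a separate question.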

public static boolean isValidJavaEncoding (String javaEncoding)

Returns true if the encoding name is a valid Java encoding. This method does not verify that there is a decoder available for this encoding, only that the characters are valid for a Java encoding name.

Parameters
javaEncoding The Java encoding name.

public static boolean isValidNCName (String ncName)

Checks whether a string is a valid NCName according to production [4] of the XML Namespaces 1.0 Recommendation.

Parameters
ncName string to check
Returns
  • true if name is a valid NCName

public static boolean isValidName (String name)

Checks whether a string is a valid Name according to production [5] of the XML 1.0 Recommendation.

Parameters
name string to check
Returns
  • true if name is a valid Name

public static boolean isValidNmtoken (String nmtoken)

Checks whether a string is a valid Nmtoken according to production [7] of the XML 1.0 Recommendation.

Parameters
nmtoken string to check
Returns
  • true if nmtoken is a valid Nmtoken
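
The three string checks differ only in which characters may appear where. The sketch below shows that structure with a deliberately simplified, ASCII-only character set: the class NameCheckSketch is hypothetical, and the real productions also admit many non-ASCII letters, combining characters, and extenders.

```java
// Hypothetical ASCII-only sketch of the Name, NCName, and Nmtoken checks.
final class NameCheckSketch {
    static boolean isNameStart(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || c == '_' || c == ':';
    }

    static boolean isName(char c) {
        return isNameStart(c) || (c >= '0' && c <= '9') || c == '.' || c == '-';
    }

    // Nmtoken: one or more name characters, in any position.
    static boolean isValidNmtoken(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!isName(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Name: an Nmtoken whose first character is a name start character.
    static boolean isValidName(String s) {
        return isValidNmtoken(s) && isNameStart(s.charAt(0));
    }

    // NCName: a Name that contains no colon.
    static boolean isValidNCName(String s) {
        return isValidName(s) && s.indexOf(':') < 0;
    }
}
```

Under these rules "xs:element" is a Name but not an NCName, and "1abc" is an Nmtoken but not a Name.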

public static char lowSurrogate (int c)

Returns the low surrogate of a supplemental character.

Parameters
c The supplemental character to "split".

public static int supplemental (char h, char l)

Returns the supplemental character corresponding to the given surrogates.

Parameters
h The high surrogate.
l The low surrogate.
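
Splitting a supplemental character into surrogates and joining them again is plain UTF-16 arithmetic. The sketch below shows the round trip; the class SurrogateMathSketch is hypothetical and only mirrors the documented method names, and none of its methods validates its input.

```java
// Hypothetical sketch of the UTF-16 split and join arithmetic.
final class SurrogateMathSketch {
    // Top 10 bits of (c - 0x10000), offset into the high surrogate range.
    static char highSurrogate(int c) {
        return (char) (((c - 0x10000) >> 10) + 0xD800);
    }

    // Bottom 10 bits of (c - 0x10000), offset into the low surrogate range.
    static char lowSurrogate(int c) {
        return (char) (((c - 0x10000) & 0x3FF) + 0xDC00);
    }

    // Inverse of the two methods above.
    static int supplemental(char h, char l) {
        return (h - 0xD800) * 0x400 + (l - 0xDC00) + 0x10000;
    }
}
```

For U+1F600 the split yields 0xD83D and 0xDE00, and joining those two values gives 0x1F600 back.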

public static String trim (String value)

Trims space characters as defined by production [3] in the XML 1.0 specification from both ends of the given string.

Parameters
value the string to be trimmed
Returns
  • the given string with the space characters trimmed from both ends
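
A trim that honors production [3] can be sketched as below. The class XmlTrimSketch is hypothetical; it removes only space, tab, line feed, and carriage return, so its result can differ from String.trim(), which strips every character up to U+0020.

```java
// Hypothetical sketch of trimming XML space characters only.
final class XmlTrimSketch {
    static boolean isSpace(char c) {
        return c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D;
    }

    static String trim(String value) {
        int start = 0;
        int end = value.length();
        // Advance past leading XML spaces.
        while (start < end && isSpace(value.charAt(start))) {
            start++;
        }
        // Back up over trailing XML spaces.
        while (end > start && isSpace(value.charAt(end - 1))) {
            end--;
        }
        return value.substring(start, end);
    }
}
```

A leading form feed shows the difference: this sketch keeps it, while String.trim() removes it.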