public final class UCharacterProperty
extends Object
implements Trie.DataManipulate
java.lang.Object
   ↳ sun.text.normalizer.UCharacterProperty

Class Overview

Internal class used for Unicode character property database.

This class stores binary data read from uprops.icu. It cannot parse the data into higher-level information; it only returns bytes of information when required.

Because of the form most commonly used for retrieval, an array of char is used to store the binary data.

UCharacterPropertyDB also contains information on accessing indexes to significant points in the binary data.

Responsibility for molding the binary data into a more meaningful form lies with UCharacter.

Summary

Constants
int EXCEPTION_MASK Exception test mask
int EXC_CASE_FOLDING_ Exception indicator for case folding type
int EXC_COMBINING_CLASS_ EXC_COMBINING_CLASS_ is not found in ICU.
int EXC_DENOMINATOR_VALUE_ Exception indicator for denominator type
int EXC_LOWERCASE_ Exception indicator for lowercase type
int EXC_MIRROR_MAPPING_ Exception indicator for mirror type
int EXC_NUMERIC_VALUE_ Exception indicator for numeric type
int EXC_SPECIAL_CASING_ Exception indicator for special casing type
int EXC_TITLECASE_ Exception indicator for titlecase type
int EXC_UNUSED_ Exception indicator for digit type
int EXC_UPPERCASE_ Exception indicator for uppercase type
char LATIN_SMALL_LETTER_I_ Latin lowercase i
int TYPE_MASK Character type mask
Fields
public int[] m_property_ Character property table
public char[] m_trieData_ Optimization CharTrie data array
public char[] m_trieIndex_ Optimization CharTrie index array
public int m_trieInitialValue_ Optimization CharTrie data offset
public CharTrie m_trie_ Trie data
public VersionInfo m_unicodeVersion_ Unicode version
Public Methods
UnicodeSet addPropertyStarts(UnicodeSet set)
int getAdditional(int codepoint)
Gets the additional Unicode properties.
VersionInfo getAge(int codepoint)

Get the "age" of the code point.

int getException(int index, int etype)
Gets the exception value at the index, assuming that data type is available.
static int getExceptionIndex(int prop)
Gets the exception index for the given property.
void getFoldCase(int index, int count, StringBuffer str)
Gets the folded case value at the index
int getFoldingOffset(int value)
Called by com.ibm.icu.util.Trie to extract from a lead surrogate's data the index array offset of the indexes for that lead surrogate.
UnicodeSet getInclusions()
static UCharacterProperty getInstance()
Loads the property data and initializes the UCharacterProperty instance.
int getProperty(int ch)
Gets the property value at the index.
static int getRawSupplementary(char lead, char trail)
Forms a supplementary code point from the argument surrogate characters.
Note: this is for internal use, so the validity of the surrogate characters is not checked.
static int getSignedValue(int prop)
Gets the signed numeric value of a character embedded in the property argument.
boolean hasExceptionValue(int index, int indicator)
Determines whether the exception value passed in has the kind of information the indicator asks for, e.g. whether the exception value contains the digit value of the character.
static boolean isRuleWhiteSpace(int c)
Checks if the argument c is to be treated as a white space in ICU rules.
void setIndexData(CharTrie.FriendAgent friendagent)
Java friends implementation
Inherited Methods
From class java.lang.Object
From interface sun.text.normalizer.Trie.DataManipulate

Constants

public static final int EXCEPTION_MASK

Exception test mask

Constant Value: 32 (0x00000020)

public static final int EXC_CASE_FOLDING_

Exception indicator for case folding type

Constant Value: 8 (0x00000008)

public static final int EXC_COMBINING_CLASS_

EXC_COMBINING_CLASS_ is not found in ICU. Used to retrieve the combining class of the character in the exception value

Constant Value: 9 (0x00000009)

public static final int EXC_DENOMINATOR_VALUE_

Exception indicator for denominator type

Constant Value: 5 (0x00000005)

public static final int EXC_LOWERCASE_

Exception indicator for lowercase type

Constant Value: 1 (0x00000001)

public static final int EXC_MIRROR_MAPPING_

Exception indicator for mirror type

Constant Value: 6 (0x00000006)

public static final int EXC_NUMERIC_VALUE_

Exception indicator for numeric type

Constant Value: 4 (0x00000004)

public static final int EXC_SPECIAL_CASING_

Exception indicator for special casing type

Constant Value: 7 (0x00000007)

public static final int EXC_TITLECASE_

Exception indicator for titlecase type

Constant Value: 2 (0x00000002)

public static final int EXC_UNUSED_

Exception indicator for digit type

Constant Value: 3 (0x00000003)

public static final int EXC_UPPERCASE_

Exception indicator for uppercase type

Constant Value: 0 (0x00000000)

public static final char LATIN_SMALL_LETTER_I_

Latin lowercase i

Constant Value: 105 (0x00000069)

public static final int TYPE_MASK

Character type mask

Constant Value: 31 (0x0000001f)
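
The two mask constants suggest how a property word is split up. The following is a minimal sketch, assuming the masks are applied with a bitwise AND to the int returned by getProperty; the class name PropertyWordSketch, its helper methods, and the sample word are hypothetical and are not part of UCharacterProperty.

```java
// Illustrative only: PropertyWordSketch and the sample word are hypothetical.
final class PropertyWordSketch {
    // Values copied from the constants documented above.
    static final int TYPE_MASK = 0x1f;      // 31
    static final int EXCEPTION_MASK = 0x20; // 32

    /** Returns the low five bits, assumed here to hold the character type. */
    static int typeOf(int prop) {
        return prop & TYPE_MASK;
    }

    /** Returns true when the exception test bit is set in the property word. */
    static boolean hasException(int prop) {
        return (prop & EXCEPTION_MASK) != 0;
    }

    public static void main(String[] args) {
        int sample = 0x25; // hypothetical property word, not real uprops.icu data
        System.out.println("type = " + typeOf(sample));
        System.out.println("exception bit set = " + hasException(sample));
    }
}
```

Because TYPE_MASK covers bits 0 to 4 and EXCEPTION_MASK is bit 5, the two tests never interfere with each other.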

Fields

public int[] m_property_

Character property table

public char[] m_trieData_

Optimization CharTrie data array

public char[] m_trieIndex_

Optimization CharTrie index array

public int m_trieInitialValue_

Optimization CharTrie data offset

public CharTrie m_trie_

Trie data

public VersionInfo m_unicodeVersion_

Unicode version

Public Methods

public UnicodeSet addPropertyStarts (UnicodeSet set)

public int getAdditional (int codepoint)

Gets the additional Unicode properties. Corresponds to getUnicodeProperties in the C version.

Parameters
codepoint code point whose additional properties are to be retrieved
Returns
  • Unicode properties

public VersionInfo getAge (int codepoint)

Get the "age" of the code point.

The "age" is the Unicode version when the code point was first designated (as a non-character or for Private Use) or assigned a character.

This can be useful to avoid emitting code points to receiving processes that do not accept newer characters.

The data is from the UCD file DerivedAge.txt.

This API does not check the validity of the codepoint.

Parameters
codepoint The code point.
Returns
  • the Unicode version number

public int getException (int index, int etype)

Gets the exception value at the index, assuming that data type is available. Result is undefined if data is not available. Use hasExceptionValue() to determine data's availability.

Parameters
etype exception data type
Returns
  • exception data type value at index
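
The advice to call hasExceptionValue() first can be sketched as a guarded read. This is only an illustration: ExceptionLookup is a hypothetical stand-in interface that mirrors the two documented method signatures so the example compiles without the internal class, and GuardedExceptionRead and lowercaseOr are made-up names.

```java
// Hypothetical stand-in for the two documented methods; not the real class.
interface ExceptionLookup {
    boolean hasExceptionValue(int index, int indicator);
    int getException(int index, int etype);
}

final class GuardedExceptionRead {
    // Value copied from the EXC_LOWERCASE_ constant documented above.
    static final int EXC_LOWERCASE_ = 1;

    /**
     * Returns the lowercase exception value at index, or fallback when the
     * lookup reports that no such value is stored. getException is only
     * called after hasExceptionValue has confirmed availability, because
     * its result is documented as undefined otherwise.
     */
    static int lowercaseOr(ExceptionLookup props, int index, int fallback) {
        if (props.hasExceptionValue(index, EXC_LOWERCASE_)) {
            return props.getException(index, EXC_LOWERCASE_);
        }
        return fallback;
    }
}
```

The same guard applies to any of the EXC_ indicators; only the constant passed to both calls changes.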

public static int getExceptionIndex (int prop)

Gets the exception index for the given property.

Parameters
prop character property
Returns
  • exception index

public void getFoldCase (int index, int count, StringBuffer str)

Gets the folded case value at the index

Parameters
index index of the case value to be retrieved
count number of characters to retrieve
str string buffer to which to append the result

public int getFoldingOffset (int value)

Called by com.ibm.icu.util.Trie to extract from a lead surrogate's data the index array offset of the indexes for that lead surrogate.

Parameters
value data value for a surrogate from the trie, including the folding offset
Returns
  • data offset or 0 if there is no data for the lead surrogate

public UnicodeSet getInclusions ()

public static UCharacterProperty getInstance ()

Loads the property data and initializes the UCharacterProperty instance.

Throws
RuntimeException when data is missing or data has been corrupted

public int getProperty (int ch)

Gets the property value at the index. This is optimized. Note that this differs slightly from CharTrie: the index into m_trieData_ is never negative.

Parameters
ch code point whose property value is to be retrieved
Returns
  • property value of code point

public static int getRawSupplementary (char lead, char trail)

Forms a supplementary code point from the argument surrogate characters.
Note: this is for internal use, so the validity of the surrogate characters is not checked.

Parameters
lead lead surrogate character
trail trailing surrogate character
Returns
  • code point of the supplementary character
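
For reference, the standard UTF-16 arithmetic for combining a surrogate pair looks like the sketch below. It is not taken from this class's source; RawSupplementarySketch and rawSupplementary are hypothetical names, and, like the documented method, the sketch performs no validity checks on its arguments.

```java
// Illustrative only: a standalone version of the usual UTF-16 pair arithmetic.
final class RawSupplementarySketch {
    private static final int LEAD_BASE = 0xD800;          // first lead surrogate
    private static final int TRAIL_BASE = 0xDC00;         // first trail surrogate
    private static final int SUPPLEMENTARY_BASE = 0x10000; // first supplementary code point

    /** Combines lead and trail without checking that they are surrogates. */
    static int rawSupplementary(char lead, char trail) {
        return SUPPLEMENTARY_BASE
                + ((lead - LEAD_BASE) << 10)
                + (trail - TRAIL_BASE);
    }

    public static void main(String[] args) {
        int cp = rawSupplementary((char) 0xD83D, (char) 0xDE00);
        System.out.println(Integer.toHexString(cp));
    }
}
```

For a valid pair this gives the same result as java.lang.Character.toCodePoint(lead, trail), which likewise does not validate its arguments.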

public static int getSignedValue (int prop)

Gets the signed numeric value of a character embedded in the property argument.

Parameters
prop the character
Returns
  • signed numeric value

public boolean hasExceptionValue (int index, int indicator)

Determines whether the exception value passed in has the kind of information the indicator asks for, e.g. whether the exception value contains the digit value of the character.

Parameters
index exception index
indicator type indicator
Returns
  • true if the type value exists

public static boolean isRuleWhiteSpace (int c)

Checks if the argument c is to be treated as a white space in ICU rules. Usually ICU rule white spaces are ignored unless quoted.

Parameters
c codepoint to check
Returns
  • true if c is an ICU white space
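
This page does not list which code points count as rule white space. The sketch below assumes the set is the fixed Pattern_White_Space list from UAX #31 (U+0009..U+000D, U+0020, U+0085, U+200E..U+200F, U+2028..U+2029); that is an assumption about the implementation, and RuleWhiteSpaceSketch is a hypothetical name.

```java
// Illustrative only: assumes rule white space equals Pattern_White_Space.
final class RuleWhiteSpaceSketch {
    /** Tests c against the fixed list assumed in the lead-in above. */
    static boolean isRuleWhiteSpace(int c) {
        return c >= 0x0009 && c <= 0x2029
                && (c <= 0x000D      // TAB, LF, VT, FF, CR
                    || c == 0x0020   // SPACE
                    || c == 0x0085   // NEXT LINE
                    || c == 0x200E   // LEFT-TO-RIGHT MARK
                    || c == 0x200F   // RIGHT-TO-LEFT MARK
                    || c >= 0x2028); // LINE SEPARATOR, PARAGRAPH SEPARATOR
    }

    public static void main(String[] args) {
        System.out.println(isRuleWhiteSpace(' '));
        System.out.println(isRuleWhiteSpace(0x00A0)); // NO-BREAK SPACE is not in the list
    }
}
```

Under this assumption the test is a fixed range check and does not depend on the loaded property data, which is consistent with the method being static.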

public void setIndexData (CharTrie.FriendAgent friendagent)

Java friends implementation