public final class NumericUtils extends Object

java.lang.Object
↳	org.apache.lucene.util.NumericUtils

Class Overview

This helper class generates prefix-encoded representations of numerical values and supplies converters that represent float/double values as sortable integers/longs.

To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: The center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.
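The recursive split can be illustrated with a small standalone sketch. It is not the library source: the class `RangeSplitSketch`, its `SubRange` type, and the `split` method are hypothetical names, and the loop is one plausible way to peel off boundary sub-ranges at each precision level, assuming a 64-bit value space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names are not part of the Lucene API.
final class RangeSplitSketch {

    /** Values min..max (inclusive), matched with the lowest `shift` bits ignored. */
    static final class SubRange {
        final long min;
        final long max;
        final int shift;

        SubRange(long min, long max, int shift) {
            this.min = min;
            // Set the bits that were shifted away so max is the true upper end.
            this.max = max | ((1L << shift) - 1L);
            this.shift = shift;
        }

        /** Number of distinct prefix terms this sub-range spans at its precision. */
        long termCount() {
            return (max - min + 1) >>> shift;
        }
    }

    static List<SubRange> split(int precisionStep, long minBound, long maxBound) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        List<SubRange> result = new ArrayList<SubRange>();
        if (minBound > maxBound) {
            return result;
        }
        for (int shift = 0; ; shift += precisionStep) {
            long diff = 1L << (shift + precisionStep);
            long mask = ((1L << precisionStep) - 1L) << shift;
            boolean hasLower = (minBound & mask) != 0L;
            boolean hasUpper = (maxBound & mask) != mask;
            long nextMin = (hasLower ? minBound + diff : minBound) & ~mask;
            long nextMax = (hasUpper ? maxBound - diff : maxBound) & ~mask;
            boolean wrapped = nextMin < minBound || nextMax > maxBound;
            if (shift + precisionStep >= 64 || nextMin > nextMax || wrapped) {
                // No coarser level is available: emit what is left and stop.
                result.add(new SubRange(minBound, maxBound, shift));
                break;
            }
            // Boundaries are matched at the current (finer) precision.
            if (hasLower) result.add(new SubRange(minBound, minBound | mask, shift));
            if (hasUpper) result.add(new SubRange(maxBound & ~mask, maxBound, shift));
            // The center continues at the next (coarser) precision.
            minBound = nextMin;
            maxBound = nextMax;
        }
        return result;
    }

    public static void main(String[] args) {
        for (SubRange r : split(4, 3L, 1000L)) {
            System.out.println(r.min + ".." + r.max + " shift=" + r.shift);
        }
    }
}
```

For the range 3 to 1000 with a precision step of 4, this sketch yields five sub-ranges: two at full precision for the edges, two at shift 4, and one at shift 8 for the center. Together they span 53 prefix terms instead of 998 full-precision values.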

This class generates the terms needed for this. First, the integer values must be converted to strings: the 32-bit or 64-bit value is made unsigned, and its bits are written out as ASCII chars carrying 7 bits each. The resulting string sorts in the same order as the original integer value. Each string is also prefixed, in its first char, with the shift value (the number of bits removed) used during encoding.
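As a rough illustration of this encoding for 32-bit values, the following standalone sketch flips the sign bit, drops the low `shift` bits, and emits 7 bits per char behind a shift prefix. The class `PrefixCodeSketch` and its `encode` method are hypothetical names; only the prefix offset of 96 is taken from the documented `SHIFT_START_INT` constant value, and the exact layout is an assumption rather than the library source.

```java
// Hypothetical illustration only; names are not part of the Lucene API.
final class PrefixCodeSketch {

    /** First-char offset for int terms; 96 is the documented SHIFT_START_INT value. */
    static final char SHIFT_START_INT = 96;

    static String encode(int value, int shift) {
        if (shift < 0 || shift > 31) {
            throw new IllegalArgumentException("shift must be in 0..31");
        }
        // Each char carries 7 payload bits, so fewer chars are needed as shift grows.
        int nChars = (31 - shift) / 7 + 1;
        char[] buffer = new char[nChars + 1];
        buffer[0] = (char) (SHIFT_START_INT + shift);
        // Flipping the sign bit makes unsigned order match signed order;
        // the unsigned right shift then removes the low-precision bits.
        int sortableBits = (value ^ 0x80000000) >>> shift;
        for (int i = nChars; i >= 1; i--) {
            buffer[i] = (char) (sortableBits & 0x7f);
            sortableBits >>>= 7;
        }
        return new String(buffer);
    }

    public static void main(String[] args) {
        // String order follows numeric order at equal shift.
        System.out.println(encode(-5, 0).compareTo(encode(3, 0)) < 0);
    }
}
```

Because the shift prefix is the first char, terms of different precision never interleave when sorted, and two values that differ only in the removed bits produce the same term.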

To also index floating point numbers, this class supplies two methods that convert them to integer values by changing their bit layout: doubleToSortableLong(double) and floatToSortableInt(float). Converting floating point numbers to integers and back loses no precision; the integer form is only useful for comparison and is not meaningful as a number. Other data types such as dates can easily be converted to longs or ints (e.g. date to long: getTime()).
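The bit-layout change can be sketched as follows. This is a minimal illustration, not the library source: the class `SortableBitsSketch` and its methods are hypothetical names, and it assumes the common technique of inverting the 63 non-sign bits of negative values so that signed comparison of the result matches numeric order.

```java
// Hypothetical illustration only; names are not part of the Lucene API.
final class SortableBitsSketch {

    /** Maps a double to a long whose signed order matches the double's numeric order. */
    static long toSortableLong(double value) {
        long bits = Double.doubleToLongBits(value);
        // Negative doubles have the sign bit set, and their raw bits grow as the
        // magnitude grows. Inverting the other 63 bits reverses that order.
        if (bits < 0) {
            bits ^= 0x7fffffffffffffffL;
        }
        return bits;
    }

    /** Inverse mapping: the sign bit is untouched, so the same XOR undoes itself. */
    static double fromSortableLong(long sortable) {
        if (sortable < 0) {
            sortable ^= 0x7fffffffffffffffL;
        }
        return Double.longBitsToDouble(sortable);
    }

    public static void main(String[] args) {
        double[] ascending = {-1000.5, -1.0, 0.0, 2.5, 1.0e300};
        for (int i = 1; i < ascending.length; i++) {
            boolean ordered = toSortableLong(ascending[i - 1]) < toSortableLong(ascending[i]);
            System.out.println(ascending[i - 1] + " < " + ascending[i] + " : " + ordered);
        }
    }
}
```

The mapping only rearranges bits, which is why the round trip returns exactly the original value.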

For ease of use, the indexing side of the trie algorithm is implemented in NumericTokenStream, which can index int, long, float, and double values. For querying, NumericRangeQuery and NumericRangeFilter implement the query side for the same data types.

This class can also be used to generate lexicographically sortable representations (ordered according to String.compareTo(String)) of numeric data types for other purposes (e.g. sorting).
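To make the sorting use case concrete, the sketch below turns dates into full-precision string terms via getTime() and sorts them with plain string comparison. The class `SortableDateTermSketch` and its `term` method are hypothetical; only the prefix value 32 is taken from the documented `SHIFT_START_LONG` constant value, and the 7-bits-per-char layout is assumed from the class overview rather than copied from the library.

```java
import java.util.Arrays;
import java.util.Date;

// Hypothetical illustration only; names are not part of the Lucene API.
final class SortableDateTermSketch {

    /** First-char offset for long terms; 32 is the documented SHIFT_START_LONG value. */
    static final char SHIFT_START_LONG = 32;

    /** Encodes a date at full precision (shift 0) as a lexicographically sortable string. */
    static String term(Date date) {
        long value = date.getTime();
        // 1 prefix char plus 10 chars of 7 bits each is enough for 64 bits.
        char[] buffer = new char[11];
        buffer[0] = SHIFT_START_LONG;
        // Flip the sign bit so dates before the epoch sort before later dates.
        long sortableBits = value ^ 0x8000000000000000L;
        for (int i = 10; i >= 1; i--) {
            buffer[i] = (char) (sortableBits & 0x7f);
            sortableBits >>>= 7;
        }
        return new String(buffer);
    }

    public static void main(String[] args) {
        String[] terms = {
            term(new Date(1000000L)), term(new Date(-1000L)), term(new Date(0L))
        };
        // Natural String ordering is String.compareTo(String).
        Arrays.sort(terms);
        System.out.println(terms[0].equals(term(new Date(-1000L))));
    }
}
```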

NOTE: This API is experimental and might change in incompatible ways in the next release.

Summary

Nested Classes
class	NumericUtils.IntRangeBuilder	Expert: Callback for `splitIntRange(NumericUtils.IntRangeBuilder, int, int, int)`.
class	NumericUtils.LongRangeBuilder	Expert: Callback for `splitLongRange(NumericUtils.LongRangeBuilder, int, long, long)`.

Constants
int	BUF_SIZE_INT	Expert: The maximum term length (used for `char[]` buffer size) for encoding `int` values.
int	BUF_SIZE_LONG	Expert: The maximum term length (used for `char[]` buffer size) for encoding `long` values.
int	PRECISION_STEP_DEFAULT	The default precision step used by `NumericField`, `NumericTokenStream`, `NumericRangeQuery`, and `NumericRangeFilter`.
char	SHIFT_START_INT	Expert: Integers are stored at lower precision by shifting off lower bits.
char	SHIFT_START_LONG	Expert: Longs are stored at lower precision by shifting off lower bits.

Public Methods
static String	doubleToPrefixCoded(double val) Convenience method: this just returns: longToPrefixCoded(doubleToSortableLong(val))
static long	doubleToSortableLong(double val) Converts a `double` value to a sortable signed `long`.
static String	floatToPrefixCoded(float val) Convenience method: this just returns: intToPrefixCoded(floatToSortableInt(val))
static int	floatToSortableInt(float val) Converts a `float` value to a sortable signed `int`.
static String	intToPrefixCoded(int val, int shift) Expert: Returns prefix coded bits after reducing the precision by `shift` bits.
static int	intToPrefixCoded(int val, int shift, char[] buffer) Expert: Returns prefix coded bits after reducing the precision by `shift` bits.
static String	intToPrefixCoded(int val) Convenience method that returns the prefix coded bits of an int without reducing the precision.
static String	longToPrefixCoded(long val, int shift) Expert: Returns prefix coded bits after reducing the precision by `shift` bits.
static String	longToPrefixCoded(long val) Convenience method that returns the prefix coded bits of a long without reducing the precision.
static int	longToPrefixCoded(long val, int shift, char[] buffer) Expert: Returns prefix coded bits after reducing the precision by `shift` bits.
static double	prefixCodedToDouble(String val) Convenience method: this just returns: sortableLongToDouble(prefixCodedToLong(val))
static float	prefixCodedToFloat(String val) Convenience method: this just returns: sortableIntToFloat(prefixCodedToInt(val))
static int	prefixCodedToInt(String prefixCoded) Returns an int from prefixCoded characters.
static long	prefixCodedToLong(String prefixCoded) Returns a long from prefixCoded characters.
static float	sortableIntToFloat(int val) Converts a sortable `int` back to a `float`.
static double	sortableLongToDouble(long val) Converts a sortable `long` back to a `double`.
static void	splitIntRange(NumericUtils.IntRangeBuilder builder, int precisionStep, int minBound, int maxBound) Expert: Splits an int range recursively.
static void	splitLongRange(NumericUtils.LongRangeBuilder builder, int precisionStep, long minBound, long maxBound) Expert: Splits a long range recursively.

Inherited Methods

From class java.lang.Object

Constants

public static final int BUF_SIZE_INT

Expert: The maximum term length (used for char[] buffer size) for encoding int values.

public static final int BUF_SIZE_LONG

Expert: The maximum term length (used for char[] buffer size) for encoding long values.

public static final int PRECISION_STEP_DEFAULT

The default precision step used by NumericField, NumericTokenStream, NumericRangeQuery, and NumericRangeFilter.

Constant Value: 4 (0x00000004)

public static final char SHIFT_START_INT

Expert: Integers are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT+shift in the first character.

Constant Value: 96 (0x00000060)

public static final char SHIFT_START_LONG

Expert: Longs are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_LONG+shift in the first character.

Constant Value: 32 (0x00000020)

Public Methods

public static String doubleToPrefixCoded (double val)

Convenience method: this just returns: longToPrefixCoded(doubleToSortableLong(val))

public static long doubleToSortableLong (double val)

Converts a double value to a sortable signed long. The value is converted by taking its IEEE 754 floating-point "double format" bit layout and swapping some bits so that the results can be compared as longs. The precision is not reduced, and the value can easily be used as a long.

public static String floatToPrefixCoded (float val)

Convenience method: this just returns: intToPrefixCoded(floatToSortableInt(val))

public static int floatToSortableInt (float val)

Converts a float value to a sortable signed int. The value is converted by taking its IEEE 754 floating-point "float format" bit layout and swapping some bits so that the results can be compared as ints. The precision is not reduced, and the value can easily be used as an int.

public static String intToPrefixCoded (int val, int shift)

Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericUtils.IntRangeBuilder.

Parameters

val	the numeric value
shift	how many bits to strip from the right

public static int intToPrefixCoded (int val, int shift, char[] buffer)

Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.

Parameters

val	the numeric value
shift	how many bits to strip from the right
buffer	receives the encoded chars; its length must be at least `BUF_SIZE_INT`

Returns

number of chars written to buffer

public static String intToPrefixCoded (int val)

Convenience method that returns the prefix coded bits of an int without reducing the precision. It can be used to store the full-precision value as a stored field in the index.

To decode, use prefixCodedToInt(String).

public static String longToPrefixCoded (long val, int shift)

Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericUtils.LongRangeBuilder.

Parameters

val	the numeric value
shift	how many bits to strip from the right

public static String longToPrefixCoded (long val)

Convenience method that returns the prefix coded bits of a long without reducing the precision. It can be used to store the full-precision value as a stored field in the index.

To decode, use prefixCodedToLong(String).

public static int longToPrefixCoded (long val, int shift, char[] buffer)

Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.

Parameters

val	the numeric value
shift	how many bits to strip from the right
buffer	receives the encoded chars; its length must be at least `BUF_SIZE_LONG`

Returns

number of chars written to buffer

public static double prefixCodedToDouble (String val)

Convenience method: this just returns: sortableLongToDouble(prefixCodedToLong(val))

public static float prefixCodedToFloat (String val)

Convenience method: this just returns: sortableIntToFloat(prefixCodedToInt(val))

public static int prefixCodedToInt (String prefixCoded)

Returns an int from prefixCoded characters. Rightmost bits will be zero for lower precision codes. This method can be used to decode e.g. a stored field.

Throws

NumberFormatException	if the supplied string is not correctly prefix encoded.
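Decoding can be pictured as the reverse of the encoding described in the class overview. The sketch below is a standalone approximation, not the library source: `PrefixDecodeSketch` and `decode` are hypothetical names, the prefix offset 96 comes from the documented `SHIFT_START_INT` constant value, and the validation rules shown are assumptions about what "not correctly prefix encoded" might cover.

```java
// Hypothetical illustration only; names are not part of the Lucene API.
final class PrefixDecodeSketch {

    /** First-char offset for int terms; 96 is the documented SHIFT_START_INT value. */
    static final char SHIFT_START_INT = 96;

    static int decode(String prefixCoded) {
        // The first char stores the shift that was applied during encoding.
        int shift = prefixCoded.charAt(0) - SHIFT_START_INT;
        if (shift < 0 || shift > 31) {
            throw new NumberFormatException("invalid shift prefix for an int term");
        }
        int sortableBits = 0;
        for (int i = 1; i < prefixCoded.length(); i++) {
            char ch = prefixCoded.charAt(i);
            if (ch > 0x7f) {
                throw new NumberFormatException("char at position " + i + " is not 7-bit ASCII");
            }
            // Append the next 7 payload bits.
            sortableBits = (sortableBits << 7) | ch;
        }
        // Shifting back leaves the removed low bits as zero; the final XOR
        // restores the sign bit that was flipped for sorting.
        return (sortableBits << shift) ^ 0x80000000;
    }

    public static void main(String[] args) {
        String fullPrecisionOne = new String(new char[] {96, 8, 0, 0, 0, 1});
        System.out.println(decode(fullPrecisionOne));
    }
}
```

A term written with shift 7 decodes to a multiple of 128, which matches the note that the rightmost bits are zero for lower precision codes.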

public static long prefixCodedToLong (String prefixCoded)

Returns a long from prefixCoded characters. Rightmost bits will be zero for lower precision codes. This method can be used to decode e.g. a stored field.

Throws

NumberFormatException	if the supplied string is not correctly prefix encoded.

public static float sortableIntToFloat (int val)

Converts a sortable int back to a float.

public static double sortableLongToDouble (long val)

Converts a sortable long back to a double.

public static void splitIntRange (NumericUtils.IntRangeBuilder builder, int precisionStep, int minBound, int maxBound)

Expert: Splits an int range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its addRange(String, String) method.

This method is used by NumericRangeQuery.

public static void splitLongRange (NumericUtils.LongRangeBuilder builder, int precisionStep, long minBound, long maxBound)

Expert: Splits a long range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its addRange(String, String) method.

This method is used by NumericRangeQuery.
