public class

OpenBitSet

extends DocIdSet
implements Serializable Cloneable
java.lang.Object
   ↳ org.apache.lucene.search.DocIdSet
     ↳ org.apache.lucene.util.OpenBitSet
Known Direct Subclasses

Class Overview

An "open" BitSet implementation that allows direct access to the array of words storing the bits.

Unlike java.util.bitset, the fact that bits are packed into an array of longs is part of the interface. This allows efficient implementation of other algorithms by someone other than the author. It also allows one to efficiently implement alternate serialization or interchange formats.

OpenBitSet is faster than java.util.BitSet in most operations and *much* faster at calculating cardinality of sets and results of set operations. It can also handle sets of larger cardinality (up to 64 * 2**32-1)

The goals of OpenBitSet are the fastest implementation possible, and maximum code reuse. Extra safety and encapsulation may always be built on top, but if that's built in, the cost can never be removed (and hence people re-implement their own version in order to get better performance). If you want a "safe", totally encapsulated (and slower and limited) BitSet class, use java.util.BitSet.

Performance Results

Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
cardinality intersect_count union nextSetBit get iterator
50% full 3.36 3.96 1.44 1.46 1.99 1.58
1% full 3.31 3.90   1.04   0.99

Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
cardinality intersect_count union nextSetBit get iterator
50% full 2.50 3.50 1.00 1.03 1.12 1.25
1% full 2.51 3.49   1.00   1.02

Summary

Fields
protected long[] bits
protected int wlen
[Expand]
Inherited Fields
From class org.apache.lucene.search.DocIdSet
Public Constructors
OpenBitSet(long numBits)
Constructs an OpenBitSet large enough to hold numBits.
OpenBitSet()
OpenBitSet(long[] bits, int numWords)
Constructs an OpenBitSet from an existing long[].
Public Methods
void and(OpenBitSet other)
void andNot(OpenBitSet other)
static long andNotCount(OpenBitSet a, OpenBitSet b)
Returns the popcount or cardinality of "a and not b" or "intersection(a, not(b))".
static int bits2words(long numBits)
returns the number of 64 bit words it would take to hold numBits
long capacity()
Returns the current capacity in bits (1 greater than the index of the last bit)
long cardinality()
void clear(int startIndex, int endIndex)
Clears a range of bits.
void clear(long index)
clears a bit, allowing access beyond the current set size without changing the size.
void clear(long startIndex, long endIndex)
Clears a range of bits.
Object clone()
void ensureCapacity(long numBits)
Ensure that the long[] is big enough to hold numBits, expanding it if necessary.
void ensureCapacityWords(int numWords)
Expand the long[] with the size given as a number of words (64 bit longs).
boolean equals(Object o)
returns true if both sets have the same bits set
void fastClear(long index)
clears a bit.
void fastClear(int index)
clears a bit.
void fastFlip(int index)
flips a bit.
void fastFlip(long index)
flips a bit.
boolean fastGet(int index)
Returns true or false for the specified bit index.
boolean fastGet(long index)
Returns true or false for the specified bit index.
void fastSet(long index)
Sets the bit at the specified index.
void fastSet(int index)
Sets the bit at the specified index.
void flip(long index)
flips a bit, expanding the set size if necessary
void flip(long startIndex, long endIndex)
Flips a range of bits, expanding the set size if necessary
boolean flipAndGet(int index)
flips a bit and returns the resulting bit value.
boolean flipAndGet(long index)
flips a bit and returns the resulting bit value.
boolean get(long index)
Returns true or false for the specified bit index
boolean get(int index)
Returns true or false for the specified bit index.
boolean getAndSet(int index)
Sets a bit and returns the previous value.
boolean getAndSet(long index)
Sets a bit and returns the previous value.
int getBit(int index)
returns 1 if the bit is set, 0 if not.
long[] getBits()
Expert: returns the long[] storing the bits
int getNumWords()
Expert: gets the number of longs in the array that are in use
int hashCode()
void intersect(OpenBitSet other)
this = this AND other
static long intersectionCount(OpenBitSet a, OpenBitSet b)
Returns the popcount or cardinality of the intersection of the two sets.
boolean intersects(OpenBitSet other)
returns true if the sets have any elements in common
boolean isCacheable()
This DocIdSet implementation is cacheable.
boolean isEmpty()
Returns true if there are no set bits
DocIdSetIterator iterator()
Provides a DocIdSetIterator to access the set.
int nextSetBit(int index)
Returns the index of the first set bit starting at the index specified.
long nextSetBit(long index)
Returns the index of the first set bit starting at the index specified.
void or(OpenBitSet other)
void remove(OpenBitSet other)
Remove all elements set in other.
void set(long startIndex, long endIndex)
Sets a range of bits, expanding the set size if necessary
void set(long index)
sets a bit, expanding the set size if necessary
void setBits(long[] bits)
Expert: sets a new long[] to use as the bit storage
void setNumWords(int nWords)
Expert: sets the number of longs in the array that are in use
long size()
Returns the current capacity of this set.
void trimTrailingZeros()
Lowers numWords, the number of words in use, by checking for trailing zero words.
void union(OpenBitSet other)
this = this OR other
static long unionCount(OpenBitSet a, OpenBitSet b)
Returns the popcount or cardinality of the union of the two sets.
void xor(OpenBitSet other)
this = this XOR other
static long xorCount(OpenBitSet a, OpenBitSet b)
Returns the popcount or cardinality of the exclusive-or of the two sets.
Protected Methods
int expandingWordNum(long index)
[Expand]
Inherited Methods
From class org.apache.lucene.search.DocIdSet
From class java.lang.Object

Fields

protected long[] bits

protected int wlen

Public Constructors

public OpenBitSet (long numBits)

Constructs an OpenBitSet large enough to hold numBits.

public OpenBitSet ()

public OpenBitSet (long[] bits, int numWords)

Constructs an OpenBitSet from an existing long[].
The first 64 bits are in long[0], with bit index 0 at the least significant bit, and bit index 63 at the most significant. Given a bit index, the word containing it is long[index/64], and it is at bit number index%64 within that word.

numWords are the number of elements in the array that contain set bits (non-zero longs). numWords should be <= bits.length, and any existing words in the array at position >= numWords should be zero.

Public Methods

public void and (OpenBitSet other)

public void andNot (OpenBitSet other)

public static long andNotCount (OpenBitSet a, OpenBitSet b)

Returns the popcount or cardinality of "a and not b" or "intersection(a, not(b))". Neither set is modified.

public static int bits2words (long numBits)

returns the number of 64 bit words it would take to hold numBits

public long capacity ()

Returns the current capacity in bits (1 greater than the index of the last bit)

public long cardinality ()

Returns
  • the number of set bits

public void clear (int startIndex, int endIndex)

Clears a range of bits. Clearing past the end does not change the size of the set.

Parameters
startIndex lower index
endIndex one-past the last bit to clear

public void clear (long index)

clears a bit, allowing access beyond the current set size without changing the size.

public void clear (long startIndex, long endIndex)

Clears a range of bits. Clearing past the end does not change the size of the set.

Parameters
startIndex lower index
endIndex one-past the last bit to clear

public Object clone ()

public void ensureCapacity (long numBits)

Ensure that the long[] is big enough to hold numBits, expanding it if necessary. getNumWords() is unchanged by this call.

public void ensureCapacityWords (int numWords)

Expand the long[] with the size given as a number of words (64 bit longs). getNumWords() is unchanged by this call.

public boolean equals (Object o)

returns true if both sets have the same bits set

public void fastClear (long index)

clears a bit. The index should be less than the OpenBitSet size.

public void fastClear (int index)

clears a bit. The index should be less than the OpenBitSet size.

public void fastFlip (int index)

flips a bit. The index should be less than the OpenBitSet size.

public void fastFlip (long index)

flips a bit. The index should be less than the OpenBitSet size.

public boolean fastGet (int index)

Returns true or false for the specified bit index. The index should be less than the OpenBitSet size

public boolean fastGet (long index)

Returns true or false for the specified bit index. The index should be less than the OpenBitSet size.

public void fastSet (long index)

Sets the bit at the specified index. The index should be less than the OpenBitSet size.

public void fastSet (int index)

Sets the bit at the specified index. The index should be less than the OpenBitSet size.

public void flip (long index)

flips a bit, expanding the set size if necessary

public void flip (long startIndex, long endIndex)

Flips a range of bits, expanding the set size if necessary

Parameters
startIndex lower index
endIndex one-past the last bit to flip

public boolean flipAndGet (int index)

flips a bit and returns the resulting bit value. The index should be less than the OpenBitSet size.

public boolean flipAndGet (long index)

flips a bit and returns the resulting bit value. The index should be less than the OpenBitSet size.

public boolean get (long index)

Returns true or false for the specified bit index

public boolean get (int index)

Returns true or false for the specified bit index.

public boolean getAndSet (int index)

Sets a bit and returns the previous value. The index should be less than the OpenBitSet size.

public boolean getAndSet (long index)

Sets a bit and returns the previous value. The index should be less than the OpenBitSet size.

public int getBit (int index)

returns 1 if the bit is set, 0 if not. The index should be less than the OpenBitSet size

public long[] getBits ()

Expert: returns the long[] storing the bits

public int getNumWords ()

Expert: gets the number of longs in the array that are in use

public int hashCode ()

public void intersect (OpenBitSet other)

this = this AND other

public static long intersectionCount (OpenBitSet a, OpenBitSet b)

Returns the popcount or cardinality of the intersection of the two sets. Neither set is modified.

public boolean intersects (OpenBitSet other)

returns true if the sets have any elements in common

public boolean isCacheable ()

This DocIdSet implementation is cacheable.

public boolean isEmpty ()

Returns true if there are no set bits

public DocIdSetIterator iterator ()

Provides a DocIdSetIterator to access the set. This implementation can return null or EMPTY_DOCIDSET.iterator() if there are no docs that match.

public int nextSetBit (int index)

Returns the index of the first set bit starting at the index specified. -1 is returned if there are no more set bits.

public long nextSetBit (long index)

Returns the index of the first set bit starting at the index specified. -1 is returned if there are no more set bits.

public void or (OpenBitSet other)

public void remove (OpenBitSet other)

Remove all elements set in other. this = this AND_NOT other

public void set (long startIndex, long endIndex)

Sets a range of bits, expanding the set size if necessary

Parameters
startIndex lower index
endIndex one-past the last bit to set

public void set (long index)

sets a bit, expanding the set size if necessary

public void setBits (long[] bits)

Expert: sets a new long[] to use as the bit storage

public void setNumWords (int nWords)

Expert: sets the number of longs in the array that are in use

public long size ()

Returns the current capacity of this set. Included for compatibility. This is *not* equal to cardinality()

public void trimTrailingZeros ()

Lowers numWords, the number of words in use, by checking for trailing zero words.

public void union (OpenBitSet other)

this = this OR other

public static long unionCount (OpenBitSet a, OpenBitSet b)

Returns the popcount or cardinality of the union of the two sets. Neither set is modified.

public void xor (OpenBitSet other)

this = this XOR other

public static long xorCount (OpenBitSet a, OpenBitSet b)

Returns the popcount or cardinality of the exclusive-or of the two sets. Neither set is modified.

Protected Methods

protected int expandingWordNum (long index)