public class ParallelReader extends IndexReader

java.lang.Object
↳	org.apache.lucene.index.IndexReader
	↳	org.apache.lucene.index.ParallelReader

Class Overview

An IndexReader which reads multiple, parallel indexes. Each index added must have the same number of documents, but typically each contains different fields. Each document contains the union of the fields of all documents with the same document number. When searching, matches for a query term are from the first index added that has the field.
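The two rules in this overview (a document is the union of the fields stored under the same document number, and a field is served by the first index added that has it) can be modelled with plain collections. The following is a minimal sketch under stated assumptions, not Lucene code: `ParallelFieldsSketch`, `document` and `indexForField` are hypothetical names, and an "index" is simply a list of field-to-value maps.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: an "index" is a list of documents, and a document is a
// map from field name to value. None of these types are Lucene classes.
class ParallelFieldsSketch {
    private final List<List<Map<String, String>>> indexes = new ArrayList<>();

    // Assumes every index added holds the same number of documents.
    void add(List<Map<String, String>> index) {
        indexes.add(index);
    }

    // A document is the union of the fields stored under the same
    // document number in every index.
    Map<String, String> document(int docNumber) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (List<Map<String, String>> index : indexes) {
            for (Map.Entry<String, String> field : index.get(docNumber).entrySet()) {
                // Simplification: on a name clash, keep the first index's value.
                merged.putIfAbsent(field.getKey(), field.getValue());
            }
        }
        return merged;
    }

    // Returns the position of the first index added that has the field,
    // or -1 if no index has it.
    int indexForField(String field) {
        for (int i = 0; i < indexes.size(); i++) {
            for (Map<String, String> doc : indexes.get(i)) {
                if (doc.containsKey(field)) {
                    return i;
                }
            }
        }
        return -1;
    }
}
```

In this model, adding a large "body" index first and a small "tags" index second yields documents that carry both fields, with lookups on "tags" routed to the second index.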

This is useful, e.g., with collections that have large fields which change rarely and small fields that change more frequently. The smaller fields may be re-indexed in a new index and both indexes may be searched together.

Warning: It is up to you to make sure all indexes are created and modified the same way. For example, if you add documents to one index, you need to add the same documents in the same order to the other indexes. Failure to do so will result in undefined behavior.

Summary

Inherited Fields

From class org.apache.lucene.index.IndexReader

Public Constructors
	ParallelReader() Construct a ParallelReader.
	ParallelReader(boolean closeSubReaders) Construct a ParallelReader.

Public Methods
void	add(IndexReader reader) Add an IndexReader.
void	add(IndexReader reader, boolean ignoreStoredFields) Add an IndexReader whose stored fields will not be returned.
synchronized Object	clone() Efficiently clones the IndexReader (sharing most internal state).
int	docFreq(Term term) Returns the number of documents containing `term`.
Document	document(int n, FieldSelector fieldSelector) Get the `Document` at the `n`th position.
Collection<String>	getFieldNames(IndexReader.FieldOption fieldNames) Get a list of unique field names that exist in this index and have the specified field option information.
void	getTermFreqVector(int docNumber, String field, TermVectorMapper mapper) Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the `TermFreqVector`.
TermFreqVector	getTermFreqVector(int n, String field) Return a term frequency vector for the specified document and field.
void	getTermFreqVector(int docNumber, TermVectorMapper mapper) Map all the term vectors for all fields in a Document.
TermFreqVector[]	getTermFreqVectors(int n) Return an array of term frequency vectors for the specified document.
long	getVersion() Not implemented.
boolean	hasDeletions() Returns true if any documents have been deleted.
boolean	hasNorms(String field) Returns true if there are norms stored for this field.
boolean	isCurrent() Checks recursively if all subreaders are up to date.
boolean	isDeleted(int n) Returns true if document n has been deleted.
boolean	isOptimized() Checks recursively if all subindexes are optimized.
int	maxDoc() Returns one greater than the largest possible document number.
byte[]	norms(String field) Returns the byte-encoded normalization factor for the named field of every document.
void	norms(String field, byte[] result, int offset) Reads the byte-encoded normalization factor for the named field of every document.
int	numDocs() Returns the number of documents in this index.
synchronized IndexReader	reopen() Tries to reopen the subreaders.
TermDocs	termDocs() Returns an unpositioned `TermDocs` enumerator.
TermDocs	termDocs(Term term) Returns an enumeration of all the documents which contain `term`.
TermPositions	termPositions(Term term) Returns an enumeration of all the documents which contain `term`.
TermPositions	termPositions() Returns an unpositioned `TermPositions` enumerator.
TermEnum	terms(Term term) Returns an enumeration of all terms starting at a given term.
TermEnum	terms() Returns an enumeration of all the terms in the index.

Protected Methods
synchronized void	doClose() Implements close.
void	doCommit(Map<String, String> commitUserData) Implements commit.
void	doDelete(int n) Implements deletion of the document numbered `n`.
IndexReader	doReopen(boolean doClone)
void	doSetNorm(int n, String field, byte value) Implements setNorm in subclass.
void	doUndeleteAll() Implements actual undeleteAll() in subclass.

Inherited Methods

From class org.apache.lucene.index.IndexReader

synchronized void	acquireWriteLock() Does nothing by default.
synchronized IndexReader	clone(boolean openReadOnly) Clones the IndexReader and optionally changes readOnly.
synchronized Object	clone() Efficiently clones the IndexReader (sharing most internal state).
synchronized final void	close() Closes files associated with this index.
synchronized final void	commit() Commit changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics).
synchronized final void	commit(Map<String, String> commitUserData) Commit changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics).
synchronized void	decRef() Expert: decreases the refCount of this IndexReader instance.
synchronized void	deleteDocument(int docNum) Deletes the document numbered `docNum`.
int	deleteDocuments(Term term) Deletes all documents that have a given `term` indexed.
Directory	directory() Returns the directory associated with this index.
abstract void	doClose() Implements close.
abstract void	doCommit(Map<String, String> commitUserData) Implements commit.
abstract void	doDelete(int docNum) Implements deletion of the document numbered `docNum`.
abstract void	doSetNorm(int doc, String field, byte value) Implements setNorm in subclass.
abstract void	doUndeleteAll() Implements actual undeleteAll() in subclass.
abstract int	docFreq(Term t) Returns the number of documents containing the term `t`.
abstract Document	document(int n, FieldSelector fieldSelector) Get the `Document` at the `n`th position.
Document	document(int n) Returns the stored fields of the `n`th `Document` in this index.
final void	ensureOpen()
synchronized final void	flush(Map<String, String> commitUserData)
synchronized final void	flush()
Map<String, String>	getCommitUserData() Retrieve the String userData optionally passed to IndexWriter#commit.
static Map<String, String>	getCommitUserData(Directory directory) Reads commitUserData, previously passed to `commit(Map)`, from current index segments file.
static long	getCurrentVersion(Directory directory) Reads version number from segments files.
Object	getDeletesCacheKey() Expert.
Object	getFieldCacheKey() Expert
abstract Collection<String>	getFieldNames(IndexReader.FieldOption fldOption) Get a list of unique field names that exist in this index and have the specified field option information.
IndexCommit	getIndexCommit() Expert: return the IndexCommit that this reader has opened.
synchronized int	getRefCount() Expert: returns the current refCount for this reader
IndexReader[]	getSequentialSubReaders() Expert: returns the sequential sub readers that this reader is logically composed of.
abstract void	getTermFreqVector(int docNumber, String field, TermVectorMapper mapper) Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the `TermFreqVector`.
abstract TermFreqVector	getTermFreqVector(int docNumber, String field) Return a term frequency vector for the specified document and field.
abstract void	getTermFreqVector(int docNumber, TermVectorMapper mapper) Map all the term vectors for all fields in a Document
abstract TermFreqVector[]	getTermFreqVectors(int docNumber) Return an array of term frequency vectors for the specified document.
int	getTermInfosIndexDivisor() For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor as specified when the reader was opened.
long	getUniqueTermCount() Returns the number of unique terms (across all fields) in this reader.
long	getVersion() Version number when this IndexReader was opened.
abstract boolean	hasDeletions() Returns true if any documents have been deleted
boolean	hasNorms(String field) Returns true if there are norms stored for this field.
synchronized void	incRef() Expert: increments the refCount of this IndexReader instance.
static boolean	indexExists(Directory directory) Returns `true` if an index exists at the specified directory.
boolean	isCurrent() Check whether any new changes have occurred to the index since this reader was opened.
abstract boolean	isDeleted(int n) Returns true if document n has been deleted
boolean	isOptimized() Checks if the index is optimized (if it has a single segment and no deletions).
static long	lastModified(Directory directory2) Returns the time the index in the named directory was last modified.
static Collection<IndexCommit>	listCommits(Directory dir) Returns all commit points that exist in the Directory.
static void	main(String[] args) Prints the filename and size of each file within a given compound file.
abstract int	maxDoc() Returns one greater than the largest possible document number.
abstract void	norms(String field, byte[] bytes, int offset) Reads the byte-encoded normalization factor for the named field of every document.
abstract byte[]	norms(String field) Returns the byte-encoded normalization factor for the named field of every document.
int	numDeletedDocs() Returns the number of deleted documents.
abstract int	numDocs() Returns the number of documents in this index.
static IndexReader	open(Directory directory, boolean readOnly) Returns an IndexReader reading the index in the given Directory.
static IndexReader	open(IndexCommit commit, boolean readOnly) Expert: returns an IndexReader reading the index in the given `IndexCommit`.
static IndexReader	open(IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor) Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom `IndexDeletionPolicy`.
static IndexReader	open(Directory directory) Returns an IndexReader reading the index in the given Directory, with readOnly=true.
static IndexReader	open(Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly) Expert: returns an IndexReader reading the index in the given Directory, with a custom `IndexDeletionPolicy`.
static IndexReader	open(IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly) Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom `IndexDeletionPolicy`.
static IndexReader	open(Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor) Expert: returns an IndexReader reading the index in the given Directory, with a custom `IndexDeletionPolicy`.
synchronized IndexReader	reopen(IndexCommit commit) Expert: reopen this reader on a specific commit point.
synchronized IndexReader	reopen() Refreshes an IndexReader if the index has changed since this instance was (re)opened.
synchronized IndexReader	reopen(boolean openReadOnly) Just like `reopen()`, except you can change the readOnly of the original reader.
void	setNorm(int doc, String field, float value) Expert: Resets the normalization factor for the named field of the named document.
synchronized void	setNorm(int doc, String field, byte value) Expert: Resets the normalization factor for the named field of the named document.
abstract TermDocs	termDocs() Returns an unpositioned `TermDocs` enumerator.
TermDocs	termDocs(Term term) Returns an enumeration of all the documents which contain `term`.
TermPositions	termPositions(Term term) Returns an enumeration of all the documents which contain `term`.
abstract TermPositions	termPositions() Returns an unpositioned `TermPositions` enumerator.
abstract TermEnum	terms(Term t) Returns an enumeration of all terms starting at a given term.
abstract TermEnum	terms() Returns an enumeration of all the terms in the index.
synchronized void	undeleteAll() Undeletes all documents currently marked as deleted in this index.

From class java.lang.Object

From interface java.io.Closeable

Public Constructors

public ParallelReader ()

Construct a ParallelReader.

Note that all subreaders are closed if this ParallelReader is closed.

Throws

IOException

public ParallelReader (boolean closeSubReaders)

Construct a ParallelReader.

Parameters

closeSubReaders	indicates whether the subreaders should be closed when this ParallelReader is closed

Throws

IOException

Public Methods

public void add (IndexReader reader)

Add an IndexReader.

Throws

IOException	if there is a low-level IO error

public void add (IndexReader reader, boolean ignoreStoredFields)

Add an IndexReader whose stored fields will not be returned. This can accelerate search when stored fields are only needed from a subset of the IndexReaders.

Throws

IllegalArgumentException	if not all indexes contain the same number of documents
IllegalArgumentException	if not all indexes have the same value of `maxDoc()`
IOException	if there is a low-level IO error
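The two `IllegalArgumentException` conditions and the effect of `ignoreStoredFields` can be pictured as a guard at the top of `add`. This is a minimal sketch of that idea, not the class's actual source: `SubReaderStats` and `AddCheckSketch` are hypothetical stand-ins, and comparing each new reader against the first one added is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a sub-reader; only the two counts matter here.
class SubReaderStats {
    final int maxDoc;
    final int numDocs;

    SubReaderStats(int maxDoc, int numDocs) {
        this.maxDoc = maxDoc;
        this.numDocs = numDocs;
    }
}

class AddCheckSketch {
    private final List<SubReaderStats> readers = new ArrayList<>();
    private final List<SubReaderStats> storedFieldReaders = new ArrayList<>();

    void add(SubReaderStats reader, boolean ignoreStoredFields) {
        if (!readers.isEmpty()) {
            // Assumption: every later reader is compared with the first one.
            SubReaderStats first = readers.get(0);
            if (reader.maxDoc != first.maxDoc) {
                throw new IllegalArgumentException(
                    "All readers must have same maxDoc: " + first.maxDoc + " != " + reader.maxDoc);
            }
            if (reader.numDocs != first.numDocs) {
                throw new IllegalArgumentException(
                    "All readers must have same numDocs: " + first.numDocs + " != " + reader.numDocs);
            }
        }
        readers.add(reader);
        // A reader added with ignoreStoredFields=true still takes part in
        // search, but is never consulted when stored fields are loaded.
        if (!ignoreStoredFields) {
            storedFieldReaders.add(reader);
        }
    }

    int readerCount() {
        return readers.size();
    }

    int storedFieldReaderCount() {
        return storedFieldReaders.size();
    }
}
```

Rejecting a mismatched reader before it is registered keeps the set of readers consistent even after a failed `add`.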

public synchronized Object clone ()

Efficiently clones the IndexReader (sharing most internal state).

On cloning a reader with pending changes (deletions, norms), the original reader transfers its write lock to the cloned reader. This means only the cloned reader may make further changes to the index, and commit the changes to the index on close, but the old reader still reflects all changes made up until it was cloned.

Like reopen(), it's safe to make changes to either the original or the cloned reader: all shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.

public int docFreq (Term term)

Returns the number of documents containing term.

Throws

IOException

public Document document (int n, FieldSelector fieldSelector)

Get the Document at the nth position. The FieldSelector may be used to determine what Fields to load and how they should be loaded. NOTE: If this Reader (more specifically, the underlying FieldsReader) is closed before the lazy Field is loaded, an exception may be thrown. If you want the value of a lazy Field to be available after closing, you must explicitly load it or fetch the Document again with a new loader.

NOTE: for performance reasons, this method does not check if the requested document is deleted, and therefore asking for a deleted document may yield unspecified results. Usually this is not required, however you can call isDeleted(int) with the requested document ID to verify the document is not deleted.

Parameters

n	Get the document at the `n`th position
fieldSelector	The `FieldSelector` to use to determine what Fields should be loaded on the Document. May be null, in which case all Fields will be loaded.

Returns

The stored fields of the Document at the nth position

Throws

CorruptIndexException
IOException

public Collection<String> getFieldNames (IndexReader.FieldOption fieldNames)

Get a list of unique field names that exist in this index and have the specified field option information.

Parameters

fieldNames	specifies which field option should be available for the returned fields

Returns

Collection of Strings indicating the names of the fields.

public void getTermFreqVector (int docNumber, String field, TermVectorMapper mapper)

Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the TermFreqVector.

Parameters

docNumber	The number of the document to load the vector for
field	The name of the field to load
mapper	The `TermVectorMapper` to process the vector. Must not be null

Throws

IOException

public TermFreqVector getTermFreqVector (int n, String field)

Return a term frequency vector for the specified document and field. The returned vector contains terms and frequencies for the terms in the specified field of this document, if the field had the storeTermVector flag set. If termvectors had been stored with positions or offsets, a TermPositionVector is returned.

Parameters

n	document for which the term frequency vector is returned
field	field for which the term frequency vector is returned.

Returns

The term frequency vector. May be null if the field does not exist in the specified document or the term vector was not stored.

Throws

IOException

public void getTermFreqVector (int docNumber, TermVectorMapper mapper)

Map all the term vectors for all fields in a Document.

Parameters

docNumber	The number of the document to load the vector for
mapper	The `TermVectorMapper` to process the vector. Must not be null

Throws

IOException

public TermFreqVector[] getTermFreqVectors (int n)

Return an array of term frequency vectors for the specified document. The array contains a vector for each vectorized field in the document. Each vector contains terms and frequencies for all terms in a given vectorized field. If no such fields existed, the method returns null. The term vectors that are returned may either be of type TermFreqVector or of type TermPositionVector if positions or offsets have been stored.

Parameters

n	document for which term frequency vectors are returned

Returns

array of term frequency vectors. May be null if no term vectors have been stored for the specified document.

Throws

IOException

public long getVersion ()

Not implemented.

Throws

UnsupportedOperationException

public boolean hasDeletions ()

Returns true if any documents have been deleted.

public boolean hasNorms (String field)

Returns true if there are norms stored for this field.

Throws

IOException

public boolean isCurrent ()

Checks recursively if all subreaders are up to date.

Throws

CorruptIndexException
IOException

public boolean isDeleted (int n)

Returns true if document n has been deleted.

public boolean isOptimized ()

Checks recursively if all subindexes are optimized.

Returns

true if the index is optimized; false otherwise

public int maxDoc ()

Returns one greater than the largest possible document number. This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index.
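A small sketch of why per-document arrays are sized by `maxDoc()` rather than `numDocs()`: deleted documents keep their numbers, so the largest document number can exceed the live count. The class `MaxDocSketch` and its helper methods are hypothetical, and a `BitSet` stands in for the reader's deletion state.

```java
import java.util.BitSet;

// Toy model: document numbers run from 0 to maxDoc - 1, and deleted
// documents keep their numbers until the index is rewritten.
class MaxDocSketch {
    // Live documents are all document numbers minus the deleted ones.
    static int numDocs(int maxDoc, BitSet deleted) {
        return maxDoc - deleted.cardinality();
    }

    // One slot for every possible document number, deleted or not.
    static float[] perDocumentArray(int maxDoc) {
        return new float[maxDoc];
    }
}
```

With five document numbers and one deletion, the array still needs five slots so that it can be indexed directly by document number.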

public byte[] norms (String field)

Returns the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents.

Throws

IOException

public void norms (String field, byte[] result, int offset)

Reads the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents.

Throws

IOException

public int numDocs ()

Returns the number of documents in this index.

public synchronized IndexReader reopen ()

Tries to reopen the subreaders.
If one or more subreaders could be re-opened (i.e., subReader.reopen() returned a new instance != subReader), then a new ParallelReader instance is returned; otherwise this instance is returned.

A re-opened instance might share one or more subreaders with the old instance. Index modification operations result in undefined behavior when performed before the old instance is closed (see IndexReader.reopen()).

If subreaders are shared, then the reference count of those readers is increased to ensure that the subreaders remain open until the last referring reader is closed.
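The reference-counting rule for shared subreaders can be illustrated with a counter that only releases the reader when the last holder lets go. This is a simplified model, not the `IndexReader` implementation: `RefCountedReaderSketch` and `isOpen` are hypothetical names.

```java
// Toy model of a reference-counted subreader shared between an old
// ParallelReader and the instance returned by reopen().
class RefCountedReaderSketch {
    private int refCount = 1; // the creator holds the first reference

    // Called when a second reader starts sharing this subreader.
    synchronized void incRef() {
        if (refCount <= 0) {
            throw new IllegalStateException("reader is already closed");
        }
        refCount++;
    }

    // Called when a referring reader is closed; the subreader counts as
    // closed only once no reader refers to it any more.
    synchronized void decRef() {
        if (refCount <= 0) {
            throw new IllegalStateException("reader is already closed");
        }
        refCount--;
    }

    synchronized int getRefCount() {
        return refCount;
    }

    synchronized boolean isOpen() {
        return refCount > 0;
    }
}
```

In this model, closing the old instance drops the count from 2 to 1 and the shared subreader stays usable by the re-opened instance.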

Throws

CorruptIndexException	if the index is corrupt
IOException	if there is a low-level IO error

public TermDocs termDocs ()

Returns an unpositioned TermDocs enumerator.

Throws

IOException

public TermDocs termDocs (Term term)

Returns an enumeration of all the documents which contain term. For each document, the document number and the frequency of the term in that document are provided, for use in search scoring. If term is null, then all non-deleted docs are returned with freq=1. Thus, this method implements the mapping:

Term => <docNum, freq>*

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
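The mapping from a term to a sequence of (document number, frequency) pairs, including the null-term case, can be sketched over an in-memory token array. This is an illustrative model only: `TermDocsSketch` is a hypothetical class, each document is an array of tokens, and the sketch skips deleted documents in both cases.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of Term => <docNum, freq>* over tokenized documents.
class TermDocsSketch {
    // Each returned entry is {docNum, freq}, in increasing docNum order.
    static List<int[]> termDocs(String[][] docs, boolean[] deleted, String term) {
        List<int[]> result = new ArrayList<>();
        for (int docNum = 0; docNum < docs.length; docNum++) {
            if (deleted[docNum]) {
                continue; // deleted documents never appear in the enumeration
            }
            if (term == null) {
                // A null term matches every non-deleted document with freq=1.
                result.add(new int[] {docNum, 1});
                continue;
            }
            int freq = 0;
            for (String token : docs[docNum]) {
                if (token.equals(term)) {
                    freq++;
                }
            }
            if (freq > 0) {
                result.add(new int[] {docNum, freq});
            }
        }
        return result;
    }
}
```

Because the loop visits document numbers in ascending order, each entry's document number is greater than all that precede it, matching the ordering guarantee.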

Throws

IOException

public TermPositions termPositions (Term term)

Returns an enumeration of all the documents which contain term. For each document, in addition to the document number and frequency of the term in that document, a list of all of the ordinal positions of the term in the document is available. Thus, this method implements the mapping:

Term => <docNum, freq, <pos_1, pos_2, ... pos_freq-1>>*

This positional information facilitates phrase and proximity searching.

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
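To show how positions extend the (document number, frequency) pair, the sketch below records the ordinal position of every occurrence of a term and derives the frequency from the number of positions. It is a toy model under stated assumptions: `TermPositionsSketch` and `Posting` are hypothetical, and positions are simply zero-based token offsets.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of Term => <docNum, freq, <positions>>* over tokenized documents.
class TermPositionsSketch {
    static final class Posting {
        final int docNum;
        final List<Integer> positions = new ArrayList<>();

        Posting(int docNum) {
            this.docNum = docNum;
        }

        // The frequency is the number of recorded positions.
        int freq() {
            return positions.size();
        }
    }

    // Returns one posting per matching document, in increasing docNum order.
    static List<Posting> termPositions(String[][] docs, String term) {
        List<Posting> postings = new ArrayList<>();
        for (int docNum = 0; docNum < docs.length; docNum++) {
            Posting posting = new Posting(docNum);
            for (int pos = 0; pos < docs[docNum].length; pos++) {
                if (docs[docNum][pos].equals(term)) {
                    posting.positions.add(pos);
                }
            }
            if (posting.freq() > 0) {
                postings.add(posting);
            }
        }
        return postings;
    }
}
```

A phrase query could then check whether two terms occur at adjacent positions within the same document, which is the kind of proximity test the positional information enables.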

Throws

IOException

public TermPositions termPositions ()

Returns an unpositioned TermPositions enumerator.

Throws

IOException

public TermEnum terms (Term term)

Returns an enumeration of all terms starting at a given term. If the given term does not exist, the enumeration is positioned at the first term greater than the supplied term. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration.
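The seek behaviour (land on the given term if it exists, otherwise on the first greater term) matches a ceiling lookup in a sorted set. The following is a sketch, not the `TermEnum` implementation: `TermSeekSketch` and `SimpleTerm` are hypothetical, and ordering by field name first and term text second is an assumption about how terms compare.

```java
import java.util.TreeSet;

// Toy model of seeking into a sorted term dictionary.
class TermSeekSketch {
    static final class SimpleTerm implements Comparable<SimpleTerm> {
        final String field;
        final String text;

        SimpleTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        // Assumed ordering: by field name first, then by term text.
        @Override
        public int compareTo(SimpleTerm other) {
            int byField = field.compareTo(other.field);
            return byField != 0 ? byField : text.compareTo(other.text);
        }
    }

    // Returns the given term if present, else the first greater term,
    // else null when the dictionary holds nothing at or after the target.
    static SimpleTerm seek(TreeSet<SimpleTerm> terms, SimpleTerm target) {
        return terms.ceiling(target);
    }
}
```

Seeking to a term that sorts after every entry in one field lands on the first term of the next field, so callers that want a single field typically need to check the field of the term they land on.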

Throws

IOException

public TermEnum terms ()

Returns an enumeration of all the terms in the index. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration. Note that after calling terms(), next() must be called on the resulting enumeration before calling other methods such as term().

Throws

IOException

Protected Methods

protected synchronized void doClose ()

Implements close.

Throws

IOException

protected void doCommit (Map<String, String> commitUserData)

Implements commit.

Throws

IOException

protected void doDelete (int n)

Implements deletion of the document numbered n. Applications should call deleteDocument(int) or deleteDocuments(Term).

Throws

CorruptIndexException
IOException

protected IndexReader doReopen (boolean doClone)

Throws

CorruptIndexException
IOException

protected void doSetNorm (int n, String field, byte value)

Implements setNorm in subclass.

Throws

CorruptIndexException
IOException

protected void doUndeleteAll ()

Implements actual undeleteAll() in subclass.

Throws

CorruptIndexException
IOException
