Known Direct Subclasses
ASCIIFoldingFilter |
This class converts alphabetic, numeric, and symbolic Unicode characters
which are not in the first 127 ASCII characters (the "Basic Latin" Unicode
block) into their ASCII equivalents, where such equivalents exist. |
CachingTokenFilter |
This class can be used if the token attributes of a TokenStream
are intended to be consumed more than once. |
ISOLatin1AccentFilter |
This class is deprecated.
If you build a new index, use ASCIIFoldingFilter
which covers a superset of Latin 1.
This class is included for use with existing
indexes and will be removed in a future release (possibly Lucene 4.0). |
LengthFilter |
Removes words that are too long or too short from the stream. |
LowerCaseFilter |
Normalizes token text to lower case. |
PorterStemFilter |
Transforms the token stream as per the Porter stemming algorithm. |
StandardFilter |
Normalizes tokens extracted with StandardTokenizer. |
StopFilter |
Removes stop words from a token stream. |
TeeSinkTokenFilter |
This TokenFilter provides the ability to set aside attribute states
that have already been analyzed. |
Class Overview
A TokenFilter is a TokenStream whose input is another TokenStream.
This is an abstract class; subclasses must override incrementToken().
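The wrapping pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchTokenStream`, `SketchTokenFilter`, `ListStream`, and `UpperCaseSketchFilter` are hypothetical, simplified stand-ins and not the Lucene classes, and a plain `String` field stands in for Lucene's attribute-based token state. The point is the shape: a filter holds another stream as its input and overrides incrementToken() to transform each token it pulls from that input.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified stand-in for a token stream; not the Lucene class.
abstract class SketchTokenStream {
    // Current token text. Lucene keeps this state in attributes instead.
    protected String term;

    // Advances to the next token; returns false once the stream is exhausted.
    abstract boolean incrementToken();

    String term() {
        return term;
    }
}

// Stand-in filter base class: a stream whose input is another stream.
abstract class SketchTokenFilter extends SketchTokenStream {
    // The source of tokens for this filter.
    protected final SketchTokenStream input;

    protected SketchTokenFilter(SketchTokenStream input) {
        this.input = input;
    }
}

// Source stream that emits tokens from a fixed list.
class ListStream extends SketchTokenStream {
    private final Iterator<String> it;

    ListStream(List<String> tokens) {
        this.it = tokens.iterator();
    }

    @Override
    boolean incrementToken() {
        if (!it.hasNext()) {
            return false;
        }
        term = it.next();
        return true;
    }
}

// Concrete filter: the subclass supplies incrementToken().
class UpperCaseSketchFilter extends SketchTokenFilter {
    UpperCaseSketchFilter(SketchTokenStream input) {
        super(input);
    }

    @Override
    boolean incrementToken() {
        if (!input.incrementToken()) {
            return false; // input exhausted, so this filter is exhausted too
        }
        term = input.term().toUpperCase(Locale.ROOT);
        return true;
    }
}
```

Because a filter is itself a stream, filters built this way can be nested, each one consuming the output of the one it wraps.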
Summary
Public Methods |
void close() |
Close the input TokenStream. |
void end() |
Performs end-of-stream operations, if any, and then calls end() on the input TokenStream. |
void reset() |
Reset the filter as well as the input TokenStream. |
Fields
protected final TokenStream input
The source of tokens for this filter.
Protected Constructors
protected TokenFilter(TokenStream input)
Construct a token stream filtering the given input.
Public Methods
public void close()
Close the input TokenStream.
public void end()
Performs end-of-stream operations, if any, and then calls end() on the input TokenStream.
NOTE: Be sure to call super.end() first when overriding this method.
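The ordering that the note asks for might look like the sketch below. It assumes hypothetical stand-in classes (`EndSketchStream`, `EndSketchFilter`, `TrailingStateFilter`), not the Lucene classes, and uses a shared log list purely to make the call order visible.

```java
import java.util.List;

// Hypothetical stand-in for a source stream; not the Lucene class.
class EndSketchStream {
    protected final List<String> log;

    EndSketchStream(List<String> log) {
        this.log = log;
    }

    void end() {
        log.add("source.end");
    }
}

// Stand-in filter base: end() is forwarded to the input stream.
class EndSketchFilter extends EndSketchStream {
    protected final EndSketchStream input;

    EndSketchFilter(EndSketchStream input, List<String> log) {
        super(log);
        this.input = input;
    }

    @Override
    void end() {
        input.end();
    }
}

// A filter with its own end-of-stream work.
class TrailingStateFilter extends EndSketchFilter {
    TrailingStateFilter(EndSketchStream input, List<String> log) {
        super(input, log);
    }

    @Override
    void end() {
        super.end();           // first: let the input chain finish
        log.add("filter.end"); // then: this filter's own end-of-stream work
    }
}
```

Calling super.end() first means the filter's own end-of-stream work runs after the input chain has finished, so it can build on whatever final state the input left behind.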
public void reset()
Reset the filter as well as the input TokenStream.
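To show how reset(), end(), and close() might fit together from a consumer's point of view, here is a minimal sketch. All class names (`LifecycleStream`, `LifecycleFilter`, `RewindableSource`, `LimitSketchFilter`, `LifecycleDemo`) are hypothetical stand-ins rather than Lucene classes, and the token-limit behavior is invented for illustration. It assumes that a filter with state of its own clears that state in reset() in addition to resetting its input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token stream; not the Lucene class.
abstract class LifecycleStream {
    protected String term;

    abstract boolean incrementToken();

    void reset() {}

    void end() {}

    void close() {}

    String term() {
        return term;
    }
}

// Stand-in filter base: forwards reset(), end(), and close() to its input.
abstract class LifecycleFilter extends LifecycleStream {
    protected final LifecycleStream input;

    protected LifecycleFilter(LifecycleStream input) {
        this.input = input;
    }

    @Override
    void reset() {
        input.reset();
    }

    @Override
    void end() {
        input.end();
    }

    @Override
    void close() {
        input.close();
    }
}

// Source over a fixed list that can be rewound by reset().
class RewindableSource extends LifecycleStream {
    private final List<String> tokens;
    private int pos = 0;
    boolean closed = false;

    RewindableSource(List<String> tokens) {
        this.tokens = tokens;
    }

    @Override
    boolean incrementToken() {
        if (pos >= tokens.size()) {
            return false;
        }
        term = tokens.get(pos++);
        return true;
    }

    @Override
    void reset() {
        pos = 0;
    }

    @Override
    void close() {
        closed = true;
    }
}

// Stateful filter: passes through at most `limit` tokens per pass.
class LimitSketchFilter extends LifecycleFilter {
    private final int limit;
    private int seen = 0;

    LimitSketchFilter(LifecycleStream input, int limit) {
        super(input);
        this.limit = limit;
    }

    @Override
    boolean incrementToken() {
        if (seen >= limit || !input.incrementToken()) {
            return false;
        }
        term = input.term();
        seen++;
        return true;
    }

    @Override
    void reset() {
        super.reset(); // resets the input stream
        seen = 0;      // clears this filter's own state
    }
}

// Consumer helper: reset, drain, end. Closing is left to the caller.
class LifecycleDemo {
    static List<String> consume(LifecycleStream s) {
        List<String> out = new ArrayList<>();
        s.reset();
        while (s.incrementToken()) {
            out.add(s.term());
        }
        s.end();
        return out;
    }
}
```

In this sketch, calling `LifecycleDemo.consume` twice on the same filter yields the same tokens both times, because reset() rewinds the source and clears the filter's counter. A single close() on the outermost filter reaches the source through delegation.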