java.lang.Object
  ↳ org.apache.lucene.util.AttributeSource
    ↳ org.apache.lucene.analysis.TokenStream
      ↳ org.apache.lucene.analysis.Tokenizer

A Tokenizer is a TokenStream whose input is a Reader.
This is an abstract class; subclasses must override incrementToken().
NOTE: Subclasses overriding incrementToken() must call clearAttributes() before setting attributes.
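
The ordering rule in the note above can be illustrated with a dependency-free sketch. The classes below are hypothetical stand-ins, not Lucene classes: `SketchAttributes` models the per-token attribute state, and `WhitespaceSketchTokenizer` models a subclass whose `incrementToken()` clears that state before writing to it. Splitting on whitespace is an assumption made only to keep the example short.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the attribute state held by a token stream.
class SketchAttributes {
    final StringBuilder term = new StringBuilder();

    void clear() {
        term.setLength(0);
    }
}

// Hypothetical stand-in for a Tokenizer subclass that splits on whitespace.
class WhitespaceSketchTokenizer {
    private final Reader input;
    final SketchAttributes atts = new SketchAttributes();

    WhitespaceSketchTokenizer(Reader input) {
        this.input = input;
    }

    // Models clearAttributes(): wipe whatever the previous token left behind.
    void clearAttributes() {
        atts.clear();
    }

    // Models incrementToken(): returns false once the input is exhausted.
    boolean incrementToken() throws IOException {
        clearAttributes(); // must happen before any attribute is set
        int c;
        while ((c = input.read()) != -1) {
            if (Character.isWhitespace(c)) {
                if (atts.term.length() > 0) {
                    return true; // end of the current token
                }
            } else {
                atts.term.append((char) c);
            }
        }
        return atts.term.length() > 0; // last token, if any
    }

    // Convenience helper for the sketch: collect all tokens of a string.
    static List<String> tokenize(String text) {
        WhitespaceSketchTokenizer t = new WhitespaceSketchTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            while (t.incrementToken()) {
                out.add(t.atts.term.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("hello  big world"));
    }
}
```

In this sketch, dropping the `clearAttributes()` call would let the term buffer accumulate across calls, so the second token would come out as `hellobig` rather than `big`.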

| Fields | |
|---|---|
| input | The text source for this Tokenizer. |

| Protected Constructors |
|---|
| Construct a tokenizer with null input. |
| Construct a token stream processing the given input. |
| Construct a tokenizer with null input using the given AttributeFactory. |
| Construct a token stream processing the given input using the given AttributeFactory. |
| Construct a token stream processing the given input using the given AttributeSource. |
| Construct a token stream processing the given input using the given AttributeSource. |

| Public Methods |
|---|
| By default, closes the input Reader. |
| Expert: Reset the tokenizer to a new reader. |

| Protected Methods |
|---|
| Return the corrected offset. |

| Inherited Methods |
|---|
| From class org.apache.lucene.analysis.TokenStream |
| From class org.apache.lucene.util.AttributeSource |
| From class java.lang.Object |
| From interface java.io.Closeable |

Expert: Reset the tokenizer to a new reader. Typically, an analyzer (in its reusableTokenStream method) will use this to re-use a previously created tokenizer.

| Throws |
|---|
| IOException |
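
The reuse pattern described for resetting a tokenizer to a new reader might look roughly like the following. This is a self-contained model, not Lucene code: `ReusableSketchTokenizer` and `ReuseSketchAnalyzer` are hypothetical names, the `next()` helper and the `tokenizersCreated` counter exist only for the demonstration, and the sketch assumes that resetting amounts to swapping the `input` reader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a Tokenizer that can be pointed at a new Reader.
class ReusableSketchTokenizer {
    private Reader input;

    ReusableSketchTokenizer(Reader input) {
        this.input = input;
    }

    // Models reset(Reader): keep the instance, replace the text source.
    void reset(Reader input) {
        this.input = input;
    }

    // Returns the next whitespace-delimited token, or null at end of input.
    String next() {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = input.read()) != -1) {
                if (!Character.isWhitespace(c)) {
                    sb.append((char) c);
                } else if (sb.length() > 0) {
                    break;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.length() > 0 ? sb.toString() : null;
    }
}

// Hypothetical stand-in for an analyzer that caches one tokenizer.
class ReuseSketchAnalyzer {
    private ReusableSketchTokenizer saved;
    int tokenizersCreated = 0; // demonstration-only counter

    // Models the reuse path: create once, then reset for each later reader.
    ReusableSketchTokenizer reusableTokenStream(Reader reader) {
        if (saved == null) {
            saved = new ReusableSketchTokenizer(reader);
            tokenizersCreated++;
        } else {
            saved.reset(reader);
        }
        return saved;
    }

    int countTokens(String text) {
        ReusableSketchTokenizer t = reusableTokenStream(new StringReader(text));
        int n = 0;
        while (t.next() != null) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        ReuseSketchAnalyzer analyzer = new ReuseSketchAnalyzer();
        System.out.println(analyzer.countTokens("one two"));
        System.out.println(analyzer.countTokens("three four five"));
        System.out.println("tokenizers created: " + analyzer.tokenizersCreated);
    }
}
```

The point of the design is that the second and later documents pay only for a field assignment rather than for constructing a new tokenizer and its attribute state.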
Return the corrected offset. If input is a CharStream subclass, this method calls the stream's correctOffset(int); otherwise it returns currentOff unchanged.

| Parameters | |
|---|---|
| currentOff | offset as seen in the output |
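
The delegation described for the corrected offset can be sketched without Lucene on the classpath. Everything here is a hypothetical stand-in: `SketchCharStream` plays the role of CharStream, `PrefixStrippedStream` is an invented reader that hides a fixed-length prefix from the tokenizer, and `OffsetSketchTokenizer` models only the offset logic.

```java
import java.io.FilterReader;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for CharStream: a Reader that can map an offset in
// its filtered output back to an offset in the original text.
abstract class SketchCharStream extends FilterReader {
    SketchCharStream(Reader in) {
        super(in);
    }

    abstract int correctOffset(int currentOff);
}

// Invented example stream: the tokenizer sees the text without its first
// strippedLength characters, so every offset is shifted by that amount.
class PrefixStrippedStream extends SketchCharStream {
    private final int strippedLength;

    PrefixStrippedStream(String original, int strippedLength) {
        super(new StringReader(original.substring(strippedLength)));
        this.strippedLength = strippedLength;
    }

    @Override
    int correctOffset(int currentOff) {
        return currentOff + strippedLength;
    }
}

// Hypothetical stand-in for the offset handling of a Tokenizer.
class OffsetSketchTokenizer {
    private final Reader input;

    OffsetSketchTokenizer(Reader input) {
        this.input = input;
    }

    // Delegate when the input knows how to correct offsets, else pass through.
    int correctOffset(int currentOff) {
        return (input instanceof SketchCharStream)
                ? ((SketchCharStream) input).correctOffset(currentOff)
                : currentOff;
    }

    public static void main(String[] args) {
        OffsetSketchTokenizer plain = new OffsetSketchTokenizer(new StringReader("abc"));
        OffsetSketchTokenizer stripped =
                new OffsetSketchTokenizer(new PrefixStrippedStream("<b>abc", 3));
        System.out.println(plain.correctOffset(2));
        System.out.println(stripped.correctOffset(2));
    }
}
```

With a plain reader the offset passes through unchanged; with the prefix-stripping stream an offset of 2 in the filtered text maps to 5 in the original, which is the kind of adjustment that keeps highlighting aligned with the unfiltered source.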