Class Overview
Loader for text files that represent a list of stopwords.
Summary
Public Methods
static HashMap<String, String> | getStemDict(File wordstemfile) | Reads a stem dictionary.
static HashSet<String> | getWordSet(Reader reader, String comment) | Reads lines from a Reader and adds every non-comment line as an entry to a HashSet (omitting leading and trailing whitespace).
static HashSet<String> | getWordSet(Reader reader) | Reads lines from a Reader and adds every line as an entry to a HashSet (omitting leading and trailing whitespace).
static HashSet<String> | getWordSet(File wordfile) | Loads a text file and adds every line as an entry to a HashSet (omitting leading and trailing whitespace).
static HashSet<String> | getWordSet(File wordfile, String comment) | Loads a text file and adds every non-comment line as an entry to a HashSet (omitting leading and trailing whitespace).
Inherited Methods
From class java.lang.Object
Object | clone()
boolean | equals(Object arg0)
void | finalize()
final Class<?> | getClass()
int | hashCode()
final void | notify()
final void | notifyAll()
String | toString()
final void | wait()
final void | wait(long arg0, int arg1)
final void | wait(long arg0)
Public Constructors
Public Methods
public
static
HashMap<String, String>
getStemDict
(File wordstemfile)
Reads a stem dictionary. Each line contains:
word\tstem
(i.e. two tab-separated words)
Returns
- stem dictionary that overrules the stemming algorithm
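The word\tstem line format described above can be illustrated with a minimal sketch. This is not the library's implementation: the class StemDictSketch and the method parseStemDict are hypothetical names, the sketch reads from a Reader rather than a File to stay self-contained, and skipping lines without a tab is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;

class StemDictSketch {
    // Hypothetical helper: parses "word<TAB>stem" lines into a map of word -> stem.
    static HashMap<String, String> parseStemDict(Reader reader) {
        HashMap<String, String> result = new HashMap<>();
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                // Split on the first tab only: left side is the word, right side the stem.
                String[] parts = line.split("\t", 2);
                if (parts.length == 2) { // assumption: lines without a tab are skipped
                    result.put(parts[0], parts[1]);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        HashMap<String, String> dict =
            parseStemDict(new StringReader("running\trun\nmice\tmouse\n"));
        System.out.println(dict.get("mice")); // prints "mouse"
    }
}
```

A lookup in such a map would typically be consulted before the stemming algorithm runs, which matches the "overrules the stemming algorithm" description of the return value.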
public
static
HashSet<String>
getWordSet
(Reader reader, String comment)
Reads lines from a Reader and adds every non-comment line as an entry to a HashSet (omitting
leading and trailing whitespace). Every line of the Reader should contain only
one word. The words must be lowercase if you use an
Analyzer that applies LowerCaseFilter (such as StandardAnalyzer).
Parameters
reader | Reader containing the wordlist
comment | The string that marks a comment line
Returns
- A HashSet with the reader's words
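The behavior described above (skip comment lines, trim the rest, collect into a set) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: WordSetSketch and readWords are hypothetical names, and treating a line as a comment when it starts with the comment string is an assumption about how comments are recognized.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashSet;

class WordSetSketch {
    // Hypothetical helper: collects every non-comment line, trimmed, into a HashSet.
    static HashSet<String> readWords(Reader reader, String comment) {
        HashSet<String> result = new HashSet<>();
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                // Assumption: a comment line is one that starts with the comment string.
                if (comment != null && line.startsWith(comment)) {
                    continue;
                }
                result.add(line.trim()); // drop leading and trailing whitespace
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        HashSet<String> words =
            readWords(new StringReader("# stopwords\n  the \nand\n"), "#");
        System.out.println(words.contains("the")); // prints "true"
    }
}
```

Under this reading, an empty line would end up as an empty-string entry, so a word list is best kept free of blank lines.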
public
static
HashSet<String>
getWordSet
(Reader reader)
Reads lines from a Reader and adds every line as an entry to a HashSet (omitting
leading and trailing whitespace). Every line of the Reader should contain only
one word. The words must be lowercase if you use an
Analyzer that applies LowerCaseFilter (such as StandardAnalyzer).
Parameters
reader | Reader containing the wordlist
Returns
- A HashSet with the reader's words
public
static
HashSet<String>
getWordSet
(File wordfile)
Loads a text file and adds every line as an entry to a HashSet (omitting
leading and trailing whitespace). Every line of the file should contain only
one word. The words must be lowercase if you use an
Analyzer that applies LowerCaseFilter (such as StandardAnalyzer).
Parameters
wordfile | File containing the wordlist
Returns
- A HashSet with the file's words
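Because the words must already be lowercase when the analyzer lowercases its tokens, a word file of uncertain casing may need normalizing first. The sketch below shows one way to do that with plain JDK calls. LowercaseWordFileSketch and loadLowercased are hypothetical names, and reading the file as UTF-8 is an assumption; none of this is part of the documented class.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.HashSet;
import java.util.Locale;

class LowercaseWordFileSketch {
    // Hypothetical helper: loads one word per line, trimmed and lowercased.
    static HashSet<String> loadLowercased(File wordfile) {
        HashSet<String> result = new HashSet<>();
        try {
            // Assumption: the word file is encoded as UTF-8.
            for (String line : Files.readAllLines(wordfile.toPath(), StandardCharsets.UTF_8)) {
                result.add(line.trim().toLowerCase(Locale.ROOT));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary word file with mixed casing and stray whitespace.
        File tmp = File.createTempFile("stopwords", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "The\n AND \nof\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(loadLowercased(tmp).contains("and")); // prints "true"
    }
}
```

Locale.ROOT is used so that lowercasing does not depend on the default locale of the machine running the code.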
public
static
HashSet<String>
getWordSet
(File wordfile, String comment)
Loads a text file and adds every non-comment line as an entry to a HashSet (omitting
leading and trailing whitespace). Every line of the file should contain only
one word. The words must be lowercase if you use an
Analyzer that applies LowerCaseFilter (such as StandardAnalyzer).
Parameters
wordfile | File containing the wordlist
comment | The string that marks a comment line; such lines are ignored
Returns
- A HashSet with the file's words