public class DefaultSimilarity extends Similarity

java.lang.Object
   ↳ org.apache.lucene.search.Similarity
     ↳ org.apache.lucene.search.DefaultSimilarity

Class Overview

Expert: Default scoring implementation.

Summary

Inherited Constants
From class org.apache.lucene.search.Similarity
Fields
protected boolean discountOverlaps
Public Constructors
DefaultSimilarity()
Public Methods
float computeNorm(String field, FieldInvertState state)
Implemented as state.getBoost() * lengthNorm(field, numTerms), where numTerms is state.getLength() if discountOverlaps is false (see setDiscountOverlaps(boolean)), and state.getLength() - state.getNumOverlap() otherwise.
float coord(int overlap, int maxOverlap)
Implemented as overlap / maxOverlap.
boolean getDiscountOverlaps()
float idf(int docFreq, int numDocs)
Implemented as log(numDocs/(docFreq+1)) + 1.
float lengthNorm(String fieldName, int numTerms)
Implemented as 1/sqrt(numTerms).
float queryNorm(float sumOfSquaredWeights)
Implemented as 1/sqrt(sumOfSquaredWeights).
void setDiscountOverlaps(boolean v)
Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm.
float sloppyFreq(int distance)
Implemented as 1 / (distance + 1).
float tf(float freq)
Implemented as sqrt(freq).
Inherited Methods
From class org.apache.lucene.search.Similarity
From class java.lang.Object

Fields

protected boolean discountOverlaps

Public Constructors

public DefaultSimilarity ()

Public Methods

public float computeNorm (String field, FieldInvertState state)

Implemented as state.getBoost() * lengthNorm(field, numTerms), where numTerms is state.getLength() if discountOverlaps is false (see setDiscountOverlaps(boolean)), and state.getLength() - state.getNumOverlap() otherwise.

WARNING: This API is new and experimental, and may suddenly change.

Parameters
field field name
state current processing state for this field
Returns
  • the calculated float norm
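
The effect of discounting overlaps on the norm can be illustrated with a minimal standalone sketch. It does not use Lucene itself: FieldState is a hypothetical stand-in for FieldInvertState that holds only the three values the formula reads, and the discountOverlaps flag is passed as a parameter rather than stored in a field.

```java
// Standalone sketch of the computeNorm formula; not the Lucene source.
final class ComputeNormSketch {

    /** Hypothetical stand-in for FieldInvertState (illustration only). */
    static final class FieldState {
        final int length;      // total number of tokens in the field
        final int numOverlap;  // tokens with a position increment of 0
        final float boost;     // boost applied to the field

        FieldState(int length, int numOverlap, float boost) {
            this.length = length;
            this.numOverlap = numOverlap;
            this.boost = boost;
        }
    }

    /** 1/sqrt(numTerms), as described for lengthNorm. */
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    /** boost * lengthNorm(numTerms), optionally excluding overlap tokens. */
    static float computeNorm(FieldState state, boolean discountOverlaps) {
        int numTerms = discountOverlaps
                ? state.length - state.numOverlap
                : state.length;
        return state.boost * lengthNorm(numTerms);
    }
}
```

For a field with 20 tokens, 4 of them overlaps, and a boost of 2.0, the sketch yields about 0.447 without discounting (2/sqrt(20)) and 0.5 with discounting (2/sqrt(16)), so discounting overlaps raises the norm of fields that carry many stacked tokens such as synonyms.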

public float coord (int overlap, int maxOverlap)

Implemented as overlap / maxOverlap, evaluated as a floating-point division.

Parameters
overlap the number of query terms matched in the document
maxOverlap the total number of terms in the query
Returns
  • a score factor based on term overlap with the query
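
A small standalone sketch of this factor follows. It is an illustration of the formula rather than the Lucene source; the class name CoordSketch is hypothetical. The cast matters because both parameters are ints, and integer division would truncate every partial match to 0.

```java
// Standalone sketch of the coord formula; not the Lucene source.
final class CoordSketch {

    /** Fraction of query terms that matched, as a float. */
    static float coord(int overlap, int maxOverlap) {
        // Cast before dividing so that 2 / 4 gives 0.5 rather than 0.
        return overlap / (float) maxOverlap;
    }
}
```

With this sketch, a document matching 2 of 4 query terms gets a factor of 0.5, and one matching all terms gets 1.0.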

public boolean getDiscountOverlaps ()

public float idf (int docFreq, int numDocs)

Implemented as log(numDocs/(docFreq+1)) + 1, where log is the natural logarithm.

Parameters
docFreq the number of documents which contain the term
numDocs the total number of documents in the collection
Returns
  • a score factor based on the term's document frequency
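
The formula can be tried out with the standalone sketch below. It assumes the natural logarithm and a floating-point division; the class name IdfSketch is hypothetical and the code is an illustration, not the Lucene source.

```java
// Standalone sketch of the idf formula; not the Lucene source.
final class IdfSketch {

    /** log(numDocs / (docFreq + 1)) + 1, using the natural logarithm. */
    static float idf(int docFreq, int numDocs) {
        // The +1 in the denominator keeps the division defined when docFreq is 0.
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }
}
```

In this sketch a term found in 9 of 10 documents scores exactly 1.0, because the logarithm's argument is 1, while rarer terms score higher: idf(1, 1000) is larger than idf(100, 1000).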

public float lengthNorm (String fieldName, int numTerms)

Implemented as 1/sqrt(numTerms).

Parameters
fieldName the name of the field
numTerms the total number of tokens contained in fields named fieldName of the document
Returns
  • a normalization factor for hits on this field of this document
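
As a rough illustration of how this factor penalizes long fields, the sketch below evaluates the formula directly. The class name LengthNormSketch is hypothetical; the fieldName parameter is kept to mirror the signature, on the assumption that the formula itself does not depend on it.

```java
// Standalone sketch of the lengthNorm formula; not the Lucene source.
final class LengthNormSketch {

    /** 1/sqrt(numTerms); fieldName is accepted but unused in this sketch. */
    static float lengthNorm(String fieldName, int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }
}
```

A 4-token field gets a factor of 0.5 and a 100-token field about 0.1, so a hit in a short field such as a title weighs more than the same hit in a long body field.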

public float queryNorm (float sumOfSquaredWeights)

Implemented as 1/sqrt(sumOfSquaredWeights).

Parameters
sumOfSquaredWeights the sum of the squares of query term weights
Returns
  • a normalization factor for query weights
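
The following standalone sketch shows what this factor does to a set of query term weights. The normalize helper and the class name QueryNormSketch are hypothetical and exist only for illustration; they are not part of the Similarity API.

```java
// Standalone sketch of the queryNorm formula; not the Lucene source.
final class QueryNormSketch {

    /** 1/sqrt(sumOfSquaredWeights). */
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    /** Hypothetical helper: scales each weight by the query norm. */
    static float[] normalize(float[] weights) {
        float sumOfSquares = 0f;
        for (float w : weights) {
            sumOfSquares += w * w;
        }
        float norm = queryNorm(sumOfSquares);
        float[] scaled = new float[weights.length];
        for (int i = 0; i < weights.length; i++) {
            scaled[i] = weights[i] * norm;
        }
        return scaled;
    }
}
```

For weights 3 and 4 the sum of squares is 25, the norm is 0.2, and the scaled weights are approximately 0.6 and 0.8, whose squares sum to 1. The relative weighting of the terms is unchanged; only the overall scale is.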

public void setDiscountOverlaps (boolean v)

Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm. By default this is false, meaning overlap tokens are counted just like non-overlap tokens.

WARNING: This API is new and experimental, and may suddenly change.

public float sloppyFreq (int distance)

Implemented as 1 / (distance + 1).

Parameters
distance the edit distance of this sloppy phrase match
Returns
  • the frequency increment for this match
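
A minimal standalone sketch of the formula, with the hypothetical class name SloppyFreqSketch, shows how quickly the contribution of a sloppy match falls off with distance. It is an illustration, not the Lucene source.

```java
// Standalone sketch of the sloppyFreq formula; not the Lucene source.
final class SloppyFreqSketch {

    /** 1 / (distance + 1): exact matches count fully, looser ones count less. */
    static float sloppyFreq(int distance) {
        // 1.0f forces float arithmetic; the +1 avoids dividing by zero at distance 0.
        return 1.0f / (distance + 1);
    }
}
```

An exact phrase match (distance 0) contributes 1.0, a match at distance 1 contributes 0.5, and a match at distance 3 contributes 0.25.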

public float tf (float freq)

Implemented as sqrt(freq).

Parameters
freq the frequency of a term within a document
Returns
  • a score factor based on a term's within-document frequency
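
The square root dampens the effect of repeated terms, which the standalone sketch below makes concrete. The class name TfSketch is hypothetical, and the code illustrates the formula rather than reproducing the Lucene source.

```java
// Standalone sketch of the tf formula; not the Lucene source.
final class TfSketch {

    /** sqrt(freq): more occurrences score higher, with diminishing returns. */
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }
}
```

A term that occurs once scores 1.0, four occurrences score 2.0, and nine score 3.0, so quadrupling the frequency only doubles the factor.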