The calculus of spans.

A span is a <doc,startPosition,endPosition> tuple.

The following span query operators are implemented:

  • A SpanTermQuery matches all spans containing a particular Term.
  • A SpanNearQuery matches spans which occur near one another, and can be used to implement things like phrase search (when constructed from SpanTermQueries) and inter-phrase proximity (when constructed from other SpanNearQueries).
  • A SpanOrQuery merges spans from a number of other SpanQueries.
  • A SpanNotQuery removes spans matching one SpanQuery which overlap spans matching another. This can be used, e.g., to implement within-paragraph search.
  • A SpanFirstQuery matches spans matching a SpanQuery q whose end position is less than a given position n. This can be used to constrain matches to the first part of the document.

In all cases, output spans are minimally inclusive. In other words, a span formed by combining a span matching x with a span matching y starts at the lesser of the two starts and ends at the greater of the two ends.
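
The minimally inclusive rule and the two filtering operators can be modelled in a few lines of plain Java. This is only an illustrative sketch, not Lucene code: SpanModel, Span, cover, overlaps, not and first are hypothetical names introduced here, and the overlap test assumes that endPosition is exclusive (one past the last matched position), which the text above does not state.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of span arithmetic. Hypothetical helper, not part of Lucene. */
public class SpanModel {

    /** A <doc, startPosition, endPosition> tuple; end is assumed to be exclusive. */
    public record Span(int doc, int start, int end) {}

    /** Minimally inclusive span covering a and b, which must be in the same document. */
    public static Span cover(Span a, Span b) {
        if (a.doc() != b.doc()) {
            throw new IllegalArgumentException("spans belong to different documents");
        }
        // Lesser of the two starts, greater of the two ends.
        return new Span(a.doc(), Math.min(a.start(), b.start()), Math.max(a.end(), b.end()));
    }

    /** True when both spans are in the same document and share at least one position. */
    public static boolean overlaps(Span a, Span b) {
        return a.doc() == b.doc() && a.start() < b.end() && b.start() < a.end();
    }

    /** SpanNotQuery-style filter: keep spans of include that overlap no span of exclude. */
    public static List<Span> not(List<Span> include, List<Span> exclude) {
        List<Span> kept = new ArrayList<>();
        for (Span candidate : include) {
            boolean overlapsExcluded = false;
            for (Span excluded : exclude) {
                if (overlaps(candidate, excluded)) {
                    overlapsExcluded = true;
                    break;
                }
            }
            if (!overlapsExcluded) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    /** SpanFirstQuery-style filter: keep spans whose end position is less than n. */
    public static List<Span> first(List<Span> spans, int n) {
        List<Span> kept = new ArrayList<>();
        for (Span s : spans) {
            if (s.end() < n) {
                kept.add(s);
            }
        }
        return kept;
    }
}
```

For instance, cover(new Span(7, 3, 5), new Span(7, 9, 11)) yields the span <7,3,11>: it starts at the lesser start and ends at the greater end.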

For example, a span query which matches "John Kerry" within ten words of "George Bush" within the first 100 words of the document could be constructed with:

SpanQuery john   = new SpanTermQuery(new Term("content", "john"));
SpanQuery kerry  = new SpanTermQuery(new Term("content", "kerry"));
SpanQuery george = new SpanTermQuery(new Term("content", "george"));
SpanQuery bush   = new SpanTermQuery(new Term("content", "bush"));

SpanQuery johnKerry =
   new SpanNearQuery(new SpanQuery[] {john, kerry}, 0, true);

SpanQuery georgeBush =
   new SpanNearQuery(new SpanQuery[] {george, bush}, 0, true);

SpanQuery johnKerryNearGeorgeBush =
   new SpanNearQuery(new SpanQuery[] {johnKerry, georgeBush}, 10, false);

SpanQuery johnKerryNearGeorgeBushAtStart =
   new SpanFirstQuery(johnKerryNearGeorgeBush, 100);

Span queries may be freely intermixed with other Lucene queries. So, for example, the above query can be restricted to documents which also use the word "iraq" with:

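BooleanQuery query = new BooleanQuery();
// Both clauses are required (second argument) and not prohibited (third argument).
query.add(johnKerryNearGeorgeBushAtStart, true, false);
query.add(new TermQuery(new Term("content", "iraq")), true, false);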



Wrapper to allow SpanQuery objects to participate in composite single-field SpanQueries by 'lying' about their search field.

NearSpansOrdered     A Spans that is formed from the ordered subspans of a SpanNearQuery where the subspans do not overlap and have a maximum slop between them.
NearSpansUnordered   Similar to NearSpansOrdered, but for the unordered case.
SpanFirstQuery       Matches spans near the beginning of a field.
SpanNearQuery        Matches spans which are near one another.
SpanNotQuery         Removes matches which overlap with another SpanQuery.
SpanOrQuery          Matches the union of its clauses.
SpanQuery            Base class for span-based queries.
Spans                Expert: an enumeration of span matches.
SpanScorer           Public for extension only.
SpanTermQuery        Matches spans containing a term.
SpanWeight           Expert-only.
TermSpans            Expert: Public for extension only.