Package com.sun.labs.minion.pipeline

Provides the classes for building a pipeline.


Interface Summary
Stage  
 

Class Summary
AbstractPipelineImpl An abstract implementation of pipeline.
AsyncPipelineImpl A class that encapsulates the machinery of a single indexing pipeline.
BlurbStage This stage removes certain stop-like words from the review portion of a book document.
DropNumbersStage Drops any tokens that parse as integers.
Dropper A class that just drops things, but keeps track of the amount of data processed.
HLPipelineImpl A pipeline that can be used for highlighting documents.
LowerCaseStage  
PipelineFactory A configurable factory class for pipelines.
PrintStage  
PrintTokenStage  
QuestioningStage A Stage that tries to identify the questions within a document, and then sends the tokens that form the questions down to the next stage delimited as a field.
ReplacementStage A stage that replaces some words with others.
StageAdapter An adapter class for the stage interface, for those who don't want to bother with implementing methods that they don't care about.
StatStage  
StemStage  
StopWords A configurable set of stop words.
StopWordsStage This stage provides the ability to remove stop words from the token stream.
SyncPipelineImpl  
Token A class encapsulating all of our knowledge about a given token.
TokenCollectorStage A stage that collects the tokens that come down the pipe into an array that someone can retrieve.
 

Package com.sun.labs.minion.pipeline Description

Provides the classes for building a pipeline. A pipeline consists of Stages that are connected together in order to provide processing of data.

The pipelines that are built using stages are push pipelines. By that we mean that data is pushed from "upstream" stages to "downstream" stages.

Generally speaking, pipelines must be built in reverse order, since upstream stages are passed their downstream stages as arguments to their constructors.
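The reverse construction order can be illustrated with a minimal, self-contained sketch. Every type and method name below (SketchStage, CollectorSketch, DropNumbersSketch, LowerCaseSketch, PushPipelineSketch, token) is hypothetical and chosen for this illustration only; none of it is the actual Minion Stage API, whose signatures are not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative driver; all names in this sketch are hypothetical, not Minion classes.
class PushPipelineSketch {
    static List<String> run(String... input) {
        // Build in reverse: the last stage first, since each upstream
        // stage takes its downstream stage as a constructor argument.
        CollectorSketch collector = new CollectorSketch();
        SketchStage dropNumbers = new DropNumbersSketch(collector);
        SketchStage head = new LowerCaseSketch(dropNumbers);

        // Push data in at the head; it flows downstream from there.
        for (String text : input) {
            head.token(text);
        }
        return collector.tokens;
    }

    public static void main(String[] args) {
        System.out.println(run("Moby", "Dick", "1851")); // prints [moby, dick]
    }
}

// Assumed stage contract for this sketch: a single push method.
interface SketchStage {
    void token(String text);
}

// Terminal stage: collects whatever reaches the end of the pipeline.
class CollectorSketch implements SketchStage {
    final List<String> tokens = new ArrayList<>();

    @Override
    public void token(String text) {
        tokens.add(text);
    }
}

// Drops tokens that parse as integers and pushes everything else downstream.
class DropNumbersSketch implements SketchStage {
    private final SketchStage downstream;

    DropNumbersSketch(SketchStage downstream) {
        this.downstream = downstream;
    }

    @Override
    public void token(String text) {
        try {
            Integer.parseInt(text); // parses as an integer: drop the token
        } catch (NumberFormatException e) {
            downstream.token(text); // not an integer: pass it along
        }
    }
}

// Lower-cases each token before pushing it downstream.
class LowerCaseSketch implements SketchStage {
    private final SketchStage downstream;

    LowerCaseSketch(SketchStage downstream) {
        this.downstream = downstream;
    }

    @Override
    public void token(String text) {
        downstream.token(text.toLowerCase(Locale.ROOT));
    }
}
```

In this sketch the collector is created first even though it runs last, because the stage upstream of it cannot be constructed without a reference to it.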

The data that are passed down a pipeline are defined by the DocumentEvent class. Note, however, that we don't actually create and pass down instances of this class. Rather, we pass along the components of these objects, in order to avoid the object creation overhead.
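As a rough sketch of what passing components instead of event objects might look like, the example below hands each token downstream as a shared buffer plus primitive values, so no per-token object is allocated. The names (ComponentStage, LengthStatsSketch, WhitespaceSplitterSketch, ComponentPassingSketch) and the particular components chosen (buffer, offset, length, position) are assumptions made for illustration; they are not the fields of DocumentEvent or the signatures of the real Stage interface.

```java
// Illustrative driver; all names in this sketch are hypothetical, not Minion classes.
class ComponentPassingSketch {
    static int[] run(String text) {
        LengthStatsSketch stats = new LengthStatsSketch();
        new WhitespaceSplitterSketch(stats).text(text.toCharArray());
        return new int[] { stats.count, stats.totalChars };
    }

    public static void main(String[] args) {
        int[] result = run("push the parts not objects");
        // prints 5 tokens, 22 characters
        System.out.println(result[0] + " tokens, " + result[1] + " characters");
    }
}

// Assumed stage contract: the parts of a would-be event are passed
// directly as arguments, so no event object is created per token.
interface ComponentStage {
    void token(char[] buffer, int offset, int length, int position);
}

// Downstream stage: counts tokens and characters without copying any text.
class LengthStatsSketch implements ComponentStage {
    int count;
    int totalChars;

    @Override
    public void token(char[] buffer, int offset, int length, int position) {
        count++;
        totalChars += length;
    }
}

// Upstream stage: splits a buffer on whitespace and pushes each token
// downstream as (buffer, offset, length, position).
class WhitespaceSplitterSketch {
    private final ComponentStage downstream;

    WhitespaceSplitterSketch(ComponentStage downstream) {
        this.downstream = downstream;
    }

    void text(char[] buffer) {
        int position = 0;
        int start = -1; // start index of the current token, or -1 if none
        for (int i = 0; i <= buffer.length; i++) {
            boolean boundary = i == buffer.length || Character.isWhitespace(buffer[i]);
            if (!boundary && start < 0) {
                start = i;
            } else if (boundary && start >= 0) {
                downstream.token(buffer, start, i - start, position++);
                start = -1;
            }
        }
    }
}
```

The trade-off in a design like this is a wider method signature in exchange for avoiding one allocation per token, which can matter when a pipeline processes very large numbers of tokens.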