com.sun.labs.minion.pipeline
Class HLPipelineImpl

java.lang.Object
  extended by com.sun.labs.minion.pipeline.AbstractPipelineImpl
      extended by com.sun.labs.minion.pipeline.SyncPipelineImpl
          extended by com.sun.labs.minion.pipeline.HLPipelineImpl
All Implemented Interfaces:
HLPipeline, PassageBuilder, Pipeline, SimpleIndexer

public class HLPipelineImpl
extends SyncPipelineImpl
implements HLPipeline

A pipeline that can be used for highlighting documents.


Field Summary
 
Fields inherited from class com.sun.labs.minion.pipeline.SyncPipelineImpl
inDoc, simpleIndexingFinished
 
Fields inherited from class com.sun.labs.minion.pipeline.AbstractPipelineImpl
currKey, d64, docSize, dumper, engine, head, logTag, pipeline, text
 
Constructor Summary
HLPipelineImpl(PipelineFactory factory, SearchEngine engine, java.util.List<Stage> pipeline)
           
 
Method Summary
 void addPassageField(java.lang.String fieldName)
          Registers a field for which we would like to get passages.
 void addPassageField(java.lang.String fieldName, Passage.Type type, int context, int maxSize, boolean doSort)
          Registers the parameters for a field for which we would like to get passages.
 java.util.Map getPassages(java.util.Map document)
          Gets the highlighted passages that were specified using addPassageField.
protected  java.util.Map getPassages(java.util.Map document, boolean addRemaining, int context, int maxSize, boolean doSort)
          Gets the highlighted passages that were specified using addPassageField.
 java.util.List getPassages(java.util.Map document, int context, int maxSize)
          Gets all of the passages in the document as a list.
 java.util.Map getPassages(java.util.Map document, int context, int maxSize, boolean doSort)
          Gets the highlighted passages that were specified using addPassageField.
 void reset(ArrayGroup ag, int doc, java.lang.String[] qt)
          Sets up for processing a new document.
 
Methods inherited from class com.sun.labs.minion.pipeline.SyncPipelineImpl
addField, addField, addField, addField, addField, addField, addField, addField, addField, addFieldInternal, addTerm, addTerm, addTerm, dump, endDocument, finish, flush, index, indexDocument, indexDocument, inDocCheck, purge, shutdown, startDocument, stateCheck
 
Methods inherited from class com.sun.labs.minion.pipeline.AbstractPipelineImpl
addImpliedField, getEngine, getHead, getIndexer, handleField, handleField, indexDoc, isIndexed, realDump, realPurge, setIndexer
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface com.sun.labs.minion.Pipeline
dump, flush, getHead, index, purge, shutdown
 
Methods inherited from interface com.sun.labs.minion.SimpleIndexer
isIndexed
 

Constructor Detail

HLPipelineImpl

public HLPipelineImpl(PipelineFactory factory,
                      SearchEngine engine,
                      java.util.List<Stage> pipeline)
Method Detail

reset

public void reset(ArrayGroup ag,
                  int doc,
                  java.lang.String[] qt)
Sets up for processing a new document.

Specified by:
reset in interface HLPipeline
Parameters:
ag - the raw query results
doc - the ID of the document that we want to highlight
qt - the query terms that we'll highlight

addPassageField

public void addPassageField(java.lang.String fieldName)
Description copied from interface: PassageBuilder
Registers a field for which we would like to get passages. Any passages from this field will be joined together into a single spanning passage, the entire field will be retained as context, and there will be no maximum size imposed on the passages.

This is mostly a convenience for highlighting all of the passages in a field that is expected to be small (for example, the subject of an email message).

Specified by:
addPassageField in interface PassageBuilder
Parameters:
fieldName - The name of the field for which to collect passages. If this name is null, the settings apply to anything that is not in one of the fields added using addPassageField. If the name is NonField, the settings apply to passages that do not occur in any field.
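The description above implies that the one-argument form behaves like the five-argument form with joined passages, whole-field context, and no size limit. The following is a minimal sketch of that relationship, not Minion's implementation: PassageFieldRegistrySketch, Settings, settingsFor and the nested Type enum are hypothetical stand-ins, and the doSort value of false is an assumption because the description does not state it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a passage-field registry; not Minion code. */
class PassageFieldRegistrySketch {

    /** Stand-in for Passage.Type, using the two values named in the docs. */
    enum Type { JOIN, UNIQUE }

    /** The per-field parameters described for addPassageField. */
    static final class Settings {
        final Type type;
        final int context;   // in words; -1 means the entire containing field
        final int maxSize;   // in characters; -1 means any size is OK
        final boolean doSort;

        Settings(Type type, int context, int maxSize, boolean doSort) {
            this.type = type;
            this.context = context;
            this.maxSize = maxSize;
            this.doSort = doSort;
        }
    }

    // HashMap permits a null key, which models the "null field name" case.
    private final Map<String, Settings> fields = new HashMap<>();

    /** Convenience form: join hits, keep the whole field, no size limit. */
    void addPassageField(String fieldName) {
        // doSort=false is an assumption; the source does not specify it.
        addPassageField(fieldName, Type.JOIN, -1, -1, false);
    }

    void addPassageField(String fieldName, Type type, int context,
                         int maxSize, boolean doSort) {
        fields.put(fieldName, new Settings(type, context, maxSize, doSort));
    }

    /** Settings for a field, falling back to the null-name entry if present. */
    Settings settingsFor(String fieldName) {
        Settings s = fields.get(fieldName);
        return s != null ? s : fields.get(null);
    }

    public static void main(String[] args) {
        PassageFieldRegistrySketch registry = new PassageFieldRegistrySketch();
        registry.addPassageField("subject");
        registry.addPassageField(null, Type.UNIQUE, 5, 200, true);
        Settings s = registry.settingsFor("body");
        System.out.println(s.type + " " + s.context + " " + s.maxSize);
    }
}
```

Registering an entry under the null name gives every unregistered field a shared fallback, which is how the sketch interprets the null-name rule above.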

addPassageField

public void addPassageField(java.lang.String fieldName,
                            Passage.Type type,
                            int context,
                            int maxSize,
                            boolean doSort)
Description copied from interface: PassageBuilder
Registers the parameters for a field for which we would like to get passages.

Specified by:
addPassageField in interface PassageBuilder
Parameters:
fieldName - The name of the field that we want to collect passages for. If this name is null, the other parameters specify the data for anything that is not in one of the fields added using addPassageField. If the name is NonField, then the other parameters specify the data for passages that do not occur in any field.
type - The type of passage to build. If this is JOIN, then all hits within the named field will be joined into a single passage. If this is UNIQUE, then each hit will be a separate passage.
context - The size of the surrounding context to put in the passage, in words. -1 means take the entire containing field.
maxSize - The maximum size of passage to return, in characters. -1 means any size is OK.
doSort - If true, then the passages for this field will be sorted by score before being returned.
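To illustrate how the type and context parameters interact, here is a small self-contained sketch. It is not Minion's passage-building code: PassageTypeSketch, its Type enum and the build method are hypothetical, passages are modeled as inclusive word ranges, hit positions are assumed to be sorted in ascending order, and maxSize is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in only; not Minion's Passage.Type or passage builder. */
class PassageTypeSketch {

    enum Type { JOIN, UNIQUE }

    /**
     * Builds passages as inclusive [start, end] word ranges around hits.
     * JOIN yields one range spanning all hits; UNIQUE yields one per hit.
     * A context of -1 keeps the whole field.
     */
    static List<int[]> build(int fieldLength, int[] hits, Type type, int context) {
        List<int[]> passages = new ArrayList<>();
        if (hits.length == 0) {
            return passages;
        }
        if (type == Type.JOIN) {
            // hits are assumed sorted, so first and last bound the span
            passages.add(window(fieldLength, hits[0], hits[hits.length - 1], context));
        } else {
            for (int hit : hits) {
                passages.add(window(fieldLength, hit, hit, context));
            }
        }
        return passages;
    }

    /** Widens [first, last] by context words, clamped to the field bounds. */
    private static int[] window(int fieldLength, int first, int last, int context) {
        if (context < 0) {
            return new int[] {0, fieldLength - 1};
        }
        int start = Math.max(0, first - context);
        int end = Math.min(fieldLength - 1, last + context);
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        for (int[] p : build(20, new int[] {3, 10}, Type.UNIQUE, 2)) {
            System.out.println(p[0] + ".." + p[1]);
        }
    }
}
```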

getPassages

public java.util.Map getPassages(java.util.Map document)
Gets the highlighted passages that were specified using addPassageField.

Specified by:
getPassages in interface PassageBuilder
Parameters:
document - A map representing a list of field names and values.
Returns:
A Map that maps from field names to a List of instances of Passage that are associated with the field. The key null maps to passages that did not occur in any field.
See Also:
SearchEngine.index(java.lang.String, java.util.Map), addPassageField(java.lang.String)
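Because the result can contain a null key, callers iterating over it need to handle that key explicitly. The sketch below shows one way to do so under stated assumptions: plain strings stand in for Passage instances, the map is typed rather than raw, and PassageMapSketch, describe and NO_FIELD_LABEL are hypothetical names that are not part of Minion.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: strings stand in for Passage instances. */
class PassageMapSketch {

    /** Hypothetical display label for passages stored under the null key. */
    static final String NO_FIELD_LABEL = "(no field)";

    /** Flattens a field-to-passages map into "field: passage" lines. */
    static List<String> describe(Map<String, List<String>> passagesByField) {
        List<String> lines = new ArrayList<>();
        if (passagesByField == null) {
            return lines; // treat a missing result as "no passages"
        }
        for (Map.Entry<String, List<String>> entry : passagesByField.entrySet()) {
            // The null key holds passages that did not occur in any field.
            String field = entry.getKey() == null ? NO_FIELD_LABEL : entry.getKey();
            for (String passage : entry.getValue()) {
                lines.add(field + ": " + passage);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, List<String>> result = new HashMap<>();
        result.put("subject", Collections.singletonList("release notes"));
        result.put(null, Collections.singletonList("text outside any field"));
        for (String line : describe(result)) {
            System.out.println(line);
        }
    }
}
```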

getPassages

public java.util.Map getPassages(java.util.Map document,
                                 int context,
                                 int maxSize,
                                 boolean doSort)
Gets the highlighted passages that were specified using addPassageField.

Specified by:
getPassages in interface PassageBuilder
Parameters:
document - A map representing a list of field names and values.
context - the amount of context that will be used around passages in fields that were not explicitly added.
maxSize - the maximum size of passage to return, in characters, for fields that were not explicitly added. -1 means any size is OK.
doSort - If true, passages from any fields not explicitly added will be sorted by score before being returned.
Returns:
A Map that maps from field names to a List of instances of Passage that are associated with the field. The key null maps to passages that did not occur in any field and to passages from fields that were not explicitly added.
See Also:
SearchEngine.index(java.lang.String, java.util.Map), addPassageField(java.lang.String)

getPassages

protected java.util.Map getPassages(java.util.Map document,
                                    boolean addRemaining,
                                    int context,
                                    int maxSize,
                                    boolean doSort)
Gets the highlighted passages that were specified using addPassageField.

Parameters:
document - A map representing a list of field names and values.
addRemaining - If true, passages from any fields that were not explicitly added will be added under the NonField key.
context - If addRemaining is true, this is the amount of context that will be used around passages in fields that were not explicitly added.
maxSize - The maximum size of passage to return, in characters. -1 means any size is OK.
doSort - If true, passages from remaining fields will be sorted by score before being returned.
Returns:
null if there are no passages associated with this hit or if we could not parse the document, or a Map that maps from field names to a List of the passages associated with that field, sorted by increasing penalty score.
See Also:
SearchEngine.index(java.lang.String, java.util.Map), addPassageField(java.lang.String)
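Since this variant may return null and otherwise returns lists ordered by increasing penalty score, a caller or subclass may want to guard against null and understand that ordering. The following sketch illustrates both points with hypothetical types: PenaltySortSketch, ScoredPassage and sortedByPenalty are not Minion classes, and the penalty field is an assumed stand-in for whatever score Passage carries.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only; ScoredPassage is a hypothetical stand-in for Passage. */
class PenaltySortSketch {

    static final class ScoredPassage {
        final String text;
        final double penalty; // lower penalty means a better passage

        ScoredPassage(String text, double penalty) {
            this.text = text;
            this.penalty = penalty;
        }
    }

    /** Returns a copy sorted by increasing penalty; null becomes an empty list. */
    static List<ScoredPassage> sortedByPenalty(List<ScoredPassage> passages) {
        if (passages == null) {
            return new ArrayList<>();
        }
        List<ScoredPassage> copy = new ArrayList<>(passages);
        copy.sort(Comparator.comparingDouble(p -> p.penalty));
        return copy;
    }

    public static void main(String[] args) {
        List<ScoredPassage> passages = new ArrayList<>();
        passages.add(new ScoredPassage("a", 2.5));
        passages.add(new ScoredPassage("b", 0.5));
        passages.add(new ScoredPassage("c", 1.0));
        for (ScoredPassage p : sortedByPenalty(passages)) {
            System.out.println(p.text + " " + p.penalty);
        }
    }
}
```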

getPassages

public java.util.List getPassages(java.util.Map document,
                                  int context,
                                  int maxSize)
Gets all of the passages in the document as a list.

Specified by:
getPassages in interface PassageBuilder
Parameters:
document - A map representing a list of field names and values.
context - The size of the surrounding context to put in the passage, in words. -1 means take an entire field value as context.
maxSize - The maximum size of passage to return, in characters. -1 means any size is OK.
Returns:
A List of Passage instances.