com.sun.labs.minion.retrieval
Class HighlightStage

java.lang.Object
  extended by com.sun.labs.minion.pipeline.StageAdapter
      extended by com.sun.labs.minion.retrieval.HighlightStage
All Implemented Interfaces:
Stage, PipelineStage, com.sun.labs.util.props.Component, com.sun.labs.util.props.Configurable

public class HighlightStage
extends StageAdapter

A pipeline stage that collects tokens from a defined set of fields for use in passage and field highlighting.


Field Summary
protected  java.util.Set<java.lang.String> bodyFields
          The set of fields that we're considering as body fields for the purposes of highlighting.
protected  int doc
          The ID of the document that we're highlighting.
protected  java.util.Set<java.lang.String> fields
          The fields that we're currently working on.
protected  java.util.Map hPass
          A map from field names that we're supposed to highlight to a list of the passages that should be highlighted for that field.
protected static java.lang.String logTag
           
protected  java.util.Map pass
          A map from all field names to the passages that are stored for each field.
protected  java.lang.String[] qt
          The query terms used in the query.
protected  boolean sortBody
          Whether the body fields are to be sorted by score.
 
Fields inherited from class com.sun.labs.minion.pipeline.StageAdapter
downstream, name
 
Constructor Summary
HighlightStage()
           
 
Method Summary
 void addField(java.lang.String fieldName, Passage.Type type, int context, int maxSize, boolean doSort)
          Adds a field to the map of fields that we're supposed to look for.
 void addRemaining(int context, int maxSize, boolean doSort)
          Adds, under the null key, any passages for fields that were not explicitly added.
protected  void addToAll(java.lang.Object key, Token t)
          Adds a token to all of the things associated with a given field name.
 void endField(FieldInfo fi)
          Removes the field from our set of fields.
 java.util.Map getPassages()
           
 void punctuation(Token p)
          Processes some punctuation from further up the pipeline.
 void reset(ArrayGroup ag, int doc, java.lang.String[] qt)
          Resets the stage so that we can use it for a different document.
 void resetPassages()
          Resets the passages for this document.
 void startField(FieldInfo fi)
          Starts a field.
 void token(Token t)
          Processes a token from further up the pipeline.
 
Methods inherited from class com.sun.labs.minion.pipeline.StageAdapter
defineField, dump, endDocument, getDownstream, getName, newProperties, savedData, setDownstream, setName, shutdown, startDocument, text
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

pass

protected java.util.Map pass
A map from all field names to the passages that are stored for each field. This map contains all defined passages in the document.


doc

protected int doc
The ID of the document that we're highlighting.


hPass

protected java.util.Map hPass
A map from field names that we're supposed to highlight to a list of the passages that should be highlighted for that field.


fields

protected java.util.Set<java.lang.String> fields
The fields that we're currently working on.


bodyFields

protected java.util.Set<java.lang.String> bodyFields
The set of fields that we're considering as body fields for the purposes of highlighting.


sortBody

protected boolean sortBody
Whether the body fields are to be sorted by score.


qt

protected java.lang.String[] qt
The query terms used in the query.


logTag

protected static java.lang.String logTag


Constructor Detail

HighlightStage

public HighlightStage()


Method Detail

reset

public void reset(ArrayGroup ag,
                  int doc,
                  java.lang.String[] qt)
Resets the stage so that we can use it for a different document.

Parameters:
ag - An array group containing the query results.
doc - The document ID that we're processing.
qt - The query terms.

resetPassages

public void resetPassages()
Resets the passages for this document.


addField

public void addField(java.lang.String fieldName,
                     Passage.Type type,
                     int context,
                     int maxSize,
                     boolean doSort)
Adds a field to the map of fields that we're supposed to look for.

Parameters:
fieldName - The name of the field that we want to collect passages for. If the name is NonField, then the other parameters specify the data for passages that do not occur in any field.
type - The type of passage to build. If this is com.sun.labs.minion.JOIN_PASSAGES, then all hits within the named field will be joined into a single passage. If this is com.sun.labs.minion.UNIQUE_PASSAGES, then each hit will be a separate passage.
context - The size of the surrounding context to put in the passage, in words. -1 means take the entire containing field.
maxSize - The maximum size of passage to return, in characters. -1 means any size is OK.
doSort - If true, then the passages for this field will be sorted by score before being returned.
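
As a rough illustration of how type and context interact, the sketch below computes word-offset windows around hit positions. It does not use the Minion API: PassageWindowSketch, Mode, and windows are hypothetical names, and treating a joined passage as one window running from the first hit's context to the last hit's context is an assumption that this page does not confirm. The maxSize and doSort parameters are left out.

```java
import java.util.ArrayList;
import java.util.List;

public class PassageWindowSketch {

    /** Hypothetical stand-in for the two passage types described above. */
    public enum Mode { JOIN, UNIQUE }

    /**
     * Returns one {start, end} pair of word offsets (end exclusive) per passage.
     *
     * @param fieldLength number of words in the field
     * @param hits        word positions of the query-term hits, in ascending order
     * @param mode        JOIN for a single passage, UNIQUE for one passage per hit
     * @param context     words of context on each side; -1 takes the whole field
     */
    public static List<int[]> windows(int fieldLength, int[] hits, Mode mode, int context) {
        List<int[]> result = new ArrayList<>();
        if (hits.length == 0) {
            return result;
        }
        if (mode == Mode.JOIN) {
            // Assumption: a joined passage spans from the first hit to the last hit.
            result.add(window(fieldLength, hits[0], hits[hits.length - 1], context));
        } else {
            for (int hit : hits) {
                result.add(window(fieldLength, hit, hit, context));
            }
        }
        return result;
    }

    private static int[] window(int fieldLength, int first, int last, int context) {
        if (context < 0) {
            // -1 means the entire containing field.
            return new int[] {0, fieldLength};
        }
        int start = Math.max(0, first - context);
        int end = Math.min(fieldLength, last + context + 1);
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        int[] hits = {3, 10};
        for (int[] w : windows(20, hits, Mode.UNIQUE, 2)) {
            System.out.println(w[0] + ".." + w[1]);
        }
    }
}
```

In this sketch a context of -1 yields a single whole-field window per passage, which mirrors the description of the context parameter.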

addRemaining

public void addRemaining(int context,
                         int maxSize,
                         boolean doSort)
Adds, under the null key, any passages for fields that were not explicitly added.
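
The idea of filing leftover passages under a null map key can be sketched as follows. This is a minimal stand-alone model, not the Minion implementation: RemainingPassagesSketch and collect are hypothetical names, and passages are represented as plain strings purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RemainingPassagesSketch {

    /**
     * Copies the passages of explicitly requested fields into the result map
     * and gathers the passages of every other field under the null key.
     */
    public static Map<String, List<String>> collect(
            Map<String, List<String>> allPassages, Set<String> explicitFields) {
        // HashMap permits one null key, which is what makes this layout possible.
        Map<String, List<String>> highlighted = new HashMap<>();
        List<String> remaining = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : allPassages.entrySet()) {
            if (explicitFields.contains(e.getKey())) {
                highlighted.put(e.getKey(), e.getValue());
            } else {
                remaining.addAll(e.getValue());
            }
        }
        highlighted.put(null, remaining);
        return highlighted;
    }
}
```

A caller would then read the leftover passages with a lookup on the null key, separately from the named fields.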


getPassages

public java.util.Map getPassages()

startField

public void startField(FieldInfo fi)
Starts a field. This simply puts it in the set of fields that we're handling.

Specified by:
startField in interface Stage
Specified by:
startField in interface PipelineStage
Overrides:
startField in class StageAdapter
Parameters:
fi - The FieldInfo object that describes the field that is starting.

token

public void token(Token t)
Processes a token from further up the pipeline.

Specified by:
token in interface Stage
Overrides:
token in class StageAdapter
Parameters:
t - The token to process.

addToAll

protected void addToAll(java.lang.Object key,
                        Token t)
Adds a token to all of the things associated with a given field name.


punctuation

public void punctuation(Token p)
Processes some punctuation from further up the pipeline.

Specified by:
punctuation in interface Stage
Overrides:
punctuation in class StageAdapter
Parameters:
p - The punctuation to process.

endField

public void endField(FieldInfo fi)
Removes the field from our set of fields.

Specified by:
endField in interface Stage
Specified by:
endField in interface PipelineStage
Overrides:
endField in class StageAdapter
Parameters:
fi - The FieldInfo object that describes the field that is ending.
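
The interplay of startField, token, and endField described above can be modelled with a small stand-alone sketch. It is only an illustration under stated assumptions: ActiveFieldSketch and tokensFor are hypothetical names, field and token values are plain strings rather than FieldInfo and Token objects, and the sketch assumes that more than one field can be active at once because the fields member is a set.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ActiveFieldSketch {

    // Names of the fields that are currently open.
    private final Set<String> fields = new HashSet<>();

    // Tokens collected so far for each field name.
    private final Map<String, List<String>> tokensByField = new HashMap<>();

    /** Marks a field as active. */
    public void startField(String name) {
        fields.add(name);
    }

    /** Marks a field as no longer active. */
    public void endField(String name) {
        fields.remove(name);
    }

    /** Records the token against every field that is currently active. */
    public void token(String t) {
        for (String f : fields) {
            tokensByField.computeIfAbsent(f, k -> new ArrayList<>()).add(t);
        }
    }

    /** Returns the tokens collected for a field, or an empty list if none. */
    public List<String> tokensFor(String name) {
        return tokensByField.getOrDefault(name, new ArrayList<>());
    }
}
```

Tokens that arrive while no field is active are simply dropped in this sketch; how the real stage treats such tokens is not stated on this page.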