com.sun.labs.minion.indexer.dictionary
Interface SavedField

All Superinterfaces:
java.lang.Comparable
All Known Implementing Classes:
BasicField, FeatureVector

public interface SavedField
extends java.lang.Comparable

An interface that can be implemented by various saved field types.


Method Summary
 void add(int docID, java.lang.Object data)
          Adds data to a saved field.
 void clear()
          Clears a saved field, if it's open for indexing.
 void dump(java.lang.String path, java.io.RandomAccessFile dictFile, PostingsOutput[] postOut, int maxID)
          Writes the field's dictionary and postings to the provided outputs.
 QueryEntry get(java.lang.Object v, boolean caseSensitive)
          Gets a particular value from the field.
 FieldInfo getField()
          Gets the field info object for this field.
 java.lang.Object getSavedData(int docID, boolean all)
          Retrieves data from a saved field.
 ArrayGroup getUndefined(ArrayGroup ag)
          Gets a group of all the documents that do not have any values saved for this field.
 DictionaryIterator iterator(java.lang.Object lowerBound, boolean includeLower, java.lang.Object upperBound, boolean includeUpper)
          Gets an iterator for the values in this field.
 void merge(java.lang.String path, SavedField[] fields, int maxID, int[] starts, int[] nUndel, int[][] docIDMaps, java.io.RandomAccessFile dictFile, PostingsOutput postOut)
          Merges a number of saved fields.
 int size()
          Gets the number of saved items that we're storing.
 
Methods inherited from interface java.lang.Comparable
compareTo
 

Method Detail

add

void add(int docID,
         java.lang.Object data)
Adds data to a saved field.

Parameters:
docID - the document ID for the document containing the saved data
data - The actual field data.

dump

void dump(java.lang.String path,
          java.io.RandomAccessFile dictFile,
          PostingsOutput[] postOut,
          int maxID)
          throws java.io.IOException
Writes the field's dictionary and postings to the provided outputs.

Parameters:
path - The path of the index directory.
dictFile - The file where the dictionary will be written.
postOut - A place to write the postings associated with the values.
maxID - The maximum document ID for this partition.
Throws:
java.io.IOException - if there is an error during the writing.

get

QueryEntry get(java.lang.Object v,
               boolean caseSensitive)
Gets a particular value from the field.

Parameters:
v - The value to get.
caseSensitive - If true, case is taken into account when looking up the value. This flag is only observed for character fields.
Returns:
The entry associated with that value, or null if that value doesn't occur in the indexed material.
Throws:
java.lang.UnsupportedOperationException - if the implementing field type does not support getting documents by value.
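
The lookup contract above (null for a missing value, case folding only for character values) can be illustrated with a minimal sketch. This is not the Minion implementation: the class ValueLookupSketch, its put method, and the use of a String in place of QueryEntry are all hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a saved field; a String plays the role of QueryEntry.
class ValueLookupSketch {
    // Entries keyed by the exact saved value.
    private final Map<Object, String> exact = new HashMap<>();
    // Entries keyed by the lower-cased form of character values only.
    // Values that differ only by case collide here; the last one stored wins.
    private final Map<String, String> folded = new HashMap<>();

    // Hypothetical helper used to populate the sketch.
    void put(Object value, String entry) {
        exact.put(value, entry);
        if (value instanceof String) {
            folded.put(((String) value).toLowerCase(Locale.ROOT), entry);
        }
    }

    // Mirrors get(Object v, boolean caseSensitive): the flag only matters
    // for character (String) values; a missing value yields null.
    String get(Object v, boolean caseSensitive) {
        if (!caseSensitive && v instanceof String) {
            return folded.get(((String) v).toLowerCase(Locale.ROOT));
        }
        return exact.get(v);
    }

    public static void main(String[] args) {
        ValueLookupSketch field = new ValueLookupSketch();
        field.put("Java", "entry-1");
        field.put(42, "entry-2");
        System.out.println(field.get("java", true));  // null
        System.out.println(field.get("java", false)); // entry-1
        System.out.println(field.get(42, false));     // entry-2
    }
}
```

A field type that cannot look documents up by value would instead throw UnsupportedOperationException, as documented above; the sketch does not model that case.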

getField

FieldInfo getField()
Gets the field info object for this field.

Returns:
the FieldInfo

getSavedData

java.lang.Object getSavedData(int docID,
                              boolean all)
Retrieves data from a saved field.

Parameters:
docID - the document ID that we want data for.
all - If true, return all known values for the field in the given document. If false, return only one value.
Returns:
If all is true, a List of the values stored in this field for the given document is returned. If all is false, a single value of the appropriate type is returned.

If the document ID is invalid, null will be returned.
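
To make the effect of the all flag concrete, here is a minimal in-memory sketch. It is not the Minion implementation: SavedDataSketch is a hypothetical class, and two behaviors are assumptions of the sketch rather than documented facts, namely that the single value returned when all is false is the first one added, and that any document ID with no saved values is treated as invalid.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a saved field.
class SavedDataSketch {
    // Values saved per document ID, in insertion order.
    private final Map<Integer, List<Object>> values = new HashMap<>();

    // Mirrors add(int docID, Object data).
    void add(int docID, Object data) {
        values.computeIfAbsent(docID, k -> new ArrayList<>()).add(data);
    }

    // Mirrors getSavedData(int docID, boolean all).
    Object getSavedData(int docID, boolean all) {
        List<Object> saved = values.get(docID);
        if (saved == null) {
            return null; // assumption: a document with no saved values counts as invalid
        }
        // Assumption: "only one value" means the first value that was added.
        return all ? new ArrayList<>(saved) : saved.get(0);
    }

    public static void main(String[] args) {
        SavedDataSketch field = new SavedDataSketch();
        field.add(1, "red");
        field.add(1, "blue");
        System.out.println(field.getSavedData(1, true));  // [red, blue]
        System.out.println(field.getSavedData(1, false)); // red
        System.out.println(field.getSavedData(2, true));  // null
    }
}
```

Because the return type is Object, callers have to check whether they received a List or a single value, which is why the sketch keeps the same untyped signature.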


getUndefined

ArrayGroup getUndefined(ArrayGroup ag)
Gets a group of all the documents that do not have any values saved for this field.

Parameters:
ag - a set of documents to which we should restrict the search for documents with undefined field values. If this is null then there is no such restriction.
Returns:
a set of documents that have no defined values for this field. This set may be restricted to documents occurring in the group that was passed in.
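
The restriction semantics can be sketched as follows. This is an illustration only: a java.util.Set of document IDs stands in for ArrayGroup, the class UndefinedSketch and its parameters defined and maxID are hypothetical, and the sketch assumes document IDs run from 1 to maxID.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; Set<Integer> stands in for ArrayGroup.
class UndefinedSketch {
    // defined:  IDs of documents that have at least one saved value
    // maxID:    largest document ID (assumption: IDs run from 1 to maxID)
    // restrict: documents to restrict the search to, or null for no restriction
    static Set<Integer> getUndefined(Set<Integer> defined, int maxID, Set<Integer> restrict) {
        Set<Integer> result = new TreeSet<>();
        for (int docID = 1; docID <= maxID; docID++) {
            if (defined.contains(docID)) {
                continue; // the document has a value, so it is not undefined
            }
            if (restrict != null && !restrict.contains(docID)) {
                continue; // outside the group that was passed in
            }
            result.add(docID);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> defined = new HashSet<>(Arrays.asList(1, 3));
        System.out.println(getUndefined(defined, 5, null)); // [2, 4, 5]
        Set<Integer> restrict = new HashSet<>(Arrays.asList(2, 3));
        System.out.println(getUndefined(defined, 5, restrict)); // [2]
    }
}
```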

iterator

DictionaryIterator iterator(java.lang.Object lowerBound,
                            boolean includeLower,
                            java.lang.Object upperBound,
                            boolean includeUpper)
Gets an iterator for the values in this field.
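
The four parameters suggest a range iteration with independently inclusive or exclusive bounds. The sketch below shows that pattern over a sorted set using the standard NavigableSet views; it is not the Minion implementation. RangeIteratorSketch is a hypothetical class, a plain Iterator of Strings stands in for DictionaryIterator, and treating a null bound as "unbounded" is an assumption of the sketch, since the parameters are not documented here.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration of a bounded iteration over sorted values.
class RangeIteratorSketch {
    static Iterator<String> iterator(NavigableSet<String> values,
                                     String lowerBound, boolean includeLower,
                                     String upperBound, boolean includeUpper) {
        NavigableSet<String> view = values;
        if (lowerBound != null) {
            // Keep values at or above the lower bound (strictly above if exclusive).
            view = view.tailSet(lowerBound, includeLower);
        }
        if (upperBound != null) {
            // Keep values at or below the upper bound (strictly below if exclusive).
            view = view.headSet(upperBound, includeUpper);
        }
        return view.iterator();
    }

    public static void main(String[] args) {
        NavigableSet<String> values = new TreeSet<>(Arrays.asList("a", "b", "c", "d"));
        Iterator<String> it = iterator(values, "b", true, "d", false);
        while (it.hasNext()) {
            System.out.println(it.next()); // prints b, then c
        }
    }
}
```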


size

int size()
Gets the number of saved items that we're storing.


clear

void clear()
Clears a saved field, if it's open for indexing.


merge

void merge(java.lang.String path,
           SavedField[] fields,
           int maxID,
           int[] starts,
           int[] nUndel,
           int[][] docIDMaps,
           java.io.RandomAccessFile dictFile,
           PostingsOutput postOut)
           throws java.io.IOException
Merges a number of saved fields.

Parameters:
path - The path to the index directory.
fields - An array of fields to merge.
maxID - The maximum document ID in the new partition.
starts - The new starting document IDs for the partitions.
nUndel - The number of undeleted documents in each partition.
docIDMaps - A map for each partition from old document IDs to new document IDs. IDs that map to a value less than 0 have been deleted. A null array means that the old IDs are the new IDs.
dictFile - The file to which the merged dictionaries will be written.
postOut - The output to which the merged postings will be written.
Throws:
java.io.IOException - if there is an error during the merge.
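
The docIDMaps rules described above (a null map means IDs are unchanged, a negative entry means the document was deleted) can be sketched as a small helper. This is an illustration, not the Minion merge code: DocIdRemapSketch and remap are hypothetical names, the sketch assumes each map is indexed directly by the old document ID, and it deliberately ignores how starts is combined with the mapped IDs.

```java
// Hypothetical illustration of the docIDMaps rules only.
class DocIdRemapSketch {
    static final int DELETED = -1;

    // docIDMap: the map for one partition (assumption: indexed by old document ID),
    //           or null if the old IDs are the new IDs
    // oldID:    a document ID in the partition being merged
    // Returns the new document ID, or DELETED if the document was deleted.
    static int remap(int[] docIDMap, int oldID) {
        if (docIDMap == null) {
            return oldID; // null map: old IDs are the new IDs
        }
        int newID = docIDMap[oldID];
        return newID < 0 ? DELETED : newID; // negative entries mark deleted documents
    }

    public static void main(String[] args) {
        int[][] docIDMaps = {
            null,            // partition 0: nothing deleted, IDs unchanged
            {-1, 1, -1, 2}   // partition 1: index 0 unused, document 2 deleted
        };
        System.out.println(remap(docIDMaps[0], 7)); // 7
        System.out.println(remap(docIDMaps[1], 1)); // 1
        System.out.println(remap(docIDMaps[1], 2)); // -1
        System.out.println(remap(docIDMaps[1], 3)); // 2
    }
}
```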