com.sun.labs.minion.indexer.postings
Interface Postings

All Known Implementing Classes:
ClusterPostings, DFOPostings, DocumentVectorPostings, FeaturePostings, FieldedDocumentVectorPostings, IDFreqPostings, IDPostings, IDWPostings

public interface Postings

An interface for the postings associated with a term in a dictionary.


Method Summary
 void add(Occurrence o)
          Adds an occurrence to the postings list.
 void append(Postings p, int start)
          Appends another set of postings to this one.
 void append(Postings p, int start, int[] idMap)
          Appends another set of postings to this one, removing any data associated with deleted documents.
 void finish()
          Finishes any ongoing encoding and prepares for the data to be dumped.
 WriteableBuffer[] getBuffers()
          Gets a number of Buffers whose contents represent the postings.
 int getLastID()
          Gets the last ID in the postings list.
 int getMaxFDT()
          Gets the maximum frequency in the postings associated with this entry.
 int getN()
          Gets the number of IDs in the postings list.
 long getTotalOccurrences()
          Gets the total number of occurrences associated with this set of postings.
 PostingsIterator iterator(PostingsIteratorFeatures features)
          Gets an iterator for the postings that satisfies a given set of features.
 void remap(int[] idMap)
          Remaps the IDs in this postings list according to the given old-to-new ID map.
 void setSkipSize(int size)
          Sets the skip size used for building the skip table.
 int size()
          Gets the size of the postings, in bytes.
 

Method Detail

setSkipSize

void setSkipSize(int size)
Sets the skip size used for building the skip table. A larger number will result in more IDs being encoded per skip.
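
To illustrate the effect of the skip size, here is a minimal sketch that records one skip point for every `skipSize` IDs in a sorted ID list. The class `SkipTableSketch` and the method `skipPoints` are hypothetical names used only for illustration; the actual skip table layout used by implementations of this interface is not described here and may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of the Postings API.
class SkipTableSketch {

    /** Returns the ID at the end of every run of skipSize IDs. */
    static List<Integer> skipPoints(int[] sortedIds, int skipSize) {
        if (skipSize <= 0) {
            throw new IllegalArgumentException("skipSize must be positive");
        }
        List<Integer> skips = new ArrayList<>();
        // One skip entry per skipSize IDs: a larger skipSize means fewer entries,
        // each covering more IDs.
        for (int i = skipSize - 1; i < sortedIds.length; i += skipSize) {
            skips.add(sortedIds[i]);
        }
        return skips;
    }

    public static void main(String[] args) {
        int[] ids = {1, 4, 7, 9, 12, 15, 20, 22};
        System.out.println(skipPoints(ids, 2)); // prints [4, 9, 15, 22]
        System.out.println(skipPoints(ids, 4)); // prints [9, 22]
    }
}
```

With a skip size of 4 the sketch produces half as many skip points as with a skip size of 2, so each skip spans twice as many IDs.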


add

void add(Occurrence o)
Adds an occurrence to the postings list.

Parameters:
o - The occurrence.

getN

int getN()
Gets the number of IDs in the postings list.


getLastID

int getLastID()
Gets the last ID in the postings list.


getTotalOccurrences

long getTotalOccurrences()
Gets the total number of occurrences associated with this set of postings. This is useful when a single postings entry may comprise multiple occurrences.

Returns:
The total number of occurrences associated with these postings.

getMaxFDT

int getMaxFDT()
Gets the maximum frequency in the postings associated with this entry.

Returns:
The maximum frequency across all of the postings stored in this postings list.
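
The relationship between getN, getLastID, getTotalOccurrences and getMaxFDT can be sketched with a small counter class. This is an illustration under stated assumptions, not the behavior of any real implementing class: `StatsSketch` is a hypothetical name, an occurrence is reduced to a bare document ID, IDs are assumed to start at 1, and occurrences are assumed to arrive in non-decreasing ID order.

```java
// Hypothetical illustration only: not part of the Postings API.
class StatsSketch {
    private int n;            // number of distinct IDs seen
    private int lastID;       // most recently added ID (0 means none yet)
    private int maxFDT;       // highest per-ID frequency seen
    private int currentFreq;  // frequency of the ID currently being added
    private long total;       // total number of occurrences

    /** Adds one occurrence for the given ID (assumed non-decreasing, starting at 1). */
    void add(int docId) {
        if (docId != lastID) {
            // A new ID starts a new entry in the list.
            n++;
            lastID = docId;
            currentFreq = 0;
        }
        currentFreq++;
        total++;
        if (currentFreq > maxFDT) {
            maxFDT = currentFreq;
        }
    }

    int getN() { return n; }
    int getLastID() { return lastID; }
    long getTotalOccurrences() { return total; }
    int getMaxFDT() { return maxFDT; }

    public static void main(String[] args) {
        StatsSketch s = new StatsSketch();
        for (int id : new int[] {1, 1, 2, 5, 5, 5}) {
            s.add(id);
        }
        // Three distinct IDs, six occurrences, ID 5 occurs three times.
        System.out.println(s.getN() + " " + s.getLastID() + " "
                + s.getTotalOccurrences() + " " + s.getMaxFDT()); // prints 3 5 6 3
    }
}
```

The sketch shows why the total occurrence count can exceed the number of IDs: one entry for an ID may stand for several occurrences.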

finish

void finish()
Finishes any ongoing encoding and prepares for the data to be dumped.


size

int size()
Gets the size of the postings, in bytes.


getBuffers

WriteableBuffer[] getBuffers()
Gets a number of Buffers whose contents represent the postings. These buffers can be written to disk.

This method must ensure that all of the data used by the entry is properly handled by the time that the method returns. This method will be called by a dictionary when it is ready to dump the postings data to a stream.

Returns:
An array of Buffers containing the postings data. All of the data in these buffers must be written to the postings file!
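
The following sketch shows one way the finish, size and getBuffers calls might fit together when postings are dumped. Everything in it is an assumption made for illustration: `DumpSketch` and `SimplePostings` are hypothetical names, `java.nio.ByteBuffer` stands in for WriteableBuffer, and the fixed four-byte encoding is chosen only for simplicity.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: ByteBuffer stands in for WriteableBuffer.
class DumpSketch {

    static final class SimplePostings {
        private final List<Integer> ids = new ArrayList<>();
        private ByteBuffer encoded;

        void add(int id) { ids.add(id); }

        /** Encodes the collected IDs (here: 4 bytes each, big-endian). */
        void finish() {
            encoded = ByteBuffer.allocate(4 * ids.size());
            for (int id : ids) {
                encoded.putInt(id);
            }
            encoded.flip();
        }

        /** Size of the encoded postings in bytes (0 before finish is called). */
        int size() { return encoded == null ? 0 : encoded.remaining(); }

        /** Returns buffers that together hold all of the postings data. */
        ByteBuffer[] getBuffers() {
            if (encoded == null) {
                finish();
            }
            return new ByteBuffer[] { encoded.duplicate() };
        }
    }

    /** Plays the role of the dictionary: finishes, then writes every buffer. */
    static byte[] dump(SimplePostings p) {
        p.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (ByteBuffer b : p.getBuffers()) {
            byte[] bytes = new byte[b.remaining()];
            b.get(bytes);
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        SimplePostings p = new SimplePostings();
        p.add(1);
        p.add(4);
        p.add(7);
        byte[] data = dump(p);
        System.out.println(data.length == p.size()); // prints true
    }
}
```

The caller writes every returned buffer in full, which mirrors the requirement that all of the data in the buffers reach the postings file.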

remap

void remap(int[] idMap)
Remaps the IDs in this postings list according to the given old-to-new ID map.

Parameters:
idMap - A map from the IDs currently in use in the postings to new IDs.
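
As a rough illustration of an old-to-new ID map, the sketch below rewrites a plain array of IDs. It assumes that `idMap` is indexed by the old ID and holds the new ID at that position, and it re-sorts the result on the further assumption that postings are kept in ID order; neither detail is stated above. `RemapSketch` is a hypothetical name, and a real implementation would also have to re-encode its data.

```java
import java.util.Arrays;

// Hypothetical illustration only: not part of the Postings API.
class RemapSketch {

    /** Returns the IDs translated through idMap (idMap[oldId] == newId), in sorted order. */
    static int[] remap(int[] ids, int[] idMap) {
        int[] out = new int[ids.length];
        for (int i = 0; i < ids.length; i++) {
            out[i] = idMap[ids[i]];
        }
        // Assumption: the list is kept in ascending ID order after remapping.
        Arrays.sort(out);
        return out;
    }

    public static void main(String[] args) {
        // Index 0 is unused because IDs are assumed to start at 1.
        int[] idMap = {0, 3, 1, 2, 4}; // 1->3, 2->1, 3->2, 4->4
        int[] ids = {1, 3, 4};
        System.out.println(Arrays.toString(remap(ids, idMap))); // prints [2, 3, 4]
    }
}
```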

append

void append(Postings p,
            int start)
Appends another set of postings to this one.

Parameters:
p - The postings to append. Implementers can safely assume that the postings being passed in are of the same class as the implementing class.
start - The new starting document ID for the partition that the entry was drawn from.

append

void append(Postings p,
            int start,
            int[] idMap)
Appends another set of postings to this one, removing any data associated with deleted documents.

Parameters:
p - The postings to append. Implementers can safely assume that the postings being passed in are of the same class as the implementing class.
start - The new starting document ID for the partition that the entry was drawn from.
idMap - A map from old IDs in the given postings to new IDs with gaps removed for deleted data. If this is null, then there are no deleted documents.
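
The interplay of `start` and `idMap` in the two append methods can be sketched as follows. The sketch rests on assumptions that are not stated above: IDs within a partition start at 1, so an ID is shifted by `start - 1`, and a deleted document is marked by a negative entry in `idMap`. `AppendSketch` is a hypothetical name and a list of integers stands in for the encoded postings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of the Postings API.
class AppendSketch {

    /**
     * Appends sourceIds to target, shifting them so that the source partition
     * begins at ID start. A null idMap means there are no deleted documents.
     */
    static void append(List<Integer> target, int[] sourceIds, int start, int[] idMap) {
        for (int oldId : sourceIds) {
            int mapped = (idMap == null) ? oldId : idMap[oldId];
            if (mapped < 0) {
                // Assumption: a negative entry marks a deleted document.
                continue;
            }
            // Assumption: IDs are 1-based within a partition.
            target.add(mapped + start - 1);
        }
    }

    public static void main(String[] args) {
        List<Integer> merged = new ArrayList<>(List.of(1, 2, 3));
        // Old ID 2 was deleted, so old ID 3 becomes 2 once the gap is removed.
        int[] idMap = {0, 1, -1, 2};
        append(merged, new int[] {1, 2, 3}, 4, idMap);
        System.out.println(merged); // prints [1, 2, 3, 4, 5]
    }
}
```

Passing null for `idMap` in this sketch corresponds to the two-argument form of append: every ID is kept and only the shift by `start` is applied.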

iterator

PostingsIterator iterator(PostingsIteratorFeatures features)
Gets an iterator for the postings that satisfies a given set of features.

Parameters:
features - A set of features that the iterator must support. Note that all implementations of this interface must be able to handle a null value for this parameter! When null is passed, the implementing class can either return an iterator that provides some default behavior or return null.
Returns:
A postings iterator that supports the given features. If the underlying postings do not support a specified feature, then a warning should be logged and null will be returned.
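
The contract for null features and unsupported features can be sketched with stand-in types. Nothing below is the real API: `Features` is a hypothetical stand-in for PostingsIteratorFeatures with a single made-up `positions` flag, `IdOnlyPostings` is a hypothetical postings type that stores only IDs, and `java.util.Iterator` stands in for PostingsIterator.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only: stand-ins for the real Minion types.
class IteratorSketch {

    /** Stand-in for PostingsIteratorFeatures; the positions flag is invented. */
    static final class Features {
        final boolean positions;
        Features(boolean positions) { this.positions = positions; }
    }

    /** Postings that store only IDs and therefore cannot supply positions. */
    static final class IdOnlyPostings {
        private final List<Integer> ids;
        IdOnlyPostings(List<Integer> ids) { this.ids = ids; }

        Iterator<Integer> iterator(Features features) {
            if (features == null) {
                // Null features: fall back to a default, ID-only iterator.
                return ids.iterator();
            }
            if (features.positions) {
                // Unsupported feature: log a warning and return null.
                System.err.println("warning: positions are not supported");
                return null;
            }
            return ids.iterator();
        }
    }

    /** Caller side: counts IDs, or returns -1 when no iterator is available. */
    static int count(IdOnlyPostings p, Features f) {
        Iterator<Integer> it = p.iterator(f);
        if (it == null) {
            return -1;
        }
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        IdOnlyPostings p = new IdOnlyPostings(List.of(1, 4, 7));
        System.out.println(count(p, null));                // prints 3
        System.out.println(count(p, new Features(false))); // prints 3
        System.out.println(count(p, new Features(true)));  // prints -1
    }
}
```

Because null is a permitted return value, callers in this sketch check the iterator before using it.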