com.sun.labs.minion.indexer.postings
Class IDFreqPostings

java.lang.Object
  extended by com.sun.labs.minion.indexer.postings.IDPostings
      extended by com.sun.labs.minion.indexer.postings.IDFreqPostings
All Implemented Interfaces:
MergeablePostings, Postings
Direct Known Subclasses:
DocumentVectorPostings

public class IDFreqPostings
extends IDPostings

A postings class for IDs that have frequencies associated with them.

The format is just like that for IDPostings, except that for each ID:

  1. The ID is byte encoded as a delta from the previous ID.
  2. The frequency is byte encoded as-is.
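
The two steps above can be sketched as follows. This is a minimal illustration, not the Minion implementation: the class DeltaFreqCodec and its methods are hypothetical, and the 7-bits-per-byte variable-length scheme is an assumption, since this page does not say which byte encoding is used.

```java
// Illustrative sketch only: DeltaFreqCodec is a hypothetical name, and the
// 7-bit variable-length byte encoding is an assumption, not Minion's format.
import java.io.ByteArrayOutputStream;

final class DeltaFreqCodec {

    // Writes a non-negative int, 7 bits per byte; the high bit marks "more bytes follow".
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Reads one value written by writeVInt; pos[0] is the cursor and is advanced.
    static int readVInt(byte[] data, int[] pos) {
        int value = 0;
        int shift = 0;
        int b;
        do {
            b = data[pos[0]++] & 0xFF;
            value |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }

    // Encodes strictly increasing IDs with their frequencies.
    static byte[] encode(int[] ids, int[] freqs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prevID = 0;
        for (int i = 0; i < ids.length; i++) {
            writeVInt(out, ids[i] - prevID); // 1. the ID, as a delta from the previous ID
            writeVInt(out, freqs[i]);        // 2. the frequency, as-is
            prevID = ids[i];
        }
        return out.toByteArray();
    }

    // Decodes n postings into a flat array: id0, freq0, id1, freq1, ...
    static int[] decode(byte[] data, int n) {
        int[] result = new int[2 * n];
        int[] pos = {0};
        int prevID = 0;
        for (int i = 0; i < n; i++) {
            prevID += readVInt(data, pos);      // undo the delta
            result[2 * i] = prevID;
            result[2 * i + 1] = readVInt(data, pos);
        }
        return result;
    }
}
```

Delta encoding keeps most ID values small, so under a scheme like this one most postings fit in one byte for the ID and one for the frequency.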


Nested Class Summary
 class IDFreqPostings.IDFreqIterator
           
 
Nested classes/interfaces inherited from class com.sun.labs.minion.indexer.postings.IDPostings
IDPostings.IDIterator
 
Field Summary
protected  int freq
          The frequency of the current ID.
protected  int[] freqs
          The frequencies for these postings.
protected static java.lang.String logTag
           
protected  int maxfdt
          The maximum frequency.
protected  long to
          The total number of occurrences in the postings list.
 
Fields inherited from class com.sun.labs.minion.indexer.postings.IDPostings
curr, dataStart, ids, lastID, nIDs, nSkips, post, prevID, skipID, skipPos, skipSize
 
Constructor Summary
IDFreqPostings()
          Makes a postings entry that is useful during indexing.
IDFreqPostings(ReadableBuffer b)
          Makes a postings entry that is useful during querying.
IDFreqPostings(ReadableBuffer b, int offset, int size)
          Makes a postings entry that is useful during querying.
 
Method Summary
 void add(Occurrence o)
          Adds an occurrence to the postings list.
protected  int encode(int id)
          Encodes the data for the current ID, and sets up for the next one.
 void finish()
          Finishes off the encoding of our data.
 int getMaxFDT()
          Gets the maximum frequency in the postings list.
 long getTotalOccurrences()
          Gets the total number of occurrences in this postings list.
 PostingsIterator iterator(PostingsIteratorFeatures features)
          Gets an iterator for the postings.
 void merge(MergeablePostings mp, int[] map)
          Merges another set of postings with this set of postings.
protected  void recodeID(int currID, int lastID, PostingsIterator pi)
          Re-encodes the data from another postings onto this one.
 
Methods inherited from class com.sun.labs.minion.indexer.postings.IDPostings
addSkip, append, append, getBuffers, getLastID, getN, remap, setSkipSize, size, skip
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

freqs

protected int[] freqs
The frequencies for these postings.


freq

protected int freq
The frequency of the current ID.


to

protected long to
The total number of occurrences in the postings list. This is a long because we may collect more than an int's worth of occurrences; getTotalOccurrences returns it as a long.
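
A small sketch of why the running total needs more than an int: summing per-ID frequencies into an int wraps around silently, while a long accumulator does not. The class OccurrenceTotals and its methods are hypothetical and exist only for this illustration.

```java
// Illustrative sketch only: OccurrenceTotals is a hypothetical helper,
// not part of the Minion API.
final class OccurrenceTotals {

    // Accumulates into a long, as the field "to" does.
    static long sumAsLong(int[] freqs) {
        long total = 0;
        for (int f : freqs) {
            total += f;
        }
        return total;
    }

    // Accumulates into an int; overflows silently once the sum passes Integer.MAX_VALUE.
    static int sumAsInt(int[] freqs) {
        int total = 0;
        for (int f : freqs) {
            total += f;
        }
        return total;
    }
}
```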


maxfdt

protected int maxfdt
The maximum frequency.


logTag

protected static java.lang.String logTag

Constructor Detail

IDFreqPostings

public IDFreqPostings()
Makes a postings entry that is useful during indexing.


IDFreqPostings

public IDFreqPostings(ReadableBuffer b)
Makes a postings entry that is useful during querying.

Parameters:
b - the data read from a postings file.

IDFreqPostings

public IDFreqPostings(ReadableBuffer b,
                      int offset,
                      int size)
Makes a postings entry that is useful during querying.

Parameters:
b - the data read from a postings file.
offset - The offset in the buffer from which we should start reading. If this value is greater than 0, then we need to share the bit buffer, since we may be part of a larger postings entry that will need multiple readers.

Method Detail

encode

protected int encode(int id)
Encodes the data for the current ID, and sets up for the next one.

Overrides:
encode in class IDPostings
Returns:
The number of bytes used for the encoding.

add

public void add(Occurrence o)
Adds an occurrence to the postings list.

Specified by:
add in interface Postings
Overrides:
add in class IDPostings
Parameters:
o - The occurrence to add.

finish

public void finish()
Finishes off the encoding of our data.

Specified by:
finish in interface Postings
Overrides:
finish in class IDPostings

recodeID

protected void recodeID(int currID,
                        int lastID,
                        PostingsIterator pi)
Re-encodes the data from another postings onto this one. A PostingsIterator is passed in, adjusted to the current posting being encoded. This allows additional postings data about the current ID to be retrieved.

Overrides:
recodeID in class IDPostings
Parameters:
currID - The current ID
lastID - The last ID.
pi - the iterator of another postings.

merge

public void merge(MergeablePostings mp,
                  int[] map)
Description copied from interface: MergeablePostings
Merges another set of postings with this set of postings.

Specified by:
merge in interface MergeablePostings
Overrides:
merge in class IDPostings
Parameters:
mp - the postings to merge into these postings.
map - a map from IDs in the postings to IDs in the merged space.
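
The role of the map parameter can be sketched on plain arrays. This is an illustration under stated assumptions, not the actual merge: the real method works on encoded postings, the class MergeSketch is hypothetical, and treating a negative map entry as "drop this ID" is an assumption that this page does not confirm.

```java
// Illustrative sketch only: MergeSketch is a hypothetical stand-in that keeps
// decoded IDs and frequencies in lists instead of encoded buffers.
import java.util.ArrayList;
import java.util.List;

final class MergeSketch {
    final List<Integer> ids = new ArrayList<>();
    final List<Integer> freqs = new ArrayList<>();
    long to;     // total number of occurrences
    int maxfdt;  // maximum frequency

    // Appends another postings list, translating each ID through map.
    // Assumption: a negative map entry means the ID has no counterpart
    // in the merged space and is skipped.
    void merge(int[] otherIDs, int[] otherFreqs, int[] map) {
        for (int i = 0; i < otherIDs.length; i++) {
            int mapped = map[otherIDs[i]];
            if (mapped < 0) {
                continue;
            }
            ids.add(mapped);
            freqs.add(otherFreqs[i]);
            to += otherFreqs[i];
            maxfdt = Math.max(maxfdt, otherFreqs[i]);
        }
    }
}
```

The sketch also updates the total and the maximum frequency while merging, so both stay consistent with the merged list.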

getMaxFDT

public int getMaxFDT()
Gets the maximum frequency in the postings list.

Specified by:
getMaxFDT in interface Postings
Overrides:
getMaxFDT in class IDPostings
Returns:
The maximum frequency in the postings list.

getTotalOccurrences

public long getTotalOccurrences()
Gets the total number of occurrences in this postings list.

Specified by:
getTotalOccurrences in interface Postings
Overrides:
getTotalOccurrences in class IDPostings
Returns:
The total number of occurrences associated with these postings.

iterator

public PostingsIterator iterator(PostingsIteratorFeatures features)
Gets an iterator for the postings.

Specified by:
iterator in interface Postings
Overrides:
iterator in class IDPostings
Parameters:
features - A set of features that the iterator must support.
Returns:
A postings iterator. The iterators for these postings only support the weighting function feature. If any extra features are requested, a warning will be logged and null will be returned.
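
Because null is returned when unsupported features are requested, callers should check the result before using it. The sketch below shows that pattern with stand-in types: IteratorGuardSketch, its Feature enum, and the use of java.util.Iterator are all hypothetical and do not reproduce PostingsIterator or PostingsIteratorFeatures.

```java
// Illustrative sketch only: every type here is a hypothetical stand-in,
// not the Minion API.
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.Set;

final class IteratorGuardSketch {
    enum Feature { WEIGHTING, POSITIONS, FIELDS }

    // Only the weighting function feature is supported, as the text above describes.
    static final Set<Feature> SUPPORTED = EnumSet.of(Feature.WEIGHTING);

    // Returns null (after logging a warning) if any extra feature is requested.
    static Iterator<Integer> iterator(Set<Feature> requested, int[] ids) {
        if (!SUPPORTED.containsAll(requested)) {
            System.err.println("warning: unsupported iterator features requested: " + requested);
            return null;
        }
        return Arrays.stream(ids).iterator();
    }

    // Caller-side guard: treat a null iterator as "nothing to iterate".
    static int countIDs(Set<Feature> requested, int[] ids) {
        Iterator<Integer> it = iterator(requested, ids);
        if (it == null) {
            return 0;
        }
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }
}
```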