com.sun.labs.minion.indexer.dictionary
Class DictionaryWriter

java.lang.Object
  extended by com.sun.labs.minion.indexer.dictionary.DictionaryWriter

public class DictionaryWriter
extends java.lang.Object

A class that will write a dictionary to a file. This can be used when dumping or merging dictionaries.


Field Summary
protected  DictionaryHeader dh
          A header for the dictionary we're writing.
protected  NameEncoder encoder
          An encoder for the names in our dictionary.
protected  int[] idToPosn
          A map from ID to position in the dictionary.
protected  WriteableBuffer info
          A buffer to hold term information.
protected  java.io.File infoFile
          A file to hold the temporary info buffer.
protected  WriteableBuffer infoOffsets
          A buffer to hold term information offsets.
protected  java.io.File infoOffsetsFile
          A file to hold the temporary info offsets buffer.
protected  java.io.RandomAccessFile infoOffsetsRAF
          Random access file for the temporary info offsets buffer.
protected  java.io.RandomAccessFile infoRAF
          Random access file for the temporary info buffer.
protected static java.lang.String logTag
           
protected  WriteableBuffer nameOffsets
          A buffer to hold the offsets of the uncompressed names in the merged dictionary.
protected  java.io.File nameOffsetsFile
          A file to hold the temporary name offsets buffer.
protected  java.io.RandomAccessFile nameOffsetsRAF
          Random access file for the temporary name offsets buffer.
protected  WriteableBuffer names
          A buffer to hold names.
protected  java.io.File namesFile
          A file to hold the temporary names buffer.
protected  java.io.RandomAccessFile namesRAF
          Random access file for the temporary names buffer.
protected  int nOffsets
          The number of offsets that we've encoded.
protected static int OUT_BUFFER_SIZE
           
protected  PartitionStats partStats
          A set of partition statistics for the partition whose dictionaries we're dumping or merging.
protected  java.lang.Object prevName
          The name of the previous entry added to the merged dictionary.
 
Constructor Summary
DictionaryWriter(java.lang.String path, NameEncoder encoder, PartitionStats partStats, int nChans, MemoryDictionary.Renumber renumber)
          Creates a dictionary writer that will write data to disk.
 
Method Summary
 void finish(java.io.RandomAccessFile dictFile)
          Finishes by writing the dictionary to the given file.
 void write(IndexEntry e)
          Writes an entry to the dictionary.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

dh

protected DictionaryHeader dh
A header for the dictionary we're writing.


encoder

protected NameEncoder encoder
An encoder for the names in our dictionary.


partStats

protected PartitionStats partStats
A set of partition statistics for the partition whose dictionaries we're dumping or merging. This may be null.


prevName

protected java.lang.Object prevName
The name of the previous entry added to the merged dictionary.


names

protected WriteableBuffer names
A buffer to hold names.


nameOffsets

protected WriteableBuffer nameOffsets
A buffer to hold the offsets of the uncompressed names in the merged dictionary.


info

protected WriteableBuffer info
A buffer to hold term information.


infoOffsets

protected WriteableBuffer infoOffsets
A buffer to hold term information offsets.


nOffsets

protected int nOffsets
The number of offsets that we've encoded.


namesFile

protected java.io.File namesFile
A file to hold the temporary names buffer.


namesRAF

protected java.io.RandomAccessFile namesRAF
Random access file for the temporary names buffer.


nameOffsetsFile

protected java.io.File nameOffsetsFile
A file to hold the temporary name offsets buffer.


nameOffsetsRAF

protected java.io.RandomAccessFile nameOffsetsRAF
Random access file for the temporary name offsets buffer.


infoFile

protected java.io.File infoFile
A file to hold the temporary info buffer.


infoRAF

protected java.io.RandomAccessFile infoRAF
Random access file for the temporary info buffer.


infoOffsetsFile

protected java.io.File infoOffsetsFile
A file to hold the temporary info offsets buffer.


infoOffsetsRAF

protected java.io.RandomAccessFile infoOffsetsRAF
Random access file for the temporary info offsets buffer.


idToPosn

protected int[] idToPosn
A map from ID to position in the dictionary.


logTag

protected static java.lang.String logTag

OUT_BUFFER_SIZE

protected static int OUT_BUFFER_SIZE


Constructor Detail

DictionaryWriter

public DictionaryWriter(java.lang.String path,
                        NameEncoder encoder,
                        PartitionStats partStats,
                        int nChans,
                        MemoryDictionary.Renumber renumber)
                 throws java.io.IOException
Creates a dictionary writer that will write data to disk.

Parameters:
path - The path where the temporary files should be written.
encoder - An encoder for the names of the entries.
partStats - The set of statistics for this partition. This may be null.
nChans - The number of postings channels used by the dictionary.
renumber - A flag indicating how entries in the dictionary were renumbered during sorting. The only value we care about is MemoryDictionary.Renumber.NONE, which indicates that we need to keep a map from entry ID to position in the dictionary.
Throws:
java.io.IOException - if there was an error writing to disk
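
The renumber parameter and the idToPosn field describe a map from entry ID to dictionary position that is kept when entries are not renumbered. The following is a minimal sketch of such a map, not Minion's actual code: the class IdToPositionSketch, the method buildIdToPosn, and the assumption that IDs are small positive integers are all hypothetical and used for illustration only.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in used for illustration; not part of Minion. */
class IdToPositionSketch {

    /**
     * Builds a map from entry ID to position in write order.
     * idsInWriteOrder.get(p) is the ID of the entry written at position p.
     * Assumes IDs are positive, so slot 0 of the result is never used.
     */
    static int[] buildIdToPosn(List<Integer> idsInWriteOrder) {
        int maxId = 0;
        for (int id : idsInWriteOrder) {
            maxId = Math.max(maxId, id);
        }
        int[] idToPosn = new int[maxId + 1];
        Arrays.fill(idToPosn, -1); // -1 marks an ID that was never written
        for (int posn = 0; posn < idsInWriteOrder.size(); posn++) {
            idToPosn[idsInWriteOrder.get(posn)] = posn;
        }
        return idToPosn;
    }

    public static void main(String[] args) {
        // Entries were sorted by name but kept their original IDs 3, 1, 2.
        int[] map = buildIdToPosn(Arrays.asList(3, 1, 2));
        System.out.println(Arrays.toString(map)); // prints [-1, 1, 2, 0]
    }
}
```

With such a map, a lookup by original ID becomes an array access followed by a positional read, which is presumably why the map is only needed when the IDs no longer match the sorted order.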


Method Detail

write

public void write(IndexEntry e)
Writes an entry to the dictionary.
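
The prevName field and the description of nameOffsets ("offsets of the uncompressed names") suggest that most names are stored relative to the previous name, with offsets recorded only for names stored in full. The sketch below illustrates that general idea with simple front coding (shared-prefix compression). It is an assumption about the technique, not Minion's encoding: the class FrontCodingSketch, the BLOCK_SIZE of 4, and the two-byte length fields are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical front-coding writer for illustration; not part of Minion. */
class FrontCodingSketch {
    /** Assumed block size: one name is stored in full every 4 entries. */
    static final int BLOCK_SIZE = 4;

    private final ByteArrayOutputStream names = new ByteArrayOutputStream();
    /** Offsets into the names buffer of the names stored in full. */
    final List<Integer> nameOffsets = new ArrayList<>();
    private String prevName = "";
    private int nWritten = 0;

    /** Writes a name; names are expected in sorted order and shorter than 64K bytes. */
    void write(String name) {
        int shared = 0;
        if (nWritten % BLOCK_SIZE == 0) {
            // Start of a block: store the full name and remember where it begins.
            nameOffsets.add(names.size());
        } else {
            shared = sharedPrefixLength(prevName, name);
        }
        byte[] suffix = name.substring(shared).getBytes(StandardCharsets.UTF_8);
        writeShort(shared);        // characters shared with the previous name
        writeShort(suffix.length); // bytes of suffix that follow
        names.write(suffix, 0, suffix.length);
        prevName = name;
        nWritten++;
    }

    /** Length of the common prefix, never splitting a surrogate pair. */
    static int sharedPrefixLength(String a, String b) {
        int max = Math.min(a.length(), b.length());
        int n = 0;
        while (n < max && a.charAt(n) == b.charAt(n)) {
            n++;
        }
        if (n > 0 && Character.isHighSurrogate(a.charAt(n - 1))) {
            n--;
        }
        return n;
    }

    private void writeShort(int v) {
        names.write((v >>> 8) & 0xFF);
        names.write(v & 0xFF);
    }

    int encodedSize() {
        return names.size();
    }

    public static void main(String[] args) {
        FrontCodingSketch w = new FrontCodingSketch();
        for (String s : new String[] {"apple", "apply", "apricot", "banana", "band"}) {
            w.write(s);
        }
        System.out.println(w.nameOffsets);   // prints [0, 33]
        System.out.println(w.encodedSize()); // prints 41
    }
}
```

A reader of this layout would binary-search the full names through nameOffsets and then decode forward within a block, which is the usual trade-off between dictionary size and lookup cost.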


finish

public void finish(java.io.RandomAccessFile dictFile)
            throws java.io.IOException
Finishes by writing the dictionary to the given file.

Throws:
java.io.IOException - if there was an error writing to the file
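
The fields above pair each buffer with a temporary file, and finish then writes the dictionary to a single file. The following is a rough sketch of that write-then-finish pattern under stated assumptions: the class TempBufferWriterSketch, its simplified write(String, int) signature, and the header layout (entry count followed by two section sizes) are hypothetical and do not reflect Minion's on-disk format or its IndexEntry type.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

/** Hypothetical writer that spools to temporary files; not part of Minion. */
class TempBufferWriterSketch {
    private final File namesFile;
    private final File infoFile;
    private final RandomAccessFile namesRAF;
    private final RandomAccessFile infoRAF;
    private int nEntries = 0;

    /** Creates the temporary files in the given directory. */
    TempBufferWriterSketch(File dir) throws IOException {
        namesFile = File.createTempFile("names", ".tmp", dir);
        infoFile = File.createTempFile("info", ".tmp", dir);
        namesRAF = new RandomAccessFile(namesFile, "rw");
        infoRAF = new RandomAccessFile(infoFile, "rw");
    }

    /** Spools one entry: a length-prefixed name and a 4-byte info value. */
    void write(String name, int info) throws IOException {
        byte[] b = name.getBytes(StandardCharsets.UTF_8);
        namesRAF.writeShort(b.length);
        namesRAF.write(b);
        infoRAF.writeInt(info);
        nEntries++;
    }

    /** Writes an assumed header, then appends both temporary files in order. */
    void finish(RandomAccessFile dictFile) throws IOException {
        dictFile.writeInt(nEntries);
        dictFile.writeLong(namesRAF.length());
        dictFile.writeLong(infoRAF.length());
        append(namesRAF, dictFile);
        append(infoRAF, dictFile);
        namesRAF.close();
        infoRAF.close();
        namesFile.delete();
        infoFile.delete();
    }

    /** Copies all of src, from its start, to the current position of dst. */
    private static void append(RandomAccessFile src, RandomAccessFile dst)
            throws IOException {
        src.seek(0);
        byte[] buf = new byte[8192];
        int n;
        while ((n = src.read(buf)) != -1) {
            dst.write(buf, 0, n);
        }
    }
}
```

Spooling to temporary files keeps memory use bounded while entries arrive, and the section sizes are only known once every entry has been written, which is one plausible reason for deferring the final layout to finish.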