Package com.sun.labs.minion.indexer

Provides classes that do the main work of indexing: building the term and document dictionaries and the postings file.

Interface Summary
Closeable: An interface for things that can be closed, but that must respect a delay to account for things that may be in use.

Class Summary
MetaFile: A class to read and write the indexer's metafile.

Package com.sun.labs.minion.indexer Description

Provides classes that do the main work of indexing: building the term and document dictionaries and the postings file.

The main driver in this package is the FieldIndexStage, the Stage in the indexing pipeline that processes the tokens that come down the pipeline. The FieldIndexStage adds tokens to the term dictionary, document names to the document dictionary, and term occurrences to the postings file. It is also responsible for building the static summaries of the documents and for putting document titles into the document dictionary.

The FieldIndexStage is given a number of megabytes that it can use to store postings data. When this limit is reached or passed, the currently stored data is dumped to a partition. This dump always occurs at the end of a file, so that a file is not spread across more than one partition.
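The dump policy described above can be illustrated with a minimal sketch. This is not the FieldIndexStage implementation: the class BufferedIndexer, its methods, and the use of a byte count in place of megabytes are all hypothetical names and simplifications chosen for illustration. The sketch only shows the idea that the limit is checked at the end of each file, so a file's data never spans two partitions.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical illustration of the dump policy; not part of the Minion API.
 * Postings data is buffered until a memory limit is reached or passed, and
 * the limit is only checked once a whole file has been processed.
 */
class BufferedIndexer {
    private final long limitBytes;          // assumed stand-in for the megabyte limit
    private long usedBytes = 0;
    private final List<String> bufferedFiles = new ArrayList<>();
    private final List<List<String>> partitions = new ArrayList<>();

    BufferedIndexer(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Buffers one complete file, then checks the limit at the file boundary. */
    void indexFile(String name, long postingsBytes) {
        bufferedFiles.add(name);
        usedBytes += postingsBytes;
        // "Reached or passed": the check is >=, and it runs only at end of file.
        if (usedBytes >= limitBytes) {
            dump();
        }
    }

    /** Writes everything buffered so far to a new partition and resets the buffer. */
    void dump() {
        if (bufferedFiles.isEmpty()) {
            return;
        }
        partitions.add(new ArrayList<>(bufferedFiles));
        bufferedFiles.clear();
        usedBytes = 0;
    }

    List<List<String>> partitions() {
        return partitions;
    }
}
```

Because the check runs only after a whole file has been buffered, the buffer can temporarily exceed the limit; that is the price of keeping each file within a single partition.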

A partition is made up of four different files, each with its own structure and its own class for reading and operating on it.