Package com.sun.labs.minion.document

Provides some basic document analysis for different types of text documents.

Class Summary
MarkUpAnalyzer An abstract class intended to be the superclass of all mark-up analyzers.
MarkUpAnalyzer_html A MarkUpAnalyzer for HTML.
MarkUpAnalyzer_txt  
MarkUpAnalyzer_xml A wrapper around an XML analyzer that we can use in our typical context.
XMLAnalyzer An implementation of a SAX XML handler that we can use to parse one or more XML documents.

Package com.sun.labs.minion.document Description

The document package contains analyzers for marked-up text. An analyzer is invoked when text of a matching type is encountered. For example, if an IndexableString whose type is set to IndexableString.Type.HTML is passed to the engine, MarkUpAnalyzer_html is used automatically to recognize certain fields and index those fields separately.
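
The type-based selection described above can be pictured with a small, self-contained sketch. It does not use the Minion API: `AnalyzerDispatchSketch`, `TypedText`, the `Type` constants other than HTML, and the type-to-analyzer mapping are all hypothetical stand-ins, assumed only from the class names in the summary. How the engine actually locates an analyzer is not stated here.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative only: these are hypothetical stand-ins, NOT the Minion classes.
final class AnalyzerDispatchSketch {

    /** Stand-in for IndexableString.Type; only HTML is named in the package description. */
    enum Type { PLAIN, HTML, XML }

    /** Stand-in for a string value tagged with its mark-up type. */
    static final class TypedText {
        final String value;
        final Type type;

        TypedText(String value, Type type) {
            this.value = value;
            this.type = type;
        }
    }

    // Assumed mapping from text type to the analyzer class names in the class summary.
    private static final Map<Type, String> ANALYZERS = new EnumMap<>(Type.class);
    static {
        ANALYZERS.put(Type.PLAIN, "MarkUpAnalyzer_txt");
        ANALYZERS.put(Type.HTML, "MarkUpAnalyzer_html");
        ANALYZERS.put(Type.XML, "MarkUpAnalyzer_xml");
    }

    /** Returns the name of the analyzer that would handle the given text. */
    static String analyzerFor(TypedText text) {
        return ANALYZERS.get(text.type);
    }

    public static void main(String[] args) {
        TypedText page = new TypedText("<html><title>Hi</title></html>", Type.HTML);
        // The caller only tags the text with a type; the analyzer is chosen for it.
        System.out.println(analyzerFor(page));
    }
}
```

The point of the sketch is that callers only declare the type of the text; they never name an analyzer themselves.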