java.lang.Object
  com.sun.labs.minion.IndexConfig
public class IndexConfig
A class that holds configuration data for indexing documents.
Field Summary | |
---|---|
protected java.lang.String |
configName
Name used by configuration manager |
protected FieldInfo |
defaultField
The exemplar field info to use when encountering an unknown field name during indexing. |
protected boolean |
enableFeatureBackoff
Turns on the ability to back off the number of features to try to make a better classifier. |
protected java.util.Map<java.lang.String,FieldInfo> |
fieldInfo
A map from field names to the field information for those names. |
protected java.lang.String |
indexDir
The directory that holds the index. |
protected java.lang.String |
indexName
The symbolic name of the collection. |
protected int |
kFoldSplitterNumFolds
The number of folds that should be made when the k-fold splitter is being used. |
protected Lexicon |
lexicon
The lexicon. |
protected java.lang.String |
lexiconFile
The file that contains the lexicon |
protected static java.lang.String |
logTag
A tag that will be used for log entries |
static java.lang.String |
PROP_DEFAULT_FIELD_INFO
A property that names the default field information to use when encountering an unknown field during indexing. |
static java.lang.String |
PROP_ENABLE_FEATURE_BACKOFF
The property indicating whether we should attempt to do feature backoff during classification. |
static java.lang.String |
PROP_FIELD_INFO
The property that contains a list of the names of the field information objects that this index should contain. |
static java.lang.String |
PROP_INDEX_DIRECTORY
The property for the name of the index directory. |
static java.lang.String |
PROP_INDEX_NAME
The property for the symbolic name of the index that we're using. |
static java.lang.String |
PROP_KFOLD_SPLITTER_NUMFOLDS
The property for the number of folds to use when doing k-fold cross validation when building classifiers. |
static java.lang.String |
PROP_LEXICON_LOCATION
The property that names the location of the lexicon. |
static java.lang.String |
PROP_RANDOM_SPLITTER_NUMSPLITS
The property for the number of random splits to use when doing validation using a random splitter when building classifiers. |
static java.lang.String |
PROP_STORE_CLASSIFIER_SCORES
The property indicating whether we should store the scores associated with classifiers when a new document is successfully classified. |
static java.lang.String |
PROP_STORE_NON_CLASSIFIED
|
static java.lang.String |
PROP_TAXONOMY_ENABLED
The property indicating whether the taxonomy should be enabled. |
protected int |
randomSplitterNumSplits
The number of random splits that should be made when the random splitter is being used. |
protected boolean |
storeClassifierScores
Whether to store classifier scores in per-class saved fields. |
protected boolean |
storeNonClassified
Whether to store the results of failed classifications. |
protected boolean |
taxonomyEnabled
Flag indicating whether to enable the taxonomy, independent of whether a lexicon is specified. |
Constructor Summary | |
---|---|
IndexConfig()
Default constructor. |
|
IndexConfig(java.lang.String indexDir)
Creates an index configuration for a given directory, using all of the default values. |
Method Summary | |
---|---|
FieldInfo |
getDefaultFieldInfo(java.lang.String name)
Gets the field information to use for an unknown field. |
boolean |
getDoFeatureBackoff()
|
java.util.Map<java.lang.String,FieldInfo> |
getFieldInfo()
Gets the map from field names to field information objects. |
java.lang.String |
getIndexDirectory()
Gets the index directory. |
java.lang.String |
getIndexName()
Gets the name of the index. |
int |
getKFoldSplitterNumFolds()
|
Lexicon |
getLexicon()
|
java.lang.String |
getLexiconFile()
|
java.lang.String |
getName()
|
int |
getRandomSplitterNumSplits()
|
void |
newProperties(com.sun.labs.util.props.PropertySheet ps)
Creates an indexing configuration from a property sheet described in an external XML file. |
void |
setConfigurationManager(com.sun.labs.util.props.ConfigurationManager cm)
Sets the configuration manager. Set it if you want the other mutator methods in this class to change the actual configuration, so that the changes are available for saving. |
void |
setDefaultFieldInfo(FieldInfo fieldInfo)
Sets the field information to use when encountering unknown fields during indexing. |
boolean |
storeClassifierScores()
|
boolean |
storeNonClassified()
|
boolean |
taxonomyEnabled()
Should we be using a taxonomy? |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected java.lang.String configName
@ConfigString public static final java.lang.String PROP_INDEX_DIRECTORY
@ConfigString(mandatory=false) public static final java.lang.String PROP_INDEX_NAME
@ConfigInteger(defaultValue=10) public static final java.lang.String PROP_RANDOM_SPLITTER_NUMSPLITS
@ConfigInteger(defaultValue=10) public static final java.lang.String PROP_KFOLD_SPLITTER_NUMFOLDS
@ConfigBoolean(defaultValue=true) public static final java.lang.String PROP_ENABLE_FEATURE_BACKOFF
If true, then the system will attempt to reduce the number of features used to build the classifiers, to see whether doing so improves classification performance on the test data.
@ConfigBoolean(defaultValue=false) public static final java.lang.String PROP_STORE_CLASSIFIER_SCORES
@ConfigBoolean(defaultValue=false) public static final java.lang.String PROP_STORE_NON_CLASSIFIED
@ConfigComponentList(type=FieldInfo.class) public static final java.lang.String PROP_FIELD_INFO
@ConfigString(mandatory=false) public static final java.lang.String PROP_LEXICON_LOCATION
@ConfigBoolean(defaultValue=false) public static final java.lang.String PROP_TAXONOMY_ENABLED
@ConfigComponent(type=FieldInfo.class) public static final java.lang.String PROP_DEFAULT_FIELD_INFO
protected java.lang.String indexDir
protected java.lang.String indexName
protected java.util.Map<java.lang.String,FieldInfo> fieldInfo
protected FieldInfo defaultField
protected Lexicon lexicon
protected java.lang.String lexiconFile
protected int randomSplitterNumSplits
protected int kFoldSplitterNumFolds
protected boolean enableFeatureBackoff
protected boolean storeClassifierScores
protected boolean storeNonClassified
protected boolean taxonomyEnabled
protected static java.lang.String logTag
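The kFoldSplitterNumFolds field (set through PROP_KFOLD_SPLITTER_NUMFOLDS, default 10) controls how many folds are made for k-fold cross validation. As an illustration of what that number means, here is a minimal, self-contained sketch of a k-fold split. It is not Minion's splitter: the class KFoldSketch and its methods are hypothetical, and the round-robin assignment is an assumption chosen for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of com.sun.labs.minion.
public class KFoldSketch {

    /** Assigns items round-robin to numFolds folds (an assumed strategy). */
    public static <T> List<List<T>> split(List<T> items, int numFolds) {
        if (numFolds < 1) {
            throw new IllegalArgumentException("numFolds must be >= 1");
        }
        List<List<T>> folds = new ArrayList<>();
        for (int i = 0; i < numFolds; i++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < items.size(); i++) {
            folds.get(i % numFolds).add(items.get(i));
        }
        return folds;
    }

    /** Builds the training set for one round: every fold except the held-out one. */
    public static <T> List<T> trainingSet(List<List<T>> folds, int heldOut) {
        List<T> train = new ArrayList<>();
        for (int i = 0; i < folds.size(); i++) {
            if (i != heldOut) {
                train.addAll(folds.get(i));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            docs.add(i);
        }
        // 10 mirrors the default value declared on PROP_KFOLD_SPLITTER_NUMFOLDS.
        List<List<Integer>> folds = split(docs, 10);
        for (int k = 0; k < folds.size(); k++) {
            System.out.println("fold " + k + ": test=" + folds.get(k).size()
                    + " train=" + trainingSet(folds, k).size());
        }
    }
}
```

Each round holds out one fold for testing and trains on the rest, so a larger fold count means more rounds with smaller test sets.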
Constructor Detail |
---|
public IndexConfig()
public IndexConfig(java.lang.String indexDir)
Parameters:
indexDir - the directory where the index will be
Method Detail |
---|
public void setConfigurationManager(com.sun.labs.util.props.ConfigurationManager cm)
Parameters:
cm - the configuration manager that created this index config
public java.lang.String getIndexDirectory()
public java.lang.String getIndexName()
public java.util.Map<java.lang.String,FieldInfo> getFieldInfo()
public java.lang.String getLexiconFile()
public int getRandomSplitterNumSplits()
public int getKFoldSplitterNumFolds()
public boolean getDoFeatureBackoff()
public boolean storeClassifierScores()
public boolean storeNonClassified()
public void newProperties(com.sun.labs.util.props.PropertySheet ps) throws com.sun.labs.util.props.PropertyException
Specified by:
newProperties in interface com.sun.labs.util.props.Configurable
Parameters:
ps - the property sheet containing the properties.
Throws:
com.sun.labs.util.props.PropertyException - if there is any error processing the provided properties
public Lexicon getLexicon()
public java.lang.String getName()
public boolean taxonomyEnabled()
public void setDefaultFieldInfo(FieldInfo fieldInfo)
Parameters:
fieldInfo - an exemplar field information object containing the attributes and type to use when encountering unknown fields during indexing.
public FieldInfo getDefaultFieldInfo(java.lang.String name)
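The relationship between the fieldInfo map, the defaultField exemplar, and getDefaultFieldInfo(String) can be sketched as a lookup with a fallback. The code below is only an illustration under stated assumptions: SketchFieldInfo is a hypothetical stand-in for FieldInfo with a single made-up attribute, and the behavior shown (copying the exemplar's attributes under the requested name, returning null when no exemplar is set) is assumed, not taken from the Minion source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not part of com.sun.labs.minion.
public class DefaultFieldSketch {

    /** Stand-in for FieldInfo: a name plus one illustrative attribute. */
    public static final class SketchFieldInfo {
        public final String name;
        public final boolean indexed;

        public SketchFieldInfo(String name, boolean indexed) {
            this.name = name;
            this.indexed = indexed;
        }
    }

    // Explicitly defined fields, keyed by field name.
    private final Map<String, SketchFieldInfo> fieldInfo = new HashMap<>();

    // Exemplar used for unknown field names; null until one is set.
    private SketchFieldInfo defaultField;

    public void define(SketchFieldInfo fi) {
        fieldInfo.put(fi.name, fi);
    }

    public void setDefaultFieldInfo(SketchFieldInfo exemplar) {
        this.defaultField = exemplar;
    }

    /**
     * Returns the defined field if the name is known. Otherwise returns a
     * copy of the exemplar's attributes under the requested name, or null
     * when no exemplar has been set (assumed behavior).
     */
    public SketchFieldInfo lookup(String name) {
        SketchFieldInfo known = fieldInfo.get(name);
        if (known != null) {
            return known;
        }
        if (defaultField == null) {
            return null;
        }
        return new SketchFieldInfo(name, defaultField.indexed);
    }

    public static void main(String[] args) {
        DefaultFieldSketch cfg = new DefaultFieldSketch();
        cfg.define(new SketchFieldInfo("title", true));
        cfg.setDefaultFieldInfo(new SketchFieldInfo("default", false));
        SketchFieldInfo fi = cfg.lookup("author");
        System.out.println(fi.name + " indexed=" + fi.indexed);
    }
}
```

Copying the exemplar instead of returning it directly keeps each unknown field's name distinct while sharing the default attributes.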