com.sun.labs.minion.classification
Interface FeatureSelector

All Known Implementing Classes:
ContingencyFeatureSelector, CSFeatureSelector, FastContingencyFeatureSelector, MIFeatureSelector, SimpleFeatureSelector, WeightedFeatureSelector

public interface FeatureSelector

Selects terms from a given document or set of documents, relative to the collection the terms are part of.


Method Summary
 FeatureClusterSet select(FeatureClusterSet set, WeightingComponents wc, int numTrainingDocs, int numFeatures, SearchEngine engine)
          Selects the features from the documents in the training set.
 void setHumanSelected(HumanSelected hs)
          Provides a set of human-selected terms that should be included in or excluded from consideration during the feature selection process.
 void setStopWords(StopWords stopWords)
          Sets a stopword list: words that should be ignored when selecting features.
 

Method Detail

setHumanSelected

void setHumanSelected(HumanSelected hs)
Provides a set of human-selected terms that should be included in or excluded from consideration during the feature selection process.
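
The description above does not say how included and excluded terms interact with the automatically chosen candidates. The following is a minimal sketch of one plausible reading, using plain Java collections. The class HumanSelectedSketch and its apply method are hypothetical and are not part of Minion; the real HumanSelected type is not modeled here. Letting exclusion win when a term appears in both lists is also an assumption.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; not a Minion class.
final class HumanSelectedSketch {

    // Combines automatically chosen candidate terms with human-supplied
    // include and exclude lists. Insertion order of the candidates is kept.
    static Set<String> apply(Set<String> candidates,
                             Set<String> included,
                             Set<String> excluded) {
        Set<String> result = new LinkedHashSet<>(candidates);
        result.addAll(included);    // human-included terms are always considered
        result.removeAll(excluded); // assumption: exclusion wins over inclusion
        return result;
    }
}
```

With candidates {java, coffee}, an include list of {jvm}, and an exclude list of {coffee}, this sketch yields {java, jvm}.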


select

FeatureClusterSet select(FeatureClusterSet set,
                         WeightingComponents wc,
                         int numTrainingDocs,
                         int numFeatures,
                         SearchEngine engine)
Selects the features from the documents in the training set.

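Parameters:
set - the set of feature clusters from the training set.
wc - the weighting components used when weighting features.
numTrainingDocs - the number of documents in the training set.
numFeatures - the number of features to select.
engine - the search engine for the collection.
Returns:
a sorted set of the selected features.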
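
To illustrate the general shape of this contract (candidates in, a sorted result of at most numFeatures out), here is a small sketch using plain Java collections. ScoredFeature and TopKSelectionSketch are hypothetical stand-ins and not Minion classes; the real method works on FeatureClusterSet and also receives WeightingComponents and a SearchEngine, which are not modeled here. Sorting by descending score is an assumption, since the documentation only states that the result is sorted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a feature with a selection score; not a Minion class.
final class ScoredFeature {
    final String name;
    final double score;

    ScoredFeature(String name, double score) {
        this.name = name;
        this.score = score;
    }
}

// Hypothetical illustration only; not a Minion class.
final class TopKSelectionSketch {

    // Returns at most numFeatures candidates, highest score first.
    // Ties are broken alphabetically by name so the order is deterministic.
    static List<ScoredFeature> selectTopK(List<ScoredFeature> candidates, int numFeatures) {
        List<ScoredFeature> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator
                .comparingDouble((ScoredFeature f) -> f.score).reversed()
                .thenComparing(f -> f.name));
        int limit = Math.min(Math.max(numFeatures, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, limit));
    }
}
```

How an actual implementation scores features (for example, the contingency-based or mutual-information selectors listed among the implementing classes) is specific to that implementation and is not shown here.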

setStopWords

void setStopWords(StopWords stopWords)
Sets a stopword list: words that should be ignored when selecting features.

Parameters:
stopWords - the set of words to ignore when performing feature selection.
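
The effect of a stopword list on selection can be sketched as a simple filter over candidate terms. StopWordFilterSketch is a hypothetical class built on java.util.Set and is not part of Minion; the real StopWords type is not modeled here. Case-insensitive matching is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not a Minion class.
final class StopWordFilterSketch {
    private final Set<String> stopWords = new HashSet<>();

    // Mirrors the role of setStopWords: replaces the current ignore list.
    void setStopWords(Set<String> words) {
        stopWords.clear();
        for (String w : words) {
            stopWords.add(w.toLowerCase(Locale.ROOT));
        }
    }

    // Drops every candidate term that appears in the stopword list,
    // preserving the order of the remaining terms.
    List<String> filter(List<String> candidates) {
        List<String> kept = new ArrayList<>();
        for (String term : candidates) {
            if (!stopWords.contains(term.toLowerCase(Locale.ROOT))) {
                kept.add(term);
            }
        }
        return kept;
    }
}
```

With the stopwords "the" and "of", filtering the terms The, history, of, Java leaves history and Java.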