com.sun.labs.minion.classification
Class ContingencyFeatureClusterer

java.lang.Object
  extended by com.sun.labs.minion.classification.ContingencyFeatureClusterer
All Implemented Interfaces:
FeatureClusterer
Direct Known Subclasses:
KnowledgeSourceClusterer, LiteMorphClusterer, MorphClusterer, StemmingClusterer

public class ContingencyFeatureClusterer
extends java.lang.Object
implements FeatureClusterer

This class implements a feature clusterer that clusters contingency features. It is meant to provide structure for more sophisticated clusterers, so its own clustering is very basic: it places each term in a cluster by itself.


Field Summary
protected  FeatureClusterSet clusters
          Internal storage for the clusters that are generated
protected  java.lang.String field
          The field from which features should be drawn.
protected static java.lang.String logTag
          The log tag
protected  int type
          The type of contingency feature to use.
 
Constructor Summary
ContingencyFeatureClusterer()
           
ContingencyFeatureClusterer(int type)
          A Feature Clusterer operates on a set of features.
 
Method Summary
protected  void addFeature(ContingencyFeature cf)
          Adds a feature to this feature clusterer.
 FeatureClusterSet cluster(ResultSetImpl s)
          Creates a set of clusters based on all of the terms in the documents contained in the ResultSet.
protected  java.util.Set<ContingencyFeature> collectFeatures(ArrayGroup ag)
          Collects terms from the array group, creating contingency features for each one.
protected  FeatureClusterSet getClusters()
          Returns a set of feature clusters.
 FeatureCluster newCluster()
          A non-static factory method to create a feature cluster
 Feature newFeature()
          A non-static factory method to create a feature of the type used by this clusterer
 FeatureClusterer newInstance()
          A non-static factory method to create a feature clusterer
 void setDocCache(DocCache dc)
          Sets the cache of document vectors that we can use to fetch the words in a given document.
 void setField(java.lang.String field)
          Sets the field from which features should be drawn.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

logTag

protected static java.lang.String logTag
The log tag


field

protected java.lang.String field
The field from which features should be drawn.


type

protected int type
The type of contingency feature to use.


clusters

protected FeatureClusterSet clusters
Internal storage for the clusters that are generated

Constructor Detail

ContingencyFeatureClusterer

public ContingencyFeatureClusterer()

ContingencyFeatureClusterer

public ContingencyFeatureClusterer(int type)
A Feature Clusterer operates on a set of features. Each newly created clusterer operates on a new set of clusters. Sets of clusters correspond to partitions, and these partitions will likely be merged.
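
The per-partition idea can be illustrated with a minimal, self-contained sketch. It does not use the Minion classes: `Clusterer`, `merge`, `clusterPartitions`, and the lower-casing rule used as a cluster key are all hypothetical stand-ins, and the way Minion actually merges partitions is not documented on this page. The sketch only shows one clusterer instance per partition, each created through a non-static factory method, with the resulting cluster sets combined afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class PartitionMergeSketch {
    // Hypothetical stand-in: one instance holds the clusters for one partition.
    static class Clusterer {
        final Map<String, Set<String>> clusters = new TreeMap<>();

        // Non-static factory: a fresh clusterer with its own empty cluster set.
        Clusterer newInstance() {
            return new Clusterer();
        }

        // Assumed clustering rule, for illustration only: key by lower-cased term.
        void add(String term) {
            clusters.computeIfAbsent(term.toLowerCase(Locale.ROOT), k -> new TreeSet<>())
                    .add(term);
        }
    }

    // Combines per-partition cluster sets; clusters with the same key are unioned.
    static Map<String, Set<String>> merge(List<Clusterer> parts) {
        Map<String, Set<String>> merged = new TreeMap<>();
        for (Clusterer c : parts) {
            for (Map.Entry<String, Set<String>> e : c.clusters.entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).addAll(e.getValue());
            }
        }
        return merged;
    }

    // Clusters each partition with its own instance, then merges the results.
    static Map<String, Set<String>> clusterPartitions(List<List<String>> partitions) {
        Clusterer prototype = new Clusterer();
        List<Clusterer> parts = new ArrayList<>();
        for (List<String> partition : partitions) {
            Clusterer c = prototype.newInstance();
            for (String term : partition) {
                c.add(term);
            }
            parts.add(c);
        }
        return merge(parts);
    }

    public static void main(String[] args) {
        System.out.println(clusterPartitions(List.of(List.of("Dog", "cat"), List.of("dog"))));
    }
}
```

Because the factory method is an instance method, the caller can obtain a fresh clusterer of the same kind without naming its concrete class.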

Method Detail

newInstance

public FeatureClusterer newInstance()
Description copied from interface: FeatureClusterer
A non-static factory method to create a feature clusterer

Specified by:
newInstance in interface FeatureClusterer
Returns:
a feature clusterer instance

newCluster

public FeatureCluster newCluster()
Description copied from interface: FeatureClusterer
A non-static factory method to create a feature cluster

Specified by:
newCluster in interface FeatureClusterer
Returns:
a feature cluster instance

newFeature

public Feature newFeature()
Description copied from interface: FeatureClusterer
A non-static factory method to create a feature of the type used by this clusterer

Specified by:
newFeature in interface FeatureClusterer
Returns:
a feature instance

setDocCache

public void setDocCache(DocCache dc)
Description copied from interface: FeatureClusterer
Sets the cache of document vectors that we can use to fetch the words in a given document.

Specified by:
setDocCache in interface FeatureClusterer

cluster

public FeatureClusterSet cluster(ResultSetImpl s)
Creates a set of clusters based on all of the terms in the documents contained in the ResultSet.

Specified by:
cluster in interface FeatureClusterer
Parameters:
s - the set of documents from which features are gathered
Returns:
a set of clusters of features (each cluster is itself a Feature)

addFeature

protected void addFeature(ContingencyFeature cf)
Adds a feature to this feature clusterer. This may create a new cluster for the feature, or may add the feature to an existing cluster. If you're writing your own clusterer, override this method.

Parameters:
cf - the feature to add

getClusters

protected FeatureClusterSet getClusters()
Returns a set of feature clusters. Feature Clusters in this case will be ContingencyFeatures that represent clusters rather than single features. If you're writing your own clusterer, override this method.

Returns:
a set of clusters
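
The override pattern described for addFeature and getClusters can be sketched without the Minion library. Everything below is a hypothetical stand-in: `Base` imitates the basic behaviour of putting each term in a cluster by itself, and `SuffixClusterer` overrides only the add hook with an assumed, deliberately naive rule (strip a trailing "s"). Real subclasses such as StemmingClusterer may work quite differently.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ClustererSketch {
    // Hypothetical stand-in for the basic clusterer: one cluster per distinct term.
    static class Base {
        // Cluster name -> member terms, in insertion order.
        protected final Map<String, Set<String>> clusters = new LinkedHashMap<>();

        // Hook to override: decides which cluster a term joins.
        protected void addFeature(String term) {
            clusters.computeIfAbsent(term, k -> new LinkedHashSet<>()).add(term);
        }

        // Hook to override: returns the clusters built so far.
        protected Map<String, Set<String>> getClusters() {
            return clusters;
        }

        // Template method: feeds every term through the hooks.
        public Map<String, Set<String>> cluster(List<String> terms) {
            for (String term : terms) {
                addFeature(term);
            }
            return getClusters();
        }
    }

    // Assumed example subclass: groups terms that differ only by a trailing "s".
    static class SuffixClusterer extends Base {
        @Override
        protected void addFeature(String term) {
            String key = (term.length() > 1 && term.endsWith("s"))
                    ? term.substring(0, term.length() - 1)
                    : term;
            clusters.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(term);
        }
    }

    public static void main(String[] args) {
        List<String> terms = List.of("dog", "dogs", "cat");
        System.out.println(new Base().cluster(terms));            // {dog=[dog], dogs=[dogs], cat=[cat]}
        System.out.println(new SuffixClusterer().cluster(terms)); // {dog=[dog, dogs], cat=[cat]}
    }
}
```

Only the grouping decision changes in the subclass; the loop that walks the terms stays in the base class, which is the structure the class description suggests.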

collectFeatures

protected java.util.Set<ContingencyFeature> collectFeatures(ArrayGroup ag)
Collects terms from the array group, creating contingency features for each one.

Parameters:
ag - the array group
Returns:
a set of contingency features

setField

public void setField(java.lang.String field)
Description copied from interface: FeatureClusterer
Sets the field from which features should be drawn.

Specified by:
setField in interface FeatureClusterer
Parameters:
field - the name of a vectored field upon which the clustering should be based. A value of null indicates that all vectored fields should be considered, while an empty string indicates that data in no explicit field should be considered.
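
The three cases for the field argument can be illustrated with a small sketch. It is not Minion code: the document layout (a map from field name to terms, with unfielded terms stored under the empty string) and the `termsFor` helper are hypothetical. The sketch also assumes that null selects the named fields only; whether the real implementation additionally includes unfielded data in that case is not stated here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldSelectionSketch {
    // Hypothetical layout: field name -> terms; terms in no explicit field live under "".
    static List<String> termsFor(Map<String, List<String>> doc, String field) {
        List<String> out = new ArrayList<>();
        if (field == null) {
            // null: consider all named fields (assumption: unfielded data is excluded).
            for (Map.Entry<String, List<String>> e : doc.entrySet()) {
                if (!e.getKey().isEmpty()) {
                    out.addAll(e.getValue());
                }
            }
        } else {
            // "": only data in no explicit field; otherwise: only the named field.
            out.addAll(doc.getOrDefault(field, List.of()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        doc.put("title", List.of("minion"));
        doc.put("body", List.of("search", "engine"));
        doc.put("", List.of("unfielded"));

        System.out.println(termsFor(doc, null));    // [minion, search, engine]
        System.out.println(termsFor(doc, ""));      // [unfielded]
        System.out.println(termsFor(doc, "title")); // [minion]
    }
}
```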