com.sun.labs.minion.classification
Class QueryZone

java.lang.Object
  extended by com.sun.labs.minion.classification.QueryZone

public class QueryZone
extends java.lang.Object

A query zone is the set of documents centered on a set of feature clusters. The features in the clusters are combined into a single large query, and the results are returned in score order. The results will contain a mix of documents from inside and outside the training set.
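
The idea can be illustrated with a minimal sketch. It does not use the Minion classes: `QueryZoneConceptSketch`, `Hit`, `zone`, and `sampleZone` are hypothetical names, and scoring a document by summing the weights of the features it contains is an assumption made only for illustration, not the weighting function this class applies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class QueryZoneConceptSketch {

    /** Hypothetical result record: document key, score, and training-set membership. */
    static final class Hit {
        final String key;
        final float score;
        final boolean inTraining;

        Hit(String key, float score, boolean inTraining) {
            this.key = key;
            this.score = score;
            this.inTraining = inTraining;
        }
    }

    /**
     * Treats all weighted features as one large disjunctive query and returns
     * the matching documents in descending score order.
     * Assumption: a document's score is the sum of the weights of its matching features.
     */
    static List<Hit> zone(Map<String, Set<String>> docs,
                          Map<String, Float> featureWeights,
                          Set<String> training) {
        List<Hit> hits = new ArrayList<>();
        for (Map.Entry<String, Set<String>> doc : docs.entrySet()) {
            float score = 0f;
            for (String term : doc.getValue()) {
                Float w = featureWeights.get(term);
                if (w != null) {
                    score += w;
                }
            }
            if (score > 0f) {
                // Keep every matching document, whether or not it was used for training.
                hits.add(new Hit(doc.getKey(), score, training.contains(doc.getKey())));
            }
        }
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        return hits;
    }

    /** Builds a small made-up collection and computes its zone. */
    static List<Hit> sampleZone() {
        Map<String, Set<String>> docs = new LinkedHashMap<>();
        docs.put("d1", new HashSet<>(Arrays.asList("java", "heap")));
        docs.put("d2", new HashSet<>(Arrays.asList("heap")));
        docs.put("d3", new HashSet<>(Arrays.asList("cooking")));
        docs.put("d4", new HashSet<>(Arrays.asList("java")));
        Map<String, Float> weights = new HashMap<>();
        weights.put("java", 2f);
        weights.put("heap", 1f);
        Set<String> training = new HashSet<>(Arrays.asList("d1"));
        return zone(docs, weights, training);
    }

    public static void main(String[] args) {
        for (Hit h : sampleZone()) {
            System.out.println(h.key + " " + h.score + " inTraining=" + h.inTraining);
        }
    }
}
```

In this made-up collection the zone holds d1, d4, and d2 in that order. Only d1 is a training document, which shows how documents from outside the training set end up in the same ranked list.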


Field Summary
protected  java.lang.String fromField
          The field we're building classifiers from.
protected  java.util.PriorityQueue<com.sun.labs.minion.classification.QueryZone.HE> h
          The contents of the query zone.
protected  java.util.List<com.sun.labs.minion.classification.QueryZone.HE> heapElements
          A list of all the heap elements.
protected  PartitionManager manager
          The partition manager for the documents in the collection.
protected  java.util.Map<DiskPartition,TermCache> termCaches
          The term caches to use when building classifiers, keyed by disk partition.
protected  ResultSetImpl training
          The training set for the class under construction.
protected  WeightingComponents wc
          The weighting components to use.
protected  WeightingFunction wf
          The weighting function to use.
protected  FeatureClusterSet wfClusters
          The features that we will use for our model.
 
Constructor Summary
QueryZone(ResultSetImpl training, java.lang.String fromField, FeatureClusterSet featureClusters, java.util.Map<java.lang.String,TermStatsImpl> termStats, java.util.Map<DiskPartition,TermCache> termCaches, PartitionManager manager)
           
 
Method Summary
 java.util.PriorityQueue<com.sun.labs.minion.classification.QueryZone.HE> computeZone()
           
 java.util.PriorityQueue<com.sun.labs.minion.classification.QueryZone.HE> getHeap()
          Gets a heap that represents the members of the query zone.
 FeatureClusterSet getWeightedClusters()
          Returns the feature clusters used to determine this query zone, with collection-level weights assigned to them.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

fromField

protected java.lang.String fromField
The field we're building classifiers from.


termCaches

protected java.util.Map<DiskPartition,TermCache> termCaches
The term caches to use when building classifiers, keyed by disk partition.


wfClusters

protected FeatureClusterSet wfClusters
The features that we will use for our model.


manager

protected PartitionManager manager
The partition manager for the documents in the collection.


training

protected ResultSetImpl training
The training set for the class under construction.


wf

protected WeightingFunction wf
The weighting function to use.


wc

protected WeightingComponents wc
The weighting components to use.


h

protected java.util.PriorityQueue<com.sun.labs.minion.classification.QueryZone.HE> h
The contents of the query zone.


heapElements

protected java.util.List<com.sun.labs.minion.classification.QueryZone.HE> heapElements
A list of all the heap elements.

Constructor Detail

QueryZone

public QueryZone(ResultSetImpl training,
                 java.lang.String fromField,
                 FeatureClusterSet featureClusters,
                 java.util.Map<java.lang.String,TermStatsImpl> termStats,
                 java.util.Map<DiskPartition,TermCache> termCaches,
                 PartitionManager manager)
Method Detail

computeZone

public java.util.PriorityQueue<com.sun.labs.minion.classification.QueryZone.HE> computeZone()

getHeap

public java.util.PriorityQueue<com.sun.labs.minion.classification.QueryZone.HE> getHeap()
Gets a heap that represents the members of the query zone. Each element of the heap contains a big query and an array group that together represent all the results from one partition. The heap elements provide methods for accessing the current (top) document from the partition that the element represents.

Returns:
the heap of all documents in the query zone
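
A heap with one element per partition is the usual shape of a k-way merge, and the sketch below shows how such a heap might be drained in score order. It does not use `QueryZone.HE`, whose methods are not documented here: `ZoneHeapSketch`, `PartitionCursor`, `hasCurrent`, `currentScore`, `advance`, and `drain` are hypothetical names, and the sketch assumes each partition's results are already sorted by descending score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class ZoneHeapSketch {

    /** Hypothetical stand-in for a heap element: one partition's scores, best first. */
    static final class PartitionCursor {
        private final float[] scores;
        private int pos = 0;

        PartitionCursor(float... scores) {
            this.scores = scores;
        }

        boolean hasCurrent() {
            return pos < scores.length;
        }

        /** Score of the current (top) document for this partition. */
        float currentScore() {
            return scores[pos];
        }

        void advance() {
            pos++;
        }
    }

    /** Drains the heap, yielding scores from all partitions in descending order. */
    static List<Float> drain(List<PartitionCursor> cursors) {
        // The cursor whose current document scores highest sits on top of the heap.
        PriorityQueue<PartitionCursor> heap = new PriorityQueue<>(
                (a, b) -> Float.compare(b.currentScore(), a.currentScore()));
        for (PartitionCursor c : cursors) {
            if (c.hasCurrent()) {
                heap.add(c);
            }
        }
        List<Float> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            // Remove the cursor before advancing it, so the heap never holds
            // an element whose ordering key has changed underneath it.
            PartitionCursor top = heap.poll();
            merged.add(top.currentScore());
            top.advance();
            if (top.hasCurrent()) {
                heap.add(top);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Float> merged = drain(Arrays.asList(
                new PartitionCursor(0.9f, 0.4f),
                new PartitionCursor(0.7f, 0.6f, 0.1f)));
        System.out.println(merged); // prints [0.9, 0.7, 0.6, 0.4, 0.1]
    }
}
```

Polling the top element, advancing it, and re-inserting it keeps the heap ordered by each partition's current document without merging all partition results up front.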

getWeightedClusters

public FeatureClusterSet getWeightedClusters()
Returns the feature clusters used to determine this query zone, with collection-level weights assigned to them. The returned feature clusters are normalized only after computeZone has been called; before that call they are not. Because this method returns the internal set, make a copy of the feature clusters if they will be modified.

Returns:
the internal set of feature clusters
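
The following sketch illustrates why a copy matters when a getter hands back internal state that a later call changes in place. It does not use `FeatureClusterSet`: `WeightedClustersSketch`, `Cluster`, `Zone`, and `deepCopy` are hypothetical stand-ins, and scaling the weights to unit Euclidean length is an assumed form of normalization chosen only for the example.

```java
import java.util.ArrayList;
import java.util.List;

class WeightedClustersSketch {

    /** Hypothetical stand-in for a feature cluster: a name and a mutable weight. */
    static final class Cluster {
        final String name;
        float weight;

        Cluster(String name, float weight) {
            this.name = name;
            this.weight = weight;
        }

        Cluster copy() {
            return new Cluster(name, weight);
        }
    }

    /** Hypothetical stand-in for the zone: exposes its internal cluster list. */
    static final class Zone {
        private final List<Cluster> clusters = new ArrayList<>();

        Zone(Cluster... initial) {
            for (Cluster c : initial) {
                clusters.add(c);
            }
        }

        /** Returns the internal list itself, not a copy. */
        List<Cluster> getWeightedClusters() {
            return clusters;
        }

        /** Assumption: normalizes the weights in place to unit Euclidean length. */
        void computeZone() {
            double sumSq = 0.0;
            for (Cluster c : clusters) {
                sumSq += (double) c.weight * c.weight;
            }
            float norm = (float) Math.sqrt(sumSq);
            if (norm > 0f) {
                for (Cluster c : clusters) {
                    c.weight /= norm;
                }
            }
        }
    }

    /** Copies each cluster so later in-place changes do not leak either way. */
    static List<Cluster> deepCopy(List<Cluster> source) {
        List<Cluster> copy = new ArrayList<>(source.size());
        for (Cluster c : source) {
            copy.add(c.copy());
        }
        return copy;
    }

    public static void main(String[] args) {
        Zone zone = new Zone(new Cluster("a", 3f), new Cluster("b", 4f));
        List<Cluster> snapshot = deepCopy(zone.getWeightedClusters());
        zone.computeZone();
        // The internal clusters are now normalized; the snapshot keeps the raw weights.
        System.out.println(zone.getWeightedClusters().get(0).weight);
        System.out.println(snapshot.get(0).weight);
    }
}
```

A caller that keeps only the returned reference sees the weights change once the zone is computed, and any edit it makes is visible to the zone. The deep copy isolates both sides.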