com.sun.labs.minion.classification
Class WeightedFeatureVector

java.lang.Object
  extended by com.sun.labs.minion.classification.WeightedFeatureVector

public class WeightedFeatureVector
extends java.lang.Object

Holds a weighted feature vector. A vector may contain features drawn from a single index partition or from several index partitions. The class does not record which case applies, so callers must keep track of how each vector is used.


Field Summary
protected static java.lang.String logTag
           
protected  int nFeat
          The number of features in our vector.
protected  boolean normalized
          Whether we've been normalized.
protected  FeatureCluster[] v
          An array to hold the feature clusters that make up our vector.
 
Constructor Summary
WeightedFeatureVector(java.util.Collection<FeatureCluster> c)
          Creates a feature vector from a collection of feature clusters.
WeightedFeatureVector(FeatureClusterSet fcs, float[] weights)
          Creates a feature vector from a collection of feature clusters and some associated weights.
WeightedFeatureVector(int nFeat)
          Creates a feature vector that can hold a certain number of features.
WeightedFeatureVector(WeightedFeatureVector fv)
          Creates a feature vector that's a copy of the given vector.
 
Method Summary
 void add(FeatureCluster f)
          Adds a feature to this vector.
 WeightedFeatureVector add(WeightedFeatureVector fv)
          Adds a feature vector to this vector, returning a new vector.
 WeightedFeatureVector add(WeightedFeatureVector fv, float fac1, float fac2, boolean dropNegative)
          Adds a feature vector to this vector, returning a new vector.
 float dot(WeightedFeatureVector fv)
          Calculate the dot product of this feature vector and another feature vector.
 FeatureCluster getCluster(java.lang.String name)
           
static WeightedFeatureVector getCrossPartition(java.util.List vecs, FeatureClusterSet clusters)
          Given a number of partition-specific feature vectors, generate a new, cross-partition feature vector.
 FeatureClusterSet getSet()
          Gets a sorted set of features.
 WeightedFeature[] getWeightedFeatures()
          Gets an array of weighted features from this WeightedFeatureVector.
 float length()
          Gets the Euclidean length of this vector.
 WeightedFeatureVector mult(float s)
          Multiplies a feature vector by a scalar, producing a new vector.
 void normalize()
          Normalizes the length of this vector to 1.
 int size()
          Gets the size of this vector, which is the number of features it contains.
 WeightedFeatureVector sub(WeightedFeatureVector fv)
          Subtracts a feature vector from this vector, returning a new vector.
 WeightedFeatureVector sub(WeightedFeatureVector fv, boolean dropNegative)
          Subtracts a feature vector from this vector, returning a new vector.
 WeightedFeatureVector sub(WeightedFeatureVector fv, float fac1, float fac2, boolean dropNegative)
          Subtracts a feature vector from this vector, returning a new vector.
 java.lang.String toString()
           
 java.lang.String toString(int t)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

v

protected FeatureCluster[] v
An array to hold the feature clusters that make up our vector. This array must be ordered by cluster name!


nFeat

protected int nFeat
The number of features in our vector.


normalized

protected boolean normalized
Whether we've been normalized.


logTag

protected static java.lang.String logTag
Constructor Detail

WeightedFeatureVector

public WeightedFeatureVector(java.util.Collection<FeatureCluster> c)
Creates a feature vector from a collection of feature clusters.

Parameters:
c - a collection of feature clusters from which we should make a feature vector.

WeightedFeatureVector

public WeightedFeatureVector(FeatureClusterSet fcs,
                             float[] weights)
Creates a feature vector from a collection of feature clusters and some associated weights.

Parameters:
fcs - a collection of feature clusters
weights - weights associated with the clusters

WeightedFeatureVector

public WeightedFeatureVector(int nFeat)
Creates a feature vector that can hold a certain number of features.

Parameters:
nFeat - the initial number of features

WeightedFeatureVector

public WeightedFeatureVector(WeightedFeatureVector fv)
Creates a feature vector that's a copy of the given vector.

Parameters:
fv - the vector that we want to copy
Method Detail

size

public int size()
Gets the size of this vector, which is the number of features it contains.

Returns:
the number of features in this vector

add

public void add(FeatureCluster f)
Adds a feature to this vector. This method assumes that the feature has not already been added, so add each feature only once. Features should also be added in name order, which should be the same as ID order.

Parameters:
f - the feature cluster

getCluster

public FeatureCluster getCluster(java.lang.String name)

add

public WeightedFeatureVector add(WeightedFeatureVector fv,
                                 float fac1,
                                 float fac2,
                                 boolean dropNegative)
Adds a feature vector to this vector, returning a new vector.

Parameters:
fv - the vector to add to this one.
fac1 - a factor to apply to the weights in this vector
fac2 - a factor to apply to the weights in the other vector
dropNegative - if true, features with a negative weight are left out of the resulting vector
Returns:
the vector representing the sum of the two vectors
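
The effect of fac1, fac2 and dropNegative can be illustrated with a minimal sketch. It is not the Minion implementation: the Entry class and the WeightedSumSketch name are hypothetical stand-ins, and the sketch assumes that both inputs are sorted by name, as the v field requires, and that the result is fac1 times this vector plus fac2 times the other.

```java
import java.util.ArrayList;
import java.util.List;

final class WeightedSumSketch {
    // Hypothetical named, weighted entry; not the Minion FeatureCluster API.
    static final class Entry {
        final String name;
        final float weight;

        Entry(String name, float weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    // Computes fac1 * a + fac2 * b for two arrays that are sorted by name.
    static List<Entry> add(Entry[] a, Entry[] b,
                           float fac1, float fac2, boolean dropNegative) {
        List<Entry> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length || j < b.length) {
            // cmp < 0: take from a; cmp > 0: take from b; cmp == 0: same name in both.
            int cmp = (i >= a.length) ? 1
                    : (j >= b.length) ? -1
                    : a[i].name.compareTo(b[j].name);
            String name;
            float w;
            if (cmp < 0) {
                name = a[i].name;
                w = fac1 * a[i].weight;
                i++;
            } else if (cmp > 0) {
                name = b[j].name;
                w = fac2 * b[j].weight;
                j++;
            } else {
                name = a[i].name;
                w = fac1 * a[i].weight + fac2 * b[j].weight;
                i++;
                j++;
            }
            // Assumption: "negative" means strictly less than zero.
            if (dropNegative && w < 0) {
                continue;
            }
            out.add(new Entry(name, w));
        }
        return out;
    }
}
```

Because both inputs are consumed in name order, the output of this sketch is also in name order, so it could be fed to a further merge without re-sorting.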

add

public WeightedFeatureVector add(WeightedFeatureVector fv)
Adds a feature vector to this vector, returning a new vector.

Parameters:
fv - the vector to add to this one.
Returns:
a vector representing the sum of this vector and the provided vector

sub

public WeightedFeatureVector sub(WeightedFeatureVector fv,
                                 float fac1,
                                 float fac2,
                                 boolean dropNegative)
Subtracts a feature vector from this vector, returning a new vector.

Parameters:
fv - the vector to subtract from this one.
fac1 - a factor to apply to the weights in this vector
fac2 - a factor to apply to the weights in the other vector
dropNegative - if true, features with a negative weight are left out of the resulting vector
Returns:
a vector representing the difference of this vector and the provided vector

sub

public WeightedFeatureVector sub(WeightedFeatureVector fv)
Subtracts a feature vector from this vector, returning a new vector.

Parameters:
fv - the vector to subtract from this one.
Returns:
a vector representing the difference of this vector and the provided vector

sub

public WeightedFeatureVector sub(WeightedFeatureVector fv,
                                 boolean dropNegative)
Subtracts a feature vector from this vector, returning a new vector.

Parameters:
fv - the vector to subtract from this one.
dropNegative - if true, features with a negative weight are left out of the resulting vector
Returns:
a vector representing the difference of this vector and the provided vector

mult

public WeightedFeatureVector mult(float s)
Multiplies a feature vector by a scalar, producing a new vector.

Parameters:
s - a scalar that we want to multiply the weights by
Returns:
a vector representing the current vector multiplied by the provided scalar

dot

public float dot(WeightedFeatureVector fv)
Calculate the dot product of this feature vector and another feature vector.

Parameters:
fv - another weighted feature vector
Returns:
the dot product of the two vectors (i.e. the sum of the products of the components in each dimension)
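
For sparse vectors kept in name order, the dot product only needs the dimensions present in both vectors. The following sketch shows one way to compute it with a single merge pass. It is an illustration under assumptions, not the library's code: the SparseDotSketch class and its parallel-array representation are hypothetical.

```java
final class SparseDotSketch {
    // Dot product of two sparse vectors stored as parallel arrays.
    // Assumption: each names array is sorted and has no duplicates.
    static float dot(String[] namesA, float[] weightsA,
                     String[] namesB, float[] weightsB) {
        float sum = 0f;
        int i = 0, j = 0;
        while (i < namesA.length && j < namesB.length) {
            int cmp = namesA[i].compareTo(namesB[j]);
            if (cmp < 0) {
                i++;            // dimension only in A contributes nothing
            } else if (cmp > 0) {
                j++;            // dimension only in B contributes nothing
            } else {
                sum += weightsA[i] * weightsB[j];
                i++;
                j++;
            }
        }
        return sum;
    }
}
```

When both vectors have unit length, this value is the cosine of the angle between them.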

getWeightedFeatures

public WeightedFeature[] getWeightedFeatures()
Gets an array of weighted features from this WeightedFeatureVector. The vector is flattened: the resulting array has one entry for each feature in each cluster, and the weight of each feature is the weight of the cluster it came from. The array is sorted according to the ordering of WeightedFeature.

Returns:
an array of the weighted features from this vector of clusters
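
The flattening described above can be sketched as follows. The Cluster and Feature classes are hypothetical stand-ins for FeatureCluster and WeightedFeature, and sorting by feature name is an assumption, since the ordering defined by WeightedFeature is not stated here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class FlattenSketch {
    // Hypothetical cluster: one weight shared by all member feature names.
    static final class Cluster {
        final float weight;
        final List<String> members;

        Cluster(float weight, List<String> members) {
            this.weight = weight;
            this.members = members;
        }
    }

    // Hypothetical flattened entry: a single feature with its own weight.
    static final class Feature {
        final String name;
        final float weight;

        Feature(String name, float weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    // Emits one entry per member feature, each carrying its cluster's weight.
    static List<Feature> flatten(List<Cluster> clusters) {
        List<Feature> out = new ArrayList<>();
        for (Cluster c : clusters) {
            for (String name : c.members) {
                out.add(new Feature(name, c.weight));
            }
        }
        // Assumption: the result is ordered by feature name.
        out.sort(Comparator.comparing((Feature f) -> f.name));
        return out;
    }
}
```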

normalize

public void normalize()
Normalizes the length of this vector to 1.


length

public float length()
Gets the Euclidean length of this vector.

Returns:
the length of this vector
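
As an illustration of length() and normalize(), the sketch below computes the Euclidean length as the square root of the sum of squared weights and divides each weight by that length. The NormalizeSketch class and its plain float[] representation are hypothetical, and leaving a zero vector unchanged is an assumption about a case the description does not cover.

```java
final class NormalizeSketch {
    // Euclidean length: square root of the sum of squared weights.
    static float length(float[] weights) {
        double sumSq = 0.0;
        for (float w : weights) {
            sumSq += (double) w * w;
        }
        return (float) Math.sqrt(sumSq);
    }

    // Scales the weights in place so that the vector has length 1.
    // Assumption: a zero vector is left unchanged to avoid dividing by zero.
    static void normalize(float[] weights) {
        float len = length(weights);
        if (len == 0f) {
            return;
        }
        for (int i = 0; i < weights.length; i++) {
            weights[i] /= len;
        }
    }
}
```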

getSet

public FeatureClusterSet getSet()
Gets a sorted set of features.

Returns:
a set of the features in this vector, sorted by name

getCrossPartition

public static WeightedFeatureVector getCrossPartition(java.util.List vecs,
                                                      FeatureClusterSet clusters)
Given a number of partition-specific feature vectors, generate a new, cross-partition feature vector. The resulting vector is a vector of feature clusters, where the features in the input vectors have been combined back into the clusters they originally came from and the cluster weights are the sums of the weights of the component features from the vectors.

Note that the same result could be obtained using straightforward addition, but this method is more efficient in space and time.

Parameters:
vecs - a list of weighted feature vectors from a number of partitions
clusters - the clusters that we want to make cross partition
Returns:
a cross-partition feature vector
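
The combination step described above can be sketched with plain maps. Everything in this example is hypothetical: the CrossPartitionSketch class, the use of a map from feature name to weight for each partition vector, and the map from cluster name to member feature names all stand in for the real WeightedFeatureVector and FeatureClusterSet types. Skipping features that belong to none of the requested clusters is also an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CrossPartitionSketch {
    // Sums per-partition feature weights into the clusters the features belong to.
    // The result is keyed by cluster name, in sorted order.
    static TreeMap<String, Float> combine(List<Map<String, Float>> partitionVectors,
                                          Map<String, List<String>> clusterMembers) {
        // Reverse index: feature name -> name of the cluster that contains it.
        Map<String, String> clusterOf = new HashMap<>();
        for (Map.Entry<String, List<String>> e : clusterMembers.entrySet()) {
            for (String feature : e.getValue()) {
                clusterOf.put(feature, e.getKey());
            }
        }

        TreeMap<String, Float> out = new TreeMap<>();
        for (Map<String, Float> vec : partitionVectors) {
            for (Map.Entry<String, Float> e : vec.entrySet()) {
                String cluster = clusterOf.get(e.getKey());
                if (cluster == null) {
                    continue; // assumption: ignore features outside the requested clusters
                }
                out.merge(cluster, e.getValue(), Float::sum);
            }
        }
        return out;
    }
}
```

Accumulating directly into one map in this way avoids building an intermediate vector for every pairwise addition.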

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

toString

public java.lang.String toString(int t)