public interface DocumentVector
An interface defining the behavior of document vectors. These are the
basis for all classification, clustering, and profiling activity. A
document vector can be obtained from a search result using the Result.getDocumentVector()
method.
The name is a bit misleading: an instance of this class can be used to represent a set of documents as easily as it can a single document.
Method Summary | | |
---|---|---|
DocumentVector | copy() | Creates a copy of the current document vector and returns it. |
boolean | equals(java.lang.Object o) | Determines if two document vectors are equal. |
ResultSet | findSimilar() | Finds documents that are similar to this one. |
ResultSet | findSimilar(java.lang.String sortOrder) | Finds documents that are similar to this one. |
ResultSet | findSimilar(java.lang.String sortOrder, double skimPercent) | Finds documents that are similar to this one. |
java.lang.String | getKey() | Gets the key for the document associated with this vector. |
float | getSimilarity(DocumentVector vector) | Computes the similarity between this document vector and the supplied vector. |
java.util.Map<java.lang.String,java.lang.Float> | getSimilarityTerms(DocumentVector vector) | Gets a map from term names to weights, where each weight represents how much the term contributed to the similarity of the two documents. |
java.util.Set<java.lang.String> | getTerms() | Gets the set of terms in the document represented by this vector. |
java.util.Map<java.lang.String,java.lang.Float> | getTopWeightedTerms(int nTerms) | Gets the n terms that have the highest document weight in this document vector. |
void | setEngine(SearchEngine e) | Sets the search engine to use with this document vector. |
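
The relationship between getSimilarity and getSimilarityTerms can be illustrated with a small stand-alone sketch. It does not call the search engine API: ToyVector is a hypothetical class invented for illustration, and the similarity measure used here (cosine similarity, that is, a dot product of length-normalized term weights) is an assumption, since this page does not say how the engine computes similarity. Under that assumption, the per-term contributions sum to the overall similarity.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a document vector: a map from term to weight. */
public class ToyVector {
    private final Map<String, Float> weights;

    public ToyVector(Map<String, Float> weights) {
        this.weights = new HashMap<>(weights);
    }

    /** Euclidean length of the weight vector. */
    private double length() {
        double sum = 0.0;
        for (float w : weights.values()) {
            sum += (double) w * w;
        }
        return Math.sqrt(sum);
    }

    /**
     * Per-term contributions to the similarity. ASSUMPTION: each shared term
     * contributes the product of its two weights divided by the product of
     * the vector lengths (the cosine similarity decomposition).
     */
    public Map<String, Float> getSimilarityTerms(ToyVector other) {
        Map<String, Float> contributions = new LinkedHashMap<>();
        double norm = length() * other.length();
        if (norm == 0.0) {
            return contributions; // an empty vector shares nothing
        }
        for (Map.Entry<String, Float> e : weights.entrySet()) {
            Float otherWeight = other.weights.get(e.getKey());
            if (otherWeight != null) {
                contributions.put(e.getKey(),
                        (float) (e.getValue() * otherWeight / norm));
            }
        }
        return contributions;
    }

    /** Overall similarity: the sum of the per-term contributions. */
    public float getSimilarity(ToyVector other) {
        float total = 0f;
        for (float c : getSimilarityTerms(other).values()) {
            total += c;
        }
        return total;
    }
}
```

In this sketch, two vectors that share no terms have a similarity of 0, and only the shared terms appear in the map returned by getSimilarityTerms.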
Method Detail |
---|
DocumentVector copy()
void setEngine(SearchEngine e)
e
- the engine
boolean equals(java.lang.Object o)
Overrides: equals in class java.lang.Object
o
- the document vector to which this vector is compared
float getSimilarity(DocumentVector vector)
vector
- the vector representing the document to compare this vector to
ResultSet findSimilar()
ResultSet findSimilar(java.lang.String sortOrder)
sortOrder
- a string describing the order in which to sort the results
ResultSet findSimilar(java.lang.String sortOrder, double skimPercent)
sortOrder
- a string describing the order in which to sort the results
skimPercent
- a number between 0 and 1 giving the fraction of the features that should be used to perform findSimilar
java.lang.String getKey()
java.util.Set<java.lang.String> getTerms()
java.util.Map<java.lang.String,java.lang.Float> getTopWeightedTerms(int nTerms)
nTerms
- the number of terms to return
java.util.Map<java.lang.String,java.lang.Float> getSimilarityTerms(DocumentVector vector)
vector
- the document vector to compare this one to
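
The skimPercent parameter of findSimilar limits how many of a vector's features take part in the comparison. The sketch below shows one way such a cut could work. It is not the engine's implementation: SkimSketch and its skim method are hypothetical names, and both the ranking by weight and the rounding up are assumptions, since this page only says that skimPercent is a number between 0 and 1.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of trimming a term-weight map to a fraction of its terms. */
public class SkimSketch {

    /**
     * Keeps the heaviest fraction of the terms. ASSUMPTION: features are
     * ranked by weight and the number kept is rounded up.
     */
    public static Map<String, Float> skim(Map<String, Float> weights, double skimPercent) {
        // Written this way so that NaN is rejected as well.
        if (!(skimPercent >= 0.0 && skimPercent <= 1.0)) {
            throw new IllegalArgumentException("skimPercent must be between 0 and 1");
        }
        int keep = (int) Math.ceil(weights.size() * skimPercent);

        // Sort the terms from highest weight to lowest.
        List<Map.Entry<String, Float>> entries = new ArrayList<>(weights.entrySet());
        entries.sort(Map.Entry.<String, Float>comparingByValue().reversed());

        // Copy the first 'keep' entries, preserving the sorted order.
        Map<String, Float> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Float> e : entries.subList(0, keep)) {
            kept.put(e.getKey(), e.getValue());
        }
        return kept;
    }
}
```

With four terms and a skimPercent of 0.5, this sketch keeps the two highest-weighted terms; a value of 1 keeps every term.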