public interface DocumentVector
An interface defining the behavior of document vectors. These are the
basis for all classification, clustering, and profiling activity. A
document vector can be obtained from a search result using the Result.getDocumentVector() method.
The name is a bit misleading: an implementation of this interface can represent a set of documents just as easily as a single document.
| Method Summary | | |
|---|---|---|
| DocumentVector | copy() | Creates a copy of the current document vector and returns it. |
| boolean | equals(java.lang.Object o) | Determines if two document vectors are equal. |
| ResultSet | findSimilar() | Finds documents that are similar to this one. |
| ResultSet | findSimilar(java.lang.String sortOrder) | Finds documents that are similar to this one, sorted according to sortOrder. |
| ResultSet | findSimilar(java.lang.String sortOrder, double skimPercent) | Finds documents that are similar to this one, sorted according to sortOrder and using only the given fraction of the features. |
| java.lang.String | getKey() | Gets the key for the document associated with this vector. |
| float | getSimilarity(DocumentVector vector) | Computes the similarity between this document vector and the supplied vector. |
| java.util.Map<java.lang.String,java.lang.Float> | getSimilarityTerms(DocumentVector vector) | Gets a map from term names to weights, where each weight represents how much that term contributed to the similarity of the two documents. |
| java.util.Set<java.lang.String> | getTerms() | Gets the set of terms in the document represented by this vector. |
| java.util.Map<java.lang.String,java.lang.Float> | getTopWeightedTerms(int nTerms) | Gets the nTerms terms that have the highest weight in this document vector. |
| void | setEngine(SearchEngine e) | Sets the search engine to use with this document vector. |
| Method Detail |
|---|
DocumentVector copy()
void setEngine(SearchEngine e)
e - the engine
boolean equals(java.lang.Object o)
Overrides: equals in class java.lang.Object
o - the document vector to which this vector is compared
float getSimilarity(DocumentVector vector)
vector - the vector representing the document to compare this vector to
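This page does not say which similarity measure an implementation uses. As an illustration only, the sketch below assumes cosine similarity over term weights, with each vector modelled as a plain `Map` from term to weight. The class `SimilaritySketch` and its `similarity` method are hypothetical stand-ins and are not part of this API.

```java
import java.util.Map;

/** Hypothetical stand-in, not part of the documented API. */
final class SimilaritySketch {

    /**
     * Cosine similarity between two term-weight maps (an assumed measure;
     * the interface documentation does not specify one).
     */
    static float similarity(Map<String, Float> a, Map<String, Float> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Float> e : a.entrySet()) {
            float w = e.getValue();
            normA += w * w;
            Float other = b.get(e.getKey());
            if (other != null) {
                dot += w * other;   // only shared terms contribute
            }
        }
        for (float w : b.values()) {
            normB += w * w;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0f;            // an empty vector is similar to nothing
        }
        return (float) (dot / (Math.sqrt(normA) * Math.sqrt(normB)));
    }

    public static void main(String[] args) {
        Map<String, Float> d1 = Map.of("java", 2.0f, "search", 1.0f);
        Map<String, Float> d2 = Map.of("java", 1.0f, "index", 3.0f);
        System.out.println(similarity(d1, d2));
    }
}
```

Under this assumption, two vectors with no terms in common score 0 and a vector compared with itself scores 1.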
ResultSet findSimilar()
ResultSet findSimilar(java.lang.String sortOrder)
sortOrder - a string describing the order in which to sort the results
ResultSet findSimilar(java.lang.String sortOrder, double skimPercent)
sortOrder - a string describing the order in which to sort the results
skimPercent - a number between 0 and 1 representing the fraction of the features that should be used to perform findSimilar
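How an implementation picks the features to keep is not described here. The sketch below shows one plausible reading: validate that the fraction lies in [0, 1], then keep that share of the terms, highest weight first. `SkimSketch`, its `skim` method, the choice of highest-weighted terms, and the rounding up of the count are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in, not part of the documented API. */
final class SkimSketch {

    /**
     * Keeps the given fraction of terms, highest weight first.
     * Assumption: "skimming" means dropping the lowest-weighted features.
     */
    static Map<String, Float> skim(Map<String, Float> weights, double skimPercent) {
        if (skimPercent < 0.0 || skimPercent > 1.0) {
            throw new IllegalArgumentException("skimPercent must be between 0 and 1");
        }
        // Round up so that any non-zero fraction keeps at least one term.
        int keep = (int) Math.ceil(skimPercent * weights.size());

        List<Map.Entry<String, Float>> entries = new ArrayList<>(weights.entrySet());
        entries.sort((x, y) -> Float.compare(y.getValue(), x.getValue())); // descending

        Map<String, Float> skimmed = new LinkedHashMap<>();
        for (Map.Entry<String, Float> e : entries.subList(0, keep)) {
            skimmed.put(e.getKey(), e.getValue());
        }
        return skimmed;
    }

    public static void main(String[] args) {
        Map<String, Float> weights =
                Map.of("java", 4.0f, "search", 3.0f, "index", 2.0f, "the", 1.0f);
        System.out.println(skim(weights, 0.5).keySet());
    }
}
```

A smaller fraction trades recall for speed in a scheme like this, since fewer terms take part in the similarity query.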
java.lang.String getKey()
java.util.Set<java.lang.String> getTerms()
java.util.Map<java.lang.String,java.lang.Float> getTopWeightedTerms(int nTerms)
nTerms - the number of terms to return
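To make the selection concrete, here is a minimal sketch of picking the highest-weighted terms from a term-weight map. `TopTermsSketch` and `topWeightedTerms` are hypothetical names, and returning the entries in descending weight order is an assumption: this page only states that the result is a `Map`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in, not part of the documented API. */
final class TopTermsSketch {

    /** Returns up to nTerms entries with the largest weights, heaviest first. */
    static Map<String, Float> topWeightedTerms(Map<String, Float> weights, int nTerms) {
        List<Map.Entry<String, Float>> entries = new ArrayList<>(weights.entrySet());
        entries.sort(Map.Entry.<String, Float>comparingByValue().reversed());

        // Clamp the count so negative or oversized requests are handled safely.
        int limit = Math.min(Math.max(nTerms, 0), entries.size());

        // LinkedHashMap preserves the descending order for callers that iterate.
        Map<String, Float> top = new LinkedHashMap<>();
        for (Map.Entry<String, Float> e : entries.subList(0, limit)) {
            top.put(e.getKey(), e.getValue());
        }
        return top;
    }

    public static void main(String[] args) {
        Map<String, Float> weights = Map.of("java", 1.0f, "search", 3.0f, "index", 2.0f);
        System.out.println(topWeightedTerms(weights, 2).keySet());
    }
}
```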
java.util.Map<java.lang.String,java.lang.Float> getSimilarityTerms(DocumentVector vector)
vector - the document vector to compare this one to
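One way to read "the amount the term contributed" is as each shared term's share of a cosine similarity score. The sketch below follows that reading: every term present in both vectors maps to the product of its two weights divided by the product of the vector norms, so the contributions add up to the overall score. The cosine measure, the class `SimilarityTermsSketch`, and its method are assumptions for illustration, not the documented behavior.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in, not part of the documented API. */
final class SimilarityTermsSketch {

    /** Per-term contributions to an assumed cosine similarity. */
    static Map<String, Float> similarityTerms(Map<String, Float> a, Map<String, Float> b) {
        Map<String, Float> contributions = new HashMap<>();
        double norm = Math.sqrt(sumSquares(a)) * Math.sqrt(sumSquares(b));
        if (norm == 0.0) {
            return contributions;   // nothing can contribute to an empty vector
        }
        for (Map.Entry<String, Float> e : a.entrySet()) {
            Float other = b.get(e.getKey());
            if (other != null) {
                // Terms missing from either vector contribute nothing and are omitted.
                contributions.put(e.getKey(), (float) (e.getValue() * other / norm));
            }
        }
        return contributions;
    }

    private static double sumSquares(Map<String, Float> v) {
        double sum = 0.0;
        for (float w : v.values()) {
            sum += w * w;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Float> d1 = Map.of("java", 2.0f, "search", 1.0f);
        Map<String, Float> d2 = Map.of("java", 1.0f, "index", 3.0f);
        System.out.println(similarityTerms(d1, d2));
    }
}
```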