java.lang.Object
  com.sun.labs.minion.indexer.dictionary.FieldStore
      com.sun.labs.minion.indexer.dictionary.DiskFieldStore
public class DiskFieldStore
A field store that can be used for querying operations.
Field Summary | |
---|---|
protected static java.lang.String | logTag - The tag for this module. |
protected int | nDocs - The number of documents. |
protected DiskPartition | part - The partition this field store is associated with. |
Fields inherited from class com.sun.labs.minion.indexer.dictionary.FieldStore |
---|
header, metaFile, savedFields |
Constructor Summary | |
---|---|
DiskFieldStore(DiskPartition part, java.io.RandomAccessFile dictFile, java.io.RandomAccessFile[] postFiles, DictionaryFactory fieldStoreDictFactory, DictionaryFactory bigramDictFactory, MetaFile metaFile) - Reads the field store from the provided file. |
Method Summary | |
---|---|
void | close() - Closes the field store. |
double[] | euclideanDistance(double[] vec, java.lang.String field) - Computes the Euclidean distance between the given document and all documents. |
java.lang.Object | getDefaultSavedFieldData(FieldInfo fi) - Get the default value for a saved field. |
java.lang.Object | getDefaultSavedFieldData(java.lang.String name) - Get the default value for a saved field. |
BasicField.Fetcher | getFetcher(FieldInfo fi) |
BasicField.Fetcher | getFetcher(java.lang.String field) |
DictionaryIterator | getFieldIterator(java.lang.String name, boolean caseSensitive, java.lang.Object lowerBound, boolean includeLower, java.lang.Object upperBound, boolean includeUpper) - Gets an iterator for the values in a given range in a field. |
PostingsIterator | getFieldPostings(java.lang.String name, java.lang.Object value, boolean caseSensitive) - Gets the postings associated with a particular field value. |
java.util.Iterator | getFields(int docID) - Gets an iterator for the field values for a given document. |
FieldInfo.Type | getFieldType(java.lang.String name) - Gets the type of the named field, if it is a saved field. |
java.util.SortedSet<FieldValue> | getMatching(java.lang.String field, java.lang.String pattern) - Gets the values for the given field that match the given pattern. |
DictionaryIterator | getMatchingIterator(java.lang.String name, java.lang.String val, boolean caseSensitive, int maxEntries, long timeLimit) - Gets an iterator for the character saved field values that match a given wildcard pattern. |
SavedField | getSavedField(FieldInfo fi) - Gets a saved field from a field name. |
SavedField | getSavedField(java.lang.String name) - Gets a saved field from a field name. |
java.lang.Object | getSavedFieldData(FieldInfo fi, int docID, boolean all) |
java.lang.Object | getSavedFieldData(java.lang.String name, int docID, boolean all) - Gets saved data for a particular field. |
java.util.Map<java.lang.String,java.util.List> | getSavedFields(int docID) - Gets a map from saved field names to the saved field values for those fields. |
DictionaryIterator | getSubstringIterator(java.lang.String name, java.lang.String val, boolean caseSensitive, boolean starts, boolean ends, int maxEntries, long timeLimit) - Gets an iterator for the character saved field values that contain a given substring. |
protected SavedField | makeSavedField(FieldInfo fi, java.io.RandomAccessFile dictFile, java.io.RandomAccessFile[] postFiles, DictionaryFactory fieldStoreDictFactory, DictionaryFactory bigramDictFactory, DiskPartition part) - Makes a saved field instance of the appropriate type. |
void | merge(DiskFieldStore[] stores, int maxID, int[] starts, int[] nUndel, int[][] docIDMaps, java.io.RandomAccessFile dictFile, PostingsOutput postOut) - Merges a number of field stores into a single store. |
Methods inherited from class com.sun.labs.minion.indexer.dictionary.FieldStore |
---|
getFieldArray, getFieldID, getFieldInfo, getFieldInfo, getFieldName, getMultArray, getMultArray, getNFields, getVectoredFields, isSavedField |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected DiskPartition part
protected int nDocs
protected static java.lang.String logTag
Constructor Detail |
---|
public DiskFieldStore(DiskPartition part, java.io.RandomAccessFile dictFile, java.io.RandomAccessFile[] postFiles, DictionaryFactory fieldStoreDictFactory, DictionaryFactory bigramDictFactory, MetaFile metaFile) throws java.io.IOException
Parameters:
part - The partition that this field store is associated with.
dictFile - The file containing the dictionaries for the saved fields.
postFiles - The files containing the postings for the saved fields.
metaFile - The meta file to use to get field information.
Throws:
java.io.IOException - if there is an error during reading
Method Detail |
---|
protected SavedField makeSavedField(FieldInfo fi, java.io.RandomAccessFile dictFile, java.io.RandomAccessFile[] postFiles, DictionaryFactory fieldStoreDictFactory, DictionaryFactory bigramDictFactory, DiskPartition part) throws java.io.IOException
Throws:
java.io.IOException
public void close() throws java.io.IOException
Throws:
java.io.IOException
public SavedField getSavedField(java.lang.String name)
public SavedField getSavedField(FieldInfo fi)
public FieldInfo.Type getFieldType(java.lang.String name)
getFieldType in class FieldStore
Parameters:
name - The name of the field.
public java.lang.Object getSavedFieldData(java.lang.String name, int docID, boolean all)
Parameters:
name - The name of the field.
docID - The document whose field value we want.
all - If true, return all known values for the field in the given document. If false, return only one value.
Returns:
If all is true, then return a List of the values stored in the given field in the given document. The elements of the list will have a type that is appropriate to the type of the saved field. If all is false, a single value of the appropriate type will be returned. If the given name is not the name of a saved field, or the document ID is invalid, null will be returned.
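The effect of the all flag described above can be illustrated with a small in-memory stand-in. This is only a sketch and not Minion's implementation: the class SavedFieldDataSketch and its save method are hypothetical, valid document IDs are assumed to run from 1 to nDocs, and returning the first stored value when all is false is an assumption, since the documentation only promises "one value".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a field store; not part of Minion.
final class SavedFieldDataSketch {
    // Field name -> (document ID -> values saved for that document).
    private final Map<String, Map<Integer, List<Object>>> saved = new HashMap<>();
    private final int nDocs;

    SavedFieldDataSketch(int nDocs) {
        this.nDocs = nDocs;
    }

    // Hypothetical helper that records one value for a field in a document.
    void save(String name, int docID, Object value) {
        saved.computeIfAbsent(name, k -> new HashMap<>())
             .computeIfAbsent(docID, k -> new ArrayList<>())
             .add(value);
    }

    Object getSavedFieldData(String name, int docID, boolean all) {
        Map<Integer, List<Object>> field = saved.get(name);
        // Unknown field or invalid document ID: null, as documented.
        // Assumption: valid IDs are 1..nDocs.
        if (field == null || docID < 1 || docID > nDocs) {
            return null;
        }
        List<Object> values = field.getOrDefault(docID, Collections.emptyList());
        if (all) {
            // All known values, returned as a List (a copy, to protect the store).
            return new ArrayList<>(values);
        }
        // Assumption: "only one value" means the first one stored.
        return values.isEmpty() ? null : values.get(0);
    }

    public static void main(String[] args) {
        SavedFieldDataSketch store = new SavedFieldDataSketch(3);
        store.save("author", 1, "Smith");
        store.save("author", 1, "Jones");
        System.out.println(store.getSavedFieldData("author", 1, true));
        System.out.println(store.getSavedFieldData("author", 1, false));
        System.out.println(store.getSavedFieldData("title", 1, true));
    }
}
```

Because the declared return type is java.lang.Object, a caller has to check or cast the result according to the value of all that it passed in.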
public java.lang.Object getSavedFieldData(FieldInfo fi, int docID, boolean all)
public java.lang.Object getDefaultSavedFieldData(java.lang.String name)
public java.lang.Object getDefaultSavedFieldData(FieldInfo fi)
public double[] euclideanDistance(double[] vec, java.lang.String field)
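As a rough illustration of what euclideanDistance computes, the sketch below measures the distance from one query vector to each of a set of document vectors. How DiskFieldStore actually stores and looks up the vectors for a field is not documented here, so the dense double[][] layout and the class name EuclideanDistanceSketch are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of Minion.
final class EuclideanDistanceSketch {
    // Returns one distance per row of docVectors, in row order.
    static double[] euclideanDistance(double[] vec, double[][] docVectors) {
        double[] distances = new double[docVectors.length];
        for (int d = 0; d < docVectors.length; d++) {
            double[] doc = docVectors[d];
            if (doc.length != vec.length) {
                throw new IllegalArgumentException(
                        "Vector length mismatch for document " + d);
            }
            // Sum of squared component differences.
            double sum = 0.0;
            for (int i = 0; i < vec.length; i++) {
                double diff = vec[i] - doc[i];
                sum += diff * diff;
            }
            distances[d] = Math.sqrt(sum);
        }
        return distances;
    }

    public static void main(String[] args) {
        double[] query = {0.0, 0.0};
        double[][] docs = {{3.0, 4.0}, {0.0, 0.0}, {1.0, 0.0}};
        System.out.println(Arrays.toString(euclideanDistance(query, docs)));
    }
}
```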
public java.util.Iterator getFields(int docID)
public BasicField.Fetcher getFetcher(FieldInfo fi)
public BasicField.Fetcher getFetcher(java.lang.String field)
public java.util.SortedSet<FieldValue> getMatching(java.lang.String field, java.lang.String pattern)
Parameters:
field - the saved, string field against whose values we will match. If the named field is not saved or is not a string field, then null will be returned.
pattern - the pattern for which we'll find matching field values.
null will be returned.
public DictionaryIterator getFieldIterator(java.lang.String name, boolean caseSensitive, java.lang.Object lowerBound, boolean includeLower, java.lang.Object upperBound, boolean includeUpper)
Parameters:
name - The name of the field we need an iterator for.
caseSensitive - If true, case should be taken into account when iterating through the values. This value will only be observed for character fields!
lowerBound - The lower bound on the iterator. If null, only the upper bound is considered and the iteration will commence with the first term in the dictionary.
includeLower - If true, then the lower bound will be included in the terms returned by the iterator, if it occurs in the dictionary.
upperBound - The upper bound on the iterator. If null, only the lower bound is considered and the iteration will end at the last term in the dictionary.
includeUpper - If true, then the upper bound will be included in the terms returned by the iterator, if it occurs in the dictionary.
Returns:
null if there is no such range or the named field is not a saved field.
public DictionaryIterator getMatchingIterator(java.lang.String name, java.lang.String val, boolean caseSensitive, int maxEntries, long timeLimit)
Parameters:
name - The name of the field whose values we wish to match against.
val - The wildcard value against which we will match.
caseSensitive - If true, then case will be taken into account during the match.
maxEntries - The maximum number of entries to return. If zero or negative, return all possible entries.
timeLimit - The maximum amount of time (in milliseconds) to spend trying to find matches. If zero or negative, no time limit is imposed.
public DictionaryIterator getSubstringIterator(java.lang.String name, java.lang.String val, boolean caseSensitive, boolean starts, boolean ends, int maxEntries, long timeLimit)
Parameters:
name - The name of the field whose values we wish to match against.
val - The substring that we are looking for.
caseSensitive - If true, then case will be taken into account during the match.
starts - If true, the value must start with the given substring.
ends - If true, the value must end with the given substring.
maxEntries - The maximum number of entries to return. If zero or negative, return all possible entries.
timeLimit - The maximum amount of time (in milliseconds) to spend trying to find matches. If zero or negative, no time limit is imposed.
public PostingsIterator getFieldPostings(java.lang.String name, java.lang.Object value, boolean caseSensitive)
Parameters:
name - The name of the field for which we want postings.
value - The value from the field for which we want postings.
caseSensitive - If true, case should be taken into account when iterating through the values. This value will only be observed for character fields!
Returns:
null if there is no such value in the field.
public void merge(DiskFieldStore[] stores, int maxID, int[] starts, int[] nUndel, int[][] docIDMaps, java.io.RandomAccessFile dictFile, PostingsOutput postOut) throws java.io.IOException
Parameters:
stores - the field stores to merge.
maxID - The maximum document ID in the merged partition.
starts - The new starting document IDs for the partitions.
nUndel - The number of documents that are not deleted in each of the partitions.
dictFile - The file where the merged dictionaries will be written.
postOut - The output where the merged postings will be written.
docIDMaps - The maps from old to new document IDs.
Throws:
java.io.IOException - when there is an error writing the file.
public java.util.Map<java.lang.String,java.util.List> getSavedFields(int docID)
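The starts, nUndel, docIDMaps, and maxID arguments of merge describe how document IDs from several partitions are renumbered into one merged partition. The sketch below shows one plausible way such values could be derived from per-partition deletion flags. It is not Minion's code: the class MergeIdSketch, the Plan holder, 1-based document IDs, the use of -1 for deleted documents, and maps that hold final merged IDs are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration of document ID renumbering; not part of Minion.
final class MergeIdSketch {
    // Simple result holder; field names mirror the merge parameters.
    static final class Plan {
        int[] starts;
        int[] nUndel;
        int[][] docIDMaps;
        int maxID;
    }

    // deleted[p][i] is true when document (i + 1) of partition p is deleted.
    static Plan plan(boolean[][] deleted) {
        Plan plan = new Plan();
        int n = deleted.length;
        plan.starts = new int[n];
        plan.nUndel = new int[n];
        plan.docIDMaps = new int[n][];
        int next = 1; // Assumption: document IDs start at 1.
        for (int p = 0; p < n; p++) {
            plan.starts[p] = next;
            // Index 0 is unused so that an old ID indexes the map directly.
            int[] map = new int[deleted[p].length + 1];
            for (int i = 0; i < deleted[p].length; i++) {
                if (deleted[p][i]) {
                    map[i + 1] = -1; // Assumption: -1 marks a dropped document.
                } else {
                    map[i + 1] = next++;
                    plan.nUndel[p]++;
                }
            }
            plan.docIDMaps[p] = map;
        }
        plan.maxID = next - 1;
        return plan;
    }

    public static void main(String[] args) {
        boolean[][] deleted = {{false, true, false}, {false, false}};
        Plan plan = plan(deleted);
        System.out.println("starts = " + Arrays.toString(plan.starts));
        System.out.println("nUndel = " + Arrays.toString(plan.nUndel));
        System.out.println("docIDMaps = " + Arrays.deepToString(plan.docIDMaps));
        System.out.println("maxID = " + plan.maxID);
    }
}
```

Under these assumptions, surviving documents keep their relative order within each partition, and each partition's block of new IDs begins at its entry in starts.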