java.lang.Object
  com.sun.labs.minion.indexer.dictionary.BasicField
public class BasicField
A class to hold the data for a saved field during indexing.
See Also: FieldInfo, MemoryFieldStore
Nested Class Summary | |
---|---|
class | BasicField.Fetcher - A class that can be used when you want to get a lot of field values for a particular field, for example, when sorting or clustering results by a particular field. |
Field Summary | |
---|---|
protected DiskBiGramDictionary | bigrams - A bigram dictionary that we can use for character fields. |
protected CDateParser | dp - A date parser for date fields. |
protected ReadableBuffer | dtvData - A buffer containing the actual dtv data at query time. |
protected ReadableBuffer | dtvOffsets - A buffer containing the dtv offsets at query time. |
protected java.util.List[] | dv - An array of the sets of entries stored per document at indexing time. |
protected int | dvPos - The current position in the dtv array, that is, where the next document ID will be added. |
protected FieldInfo | field - The field info object for this field. |
protected com.sun.labs.minion.indexer.dictionary.SavedFieldHeader | header - The header for this field. |
protected static java.lang.String | logTag - The log tag. |
protected int | nBytes - The number of bytes we're using to store data. |
protected Dictionary | values - A dictionary to use for the saved field data. |
Constructor Summary | |
---|---|
protected | BasicField() - Default constructor for subclasses. |
public | BasicField(FieldInfo field) - Constructs a saved field that will be used to store data during indexing. |
public | BasicField(FieldInfo field, java.io.RandomAccessFile dictFile, java.io.RandomAccessFile[] postFiles, DictionaryFactory fieldStoreDictFactory, DictionaryFactory bigramDictFactory, DiskPartition part) - Constructs a saved field that will be used to retrieve data during querying. |
Method Summary | |
---|---|
void | add(int docID, java.lang.Object data) - Adds data to a saved field. |
void | clear() - Clears a saved field, if it's open for indexing. |
int | compareTo(java.lang.Object o) - Compares saved fields according to the field ID. |
void | dump(java.lang.String path, java.io.RandomAccessFile dictFile, PostingsOutput[] postOut, int maxID) - Writes the data to the provided stream. |
QueryEntry | get(java.lang.Object v, boolean caseSensitive) - Gets a particular value from the field. |
static java.lang.Class | getEntryClass(FieldInfo field) - Gets an entry class appropriate to the type of the given field. |
protected java.lang.Object | getEntryName(java.lang.Object val) - Gets a name for a given saved value, parsing as necessary. |
BasicField.Fetcher | getFetcher() |
FieldInfo | getField() - Get the field info object for this field. |
java.util.SortedSet<FieldValue> | getMatching(java.lang.String pattern) |
protected static NameDecoder | getNameDecoder(FieldInfo field) - Gets a name decoder of the appropriate type for the given field. |
protected static NameEncoder | getNameEncoder(FieldInfo field) - Gets a name encoder of the appropriate type for the given field. |
java.lang.Object | getSavedData(int docID, boolean all) - Retrieve data from a saved field. |
ArrayGroup | getSimilar(ArrayGroup ag, java.lang.String value, boolean matchCase) |
ArrayGroup | getUndefined(ArrayGroup ag) - Gets a group of all the documents that do not have any values saved for this field. |
boolean | hasSavedValues(int docID) - Indicates whether a given document has saved data for this field. |
DictionaryIterator | iterator(java.lang.Object lowerBound, boolean includeLower, java.lang.Object upperBound, boolean includeUpper) - Gets an iterator for the values in this field. |
void | merge(java.lang.String path, SavedField[] fields, int maxID, int[] starts, int[] nUndel, int[][] docIDMaps, java.io.RandomAccessFile dictFile, PostingsOutput postOut) - Merges a number of saved fields. |
int | size() - Gets the number of saved terms that we're storing. |
protected java.util.Iterator | valueIterator() |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected FieldInfo field
protected Dictionary values
protected java.util.List[] dv
protected int dvPos
protected ReadableBuffer dtvOffsets
protected ReadableBuffer dtvData
protected DiskBiGramDictionary bigrams
protected int nBytes
protected com.sun.labs.minion.indexer.dictionary.SavedFieldHeader header
protected CDateParser dp
protected static java.lang.String logTag
Constructor Detail |
---|
protected BasicField()
public BasicField(FieldInfo field)
Parameters:
field - The FieldInfo for this saved field.
public BasicField(FieldInfo field, java.io.RandomAccessFile dictFile, java.io.RandomAccessFile[] postFiles, DictionaryFactory fieldStoreDictFactory, DictionaryFactory bigramDictFactory, DiskPartition part) throws java.io.IOException
Parameters:
field - The FieldInfo for this saved field.
dictFile - The file containing the dictionary for this field.
postFiles - The files containing the postings for this field.
part - The disk partition that this field is associated with.
Throws:
java.io.IOException - if there is any error loading the field data.
Method Detail |
---|
public static java.lang.Class getEntryClass(FieldInfo field)
public void add(int docID, java.lang.Object data)
Specified by:
add in interface SavedField
Parameters:
docID - the document ID for the document containing the saved data
data - The actual field data.
protected static NameDecoder getNameDecoder(FieldInfo field)
Parameters:
field - The field for which we want a name decoder.
protected static NameEncoder getNameEncoder(FieldInfo field)
Parameters:
field - The field for which we want a name encoder.
protected java.lang.Object getEntryName(java.lang.Object val)
Parameters:
val - The value that we were passed.
public void dump(java.lang.String path, java.io.RandomAccessFile dictFile, PostingsOutput[] postOut, int maxID) throws java.io.IOException
Specified by:
dump in interface SavedField
Parameters:
path - The path of the index directory.
dictFile - The file where the dictionary will be written.
postOut - A place to write the postings associated with the values.
maxID - The maximum document ID for this partition.
Throws:
java.io.IOException - if there is an error during the writing.
public boolean hasSavedValues(int docID)
Parameters:
docID - the document ID for the document that we wish to check.
Returns:
true if this document ID has saved values, false otherwise.
public java.lang.Object getSavedData(int docID, boolean all)
Specified by:
getSavedData in interface SavedField
Parameters:
docID - the document ID that we want data for.
all - If true, return all known values for the field in the given document. If false, return only one value.
Returns:
If all is true, then return a List of the values stored in the given field in the given document. If all is false, a single value of the appropriate type will be returned. If the given name is not the name of a saved field, or the document ID is invalid, null will be returned.
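The effect of the all flag can be illustrated with a small stand-in class. This is a minimal sketch and not the Minion implementation: the class name SavedDataSketch and its in-memory map are hypothetical, and returning the first value added when all is false is an assumption, since the description above only promises "a single value".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a saved field; not the Minion implementation.
class SavedDataSketch {
    // Values saved per document ID, in the order they were added.
    private final Map<Integer, List<Object>> valuesByDoc = new HashMap<>();

    void add(int docID, Object data) {
        valuesByDoc.computeIfAbsent(docID, k -> new ArrayList<>()).add(data);
    }

    boolean hasSavedValues(int docID) {
        return valuesByDoc.containsKey(docID);
    }

    // all == true: a List of every saved value; all == false: a single value.
    // A document ID with no saved data yields null.
    Object getSavedData(int docID, boolean all) {
        List<Object> vals = valuesByDoc.get(docID);
        if (vals == null) {
            return null;
        }
        // Assumption: the "single value" is the first one that was added.
        return all ? new ArrayList<>(vals) : vals.get(0);
    }

    public static void main(String[] args) {
        SavedDataSketch f = new SavedDataSketch();
        f.add(1, "red");
        f.add(1, "blue");
        System.out.println(f.getSavedData(1, true));
        System.out.println(f.getSavedData(1, false));
        System.out.println(f.getSavedData(2, true));
    }
}
```

Because the return type is Object, callers have to check whether they received a List or a single value, depending on the flag they passed.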
public QueryEntry get(java.lang.Object v, boolean caseSensitive)
Specified by:
get in interface SavedField
Parameters:
v - The value to get.
caseSensitive - If true, case should be taken into account when iterating through the values. This value will only be observed for character fields!
Returns:
null if that term doesn't occur in the indexed material.
public ArrayGroup getUndefined(ArrayGroup ag)
Specified by:
getUndefined in interface SavedField
Parameters:
ag - a set of documents to which we should restrict the search for documents with undefined field values. If this is null then there is no such restriction.
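The restriction semantics of getUndefined can be sketched with plain bit sets. This is an illustration under stated assumptions, not the actual code: java.util.BitSet stands in for ArrayGroup, the class and method names are hypothetical, and document IDs are assumed to run from 1 to maxID.

```java
import java.util.BitSet;

// Hypothetical sketch; BitSet stands in for ArrayGroup, which is not modeled here.
class UndefinedDocsSketch {
    // hasValues:  bit i is set if document i has a saved value for the field.
    // maxID:      largest valid document ID (IDs assumed to start at 1).
    // restrictTo: documents to restrict the result to, or null for no restriction.
    static BitSet undefined(BitSet hasValues, int maxID, BitSet restrictTo) {
        BitSet result = new BitSet(maxID + 1);
        result.set(1, maxID + 1);        // start from every valid ID, 1..maxID
        result.andNot(hasValues);        // drop documents that have a value
        if (restrictTo != null) {
            result.and(restrictTo);      // apply the optional restriction
        }
        return result;
    }
}
```

Passing null for restrictTo returns every document without a value, which mirrors the "no such restriction" case described for the ag parameter.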
public ArrayGroup getSimilar(ArrayGroup ag, java.lang.String value, boolean matchCase)
public java.util.SortedSet<FieldValue> getMatching(java.lang.String pattern)
public DictionaryIterator iterator(java.lang.Object lowerBound, boolean includeLower, java.lang.Object upperBound, boolean includeUpper)
Specified by:
iterator in interface SavedField
Parameters:
lowerBound - the name of the entry that will be the lower bound of the iterator, or null if there is no such bound
includeLower - whether the lower bound should be included in the results of the iterator
upperBound - the name of the entry that will be the upper bound of the iterator, or null if there is no such bound
includeUpper - whether the upper bound should be included in the results of the iterator
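The bound parameters behave like the range views of a sorted set. The following sketch reproduces the same four-argument contract on a java.util.NavigableSet. It is only an analogy: BoundedIteratorSketch is a hypothetical name, and the real method iterates dictionary entries and returns a DictionaryIterator rather than a plain Iterator.

```java
import java.util.Iterator;
import java.util.NavigableSet;

// Hypothetical sketch of the bound semantics using standard collections.
class BoundedIteratorSketch {
    // A null bound means "unbounded on that side".
    // Assumes lowerBound <= upperBound when both are given.
    static <T> Iterator<T> iterator(NavigableSet<T> values,
                                    T lowerBound, boolean includeLower,
                                    T upperBound, boolean includeUpper) {
        NavigableSet<T> view = values;
        if (lowerBound != null) {
            view = view.tailSet(lowerBound, includeLower);
        }
        if (upperBound != null) {
            view = view.headSet(upperBound, includeUpper);
        }
        return view.iterator();
    }
}
```

For a set containing a, b, c and d, the call iterator(set, "b", true, "d", false) visits b and c, and iterator(set, null, false, "b", true) visits a and b.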
public void merge(java.lang.String path, SavedField[] fields, int maxID, int[] starts, int[] nUndel, int[][] docIDMaps, java.io.RandomAccessFile dictFile, PostingsOutput postOut) throws java.io.IOException
Specified by:
merge in interface SavedField
Parameters:
path - The path to the index directory.
fields - An array of fields to merge.
maxID - The max doc ID in the new partition
starts - The new starting document IDs for the partitions.
docIDMaps - A map for each partition from old document IDs to new document IDs. IDs that map to a value less than 0 have been deleted. A null array means that the old IDs are the new IDs.
dictFile - The file to which the merged dictionaries will be written.
postOut - The output to which the merged postings will be written.
nUndel - The number of undeleted documents in each partition
Throws:
java.io.IOException - if there is an error during the merge.
protected java.util.Iterator valueIterator()
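Returning to merge: the docIDMaps convention described for that method (negative means deleted, a null array means identity) can be sketched as a small lookup helper. The class name, the DELETED constant and the assumption that each map is indexed directly by the old document ID are all hypothetical. How the starts offsets are combined with the mapped IDs is not stated here, so the sketch leaves them out.

```java
// Hypothetical helper illustrating the docIDMaps convention only.
class DocIDMapSketch {
    static final int DELETED = -1;

    // docIDMap: one partition's map from old IDs to new IDs, or null.
    // Assumption: the array is indexed directly by the old document ID.
    static int mapDocID(int[] docIDMap, int oldID) {
        if (docIDMap == null) {
            return oldID;                 // null map: old IDs are the new IDs
        }
        int mapped = docIDMap[oldID];
        return mapped < 0 ? DELETED : mapped;   // negative: document was deleted
    }
}
```

With a map of {0, 1, -1, 2}, old ID 1 stays 1, old ID 2 is reported as deleted, and old ID 3 becomes 2.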
public int size()
Specified by:
size in interface SavedField
public void clear()
Specified by:
clear in interface SavedField
public int compareTo(java.lang.Object o)
Specified by:
compareTo in interface java.lang.Comparable
public FieldInfo getField()
Specified by:
getField in interface SavedField
public BasicField.Fetcher getFetcher()