java.lang.Object com.sun.labs.minion.indexer.dictionary.MemoryDictionary
public class MemoryDictionary
A dictionary that will be used during indexing. The entries will be stored in a Map.
The dictionary is instantiated with the class of the entries that it will contain. It provides the ability to make entries of the appropriate type, given the name that will map to that entry. We provide a special case for entries that store information for the case insensitive version of a particular name.
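As a rough illustration of a dictionary that is handed the class of its entries and creates entries by name, the following sketch uses reflection. `NamedEntry`, `AbstractEntry`, `EntryFactorySketch`, and `setName` are hypothetical names invented for this example and are not part of the Minion API; the sketch also assumes entry classes have a no-argument constructor and does not cover the case-insensitive special case.

```java
// Hypothetical entry type (an assumption, not the Minion Entry class).
class NamedEntry {
    private Object name;

    public NamedEntry() {
    }

    void setName(Object name) {
        this.name = name;
    }

    Object getName() {
        return name;
    }
}

// An entry class that cannot be instantiated, used to show the error path.
abstract class AbstractEntry extends NamedEntry {
}

class EntryFactorySketch {
    // The class of the entries this factory will create.
    private final Class<? extends NamedEntry> entryClass;

    EntryFactorySketch(Class<? extends NamedEntry> entryClass) {
        this.entryClass = entryClass;
    }

    // Creates an entry of the configured class for the given name,
    // or returns null if the entry cannot be instantiated.
    NamedEntry newEntry(Object name) {
        try {
            NamedEntry e = entryClass.getDeclaredConstructor().newInstance();
            e.setName(name);
            return e;
        } catch (ReflectiveOperationException ex) {
            return null;
        }
    }

    public static void main(String[] args) {
        EntryFactorySketch ok = new EntryFactorySketch(NamedEntry.class);
        System.out.println(ok.newEntry("cat").getName());

        EntryFactorySketch broken = new EntryFactorySketch(AbstractEntry.class);
        System.out.println(broken.newEntry("cat") == null);
    }
}
```

Returning `null` on failure mirrors the behaviour documented for `simpleNewEntry` and `newEntry`; a caller would need to check for it before adding the entry.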
When the dictionary is written to disk, the entries are sorted by name. At that point, the IDs of the entries may be reassigned in name order (they were originally assigned in order of addition to the dictionary). If so, a mapping between old and new IDs is stored and can be retrieved by any caller that needs it.
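The renumbering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Minion implementation: `SketchEntry` and `RenumberSketch` are hypothetical names, names are plain strings, and IDs are assumed to start at 1 so that index 0 of the map is unused.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a dictionary entry (not the Minion Entry type).
final class SketchEntry {
    final String name;
    int id;

    SketchEntry(String name, int id) {
        this.name = name;
        this.id = id;
    }
}

class RenumberSketch {
    // Sorts the entries by name in place, assigns new IDs 1..n in that
    // order, and returns an array indexed by old ID holding the new ID.
    static int[] renumberByName(List<SketchEntry> entries) {
        int maxId = 0;
        for (SketchEntry e : entries) {
            maxId = Math.max(maxId, e.id);
        }
        entries.sort(Comparator.comparing((SketchEntry e) -> e.name));
        int[] oldToNew = new int[maxId + 1];
        int newId = 0;
        for (SketchEntry e : entries) {
            newId++;
            oldToNew[e.id] = newId; // remember where the old ID went
            e.id = newId;
        }
        return oldToNew;
    }

    public static void main(String[] args) {
        // IDs reflect the order of addition: pear, apple, mango.
        List<SketchEntry> entries = new ArrayList<>();
        entries.add(new SketchEntry("pear", 1));
        entries.add(new SketchEntry("apple", 2));
        entries.add(new SketchEntry("mango", 3));

        int[] oldToNew = renumberByName(entries);
        System.out.println(Arrays.toString(oldToNew)); // prints [0, 3, 1, 2]
    }
}
```

Indexing the map by old ID keeps lookups constant-time, which matches the `int[]` type of the `idMap` field.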
Note that the size of the dictionary at dump time and the
Nested Class Summary | |
---|---|
static class | MemoryDictionary.IDMap: An enumeration of the kinds of ID maps that may have to be built when dumping a dictionary. |
class | MemoryDictionary.MemoryDictionaryIterator: A class that implements a dictionary iterator for this dictionary. |
static class | MemoryDictionary.Renumber: An enumeration of the kinds of renumbering that may need to be done when dumping a dictionary to disk. |
Field Summary | |
---|---|
protected java.lang.Class | entryClass: The class of the entries that we will be holding. |
protected int | id: The ID that we will assign to entries as they are added. |
protected int[] | idMap: A map from the IDs assigned before sorting to the IDs assigned after sorting. |
protected static java.lang.String | logTag: The tag for this module. |
protected java.util.Map<java.lang.Object,Entry> | map: A map to hold the entries. |
protected Partition | part: The partition with which this dictionary is associated. |
Constructor Summary | |
---|---|
MemoryDictionary(java.lang.Class entryClass) | Creates a dictionary that can be used during indexing. |
Method Summary | |
---|---|
void | clear(): Clears the dictionary, emptying it of all data. |
IndexEntry[] | dump(java.lang.String path, NameEncoder encoder, PartitionStats partStats, java.io.RandomAccessFile dictFile, PostingsOutput[] postOut, MemoryDictionary.Renumber renumber, MemoryDictionary.IDMap idMapType, int[] postIDMap): Dumps the dictionary and the associated postings to files. |
IndexEntry[] | dump(java.lang.String path, NameEncoder encoder, java.io.RandomAccessFile dictFile, PostingsOutput[] postOut, MemoryDictionary.Renumber renumber, MemoryDictionary.IDMap idMap, int[] postIDMap): Dumps the dictionary and the associated postings to files. |
void | dumpPrepare(IndexEntry[] sortedEntries): Prepares a dictionary for dumping. |
QueryEntry | get(java.lang.Object name): Gets an entry from the dictionary, given the name for the entry. |
java.lang.Class | getEntryClass() |
int[] | getIdMap(): Gets a map from the IDs assigned before sorting to the IDs assigned after sorting. |
java.util.Set<java.lang.Object> | getKeys() |
int | getMaxId(): Gets the largest ID in this dictionary as of the time the method is called. |
Partition | getPartition(): Gets the partition to which this dictionary belongs. |
protected static int | getSize(java.lang.Object name): Given a name, figure out how big it is in bytes. |
DictionaryIterator | iterator(): Gets an iterator for the entries in the dictionary. |
IndexEntry | newEntry(java.lang.Object name): Gets a new, possibly cased, entry that can be added to this dictionary. |
void | processEntry(IndexEntry e): Processes a single entry before dumping it. |
IndexEntry | put(java.lang.Object name, IndexEntry e): Puts an entry into the dictionary. |
Entry | remove(java.lang.Object name): Deletes an entry from the dictionary, given the name for the entry. |
void | setPartition(Partition partition) |
protected IndexEntry | simpleNewEntry(java.lang.Object name): Gets a new entry that can be added to this dictionary. |
int | size(): Gets the number of entries in the dictionary. |
protected IndexEntry[] | sort(MemoryDictionary.Renumber renumber, MemoryDictionary.IDMap idMapType): Sorts the dictionary entries. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected Partition part
protected java.util.Map<java.lang.Object,Entry> map
protected java.lang.Class entryClass
protected int id
protected int[] idMap
renumber parameter of dump(java.lang.String, com.sun.labs.minion.indexer.dictionary.NameEncoder, java.io.RandomAccessFile, com.sun.labs.minion.indexer.postings.io.PostingsOutput[], com.sun.labs.minion.indexer.dictionary.MemoryDictionary.Renumber, com.sun.labs.minion.indexer.dictionary.MemoryDictionary.IDMap, int[])
protected static java.lang.String logTag
Constructor Detail |
---|
public MemoryDictionary(java.lang.Class entryClass)
Method Detail |
---|
public java.lang.Class getEntryClass()
public java.util.Set<java.lang.Object> getKeys()
protected IndexEntry simpleNewEntry(java.lang.Object name)
name - The name of the entry.
Returns null if there is an error instantiating the entry.
public IndexEntry newEntry(java.lang.Object name)
name - The name of the new entry.
Returns null if there is an error instantiating the entry.
public IndexEntry put(java.lang.Object name, IndexEntry e)
Specified by: put in interface Dictionary
name - The name of the entry.
e - The entry to put in the dictionary.
protected static int getSize(java.lang.Object name)
public QueryEntry get(java.lang.Object name)
Specified by: get in interface Dictionary
name - The name of the entry.
Returns null if the name doesn't appear in the dictionary.
public Entry remove(java.lang.Object name)
name - The name of the entry.
Returns null if the name doesn't appear in the dictionary.
public Partition getPartition()
Specified by: getPartition in interface Dictionary
public void setPartition(Partition partition)
public int size()
Specified by: size in interface Dictionary
public DictionaryIterator iterator()
Specified by: iterator in interface Dictionary
Specified by: iterator in interface java.lang.Iterable<QueryEntry>
public void clear()
protected IndexEntry[] sort(MemoryDictionary.Renumber renumber, MemoryDictionary.IDMap idMapType)
renumber, new IDs may be assigned to the entries in their new, sorted order.
renumber - whether the entries in the dictionary should be renumbered in order of the names
idMapType - what kind of map (if any) should be kept between the old and new IDs
public int getMaxId()
public int[] getIdMap()
public void dumpPrepare(IndexEntry[] sortedEntries)
sortedEntries - entries from another dictionary.
public IndexEntry[] dump(java.lang.String path, NameEncoder encoder, java.io.RandomAccessFile dictFile, PostingsOutput[] postOut, MemoryDictionary.Renumber renumber, MemoryDictionary.IDMap idMap, int[] postIDMap) throws java.io.IOException
path - The path to the directory where the dictionary should be dumped.
encoder - An encoder for the names of the entries.
dictFile - The file where the dictionary will be dumped.
postOut - The place where the postings will be dumped.
renumber - How entries should be renumbered at dump time.
idMap - what kind of map from old to new IDs should be kept
postIDMap - A map from old IDs used in the postings to new IDs. This map will be given to the postings from the dictionary before they are dumped to disk, allowing the postings to be remapped before the dump. This is useful when the postings in one dictionary contain IDs that have been remapped during a dump operation, such as those in a document dictionary. If this value is null, no remapping will take place.
Throws: java.io.IOException - When there is an error writing either of the channels.
public IndexEntry[] dump(java.lang.String path, NameEncoder encoder, PartitionStats partStats, java.io.RandomAccessFile dictFile, PostingsOutput[] postOut, MemoryDictionary.Renumber renumber, MemoryDictionary.IDMap idMapType, int[] postIDMap) throws java.io.IOException
path - The path to the directory where the dictionary should be dumped.
encoder - An encoder for the names of the entries.
partStats - a set of partition statistics that we will contribute to while dumping the dictionary. May be null.
dictFile - The file where the dictionary will be dumped.
postOut - The place where the postings will be dumped.
renumber - A MemoryDictionary.Renumber value indicating whether and how entries should be renumbered at dump time.
postIDMap - A map from old IDs used in the postings to new IDs. This map will be given to the postings from the dictionary before they are dumped to disk, allowing the postings to be remapped before the dump. This is useful when the postings in one dictionary contain IDs that have been remapped during a dump operation, such as those in a document dictionary. If this value is null, no remapping will take place.
Throws: java.io.IOException - When there is an error writing either of the channels.
public void processEntry(IndexEntry e)
e
- the entry to process.
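The postIDMap parameter of dump, described above, maps old IDs stored in postings to new IDs, with null meaning that no remapping takes place. A minimal sketch of that idea follows; `PostingsRemapSketch` and `remap` are hypothetical names for this illustration, and postings are simplified to a plain array of IDs, which is an assumption rather than the Minion postings format.

```java
import java.util.Arrays;

class PostingsRemapSketch {
    // Returns a copy of postingIds with each ID translated through oldToNew.
    // If oldToNew is null, the IDs are returned unchanged.
    static int[] remap(int[] postingIds, int[] oldToNew) {
        int[] out = Arrays.copyOf(postingIds, postingIds.length);
        if (oldToNew == null) {
            return out;
        }
        for (int i = 0; i < out.length; i++) {
            out[i] = oldToNew[out[i]];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] postings = {1, 3, 3, 2};
        int[] oldToNew = {0, 3, 1, 2}; // index = old ID, value = new ID

        System.out.println(Arrays.toString(remap(postings, oldToNew))); // prints [3, 2, 2, 1]
        System.out.println(Arrays.toString(remap(postings, null)));     // prints [1, 3, 3, 2]
    }
}
```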