org.egothor.core.memory
Class Document

java.lang.Object
  extended by org.egothor.core.memory.Document
All Implemented Interfaces:
Barrel, BarrelReader

public class Document
extends java.lang.Object
implements Barrel, BarrelReader

The Document object represents a real document. A Document is characterized by its metadata and a tree structure of fields. A Document is a special kind of Barrel.

Case 1: The Document contains just one field (our implementation).
Such an object represents a Barrel that has just one active document (until it is removed), namely the document itself, together with a set of inverted lists. Each list is one item long.
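The single-document case can be pictured with a small, self-contained sketch. It is not Egothor code: the class SingleDocIndexSketch, its nested Posting type, and the invert method are hypothetical names chosen for illustration, and positions are simply token offsets. It only shows why every inverted list holds exactly one item when a single document is indexed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in, not part of Egothor.
final class SingleDocIndexSketch {

    // One posting: a document id plus the positions of a term in that document.
    static final class Posting {
        final long docId;
        final List<Integer> positions = new ArrayList<>();

        Posting(long docId) {
            this.docId = docId;
        }
    }

    // Inverts the terms of ONE document into term -> inverted list.
    // Only one document is indexed, so each list ends up with one posting.
    static TreeMap<String, List<Posting>> invert(long docId, String[] terms) {
        TreeMap<String, List<Posting>> ilists = new TreeMap<>();
        for (int pos = 0; pos < terms.length; pos++) {
            List<Posting> list = ilists.get(terms[pos]);
            if (list == null) {
                // First occurrence of the term: create its single posting.
                list = new ArrayList<>();
                list.add(new Posting(docId));
                ilists.put(terms[pos], list);
            }
            // Repeated terms extend the positions, not the list itself.
            list.get(0).positions.add(pos);
        }
        return ilists;
    }

    public static void main(String[] args) {
        TreeMap<String, List<Posting>> ilists =
                invert(1L, new String[] {"term1", "term2", "term1"});
        for (String term : ilists.keySet()) {
            List<Posting> list = ilists.get(term);
            System.out.println(term + ": list length " + list.size()
                    + ", positions " + list.get(0).positions);
        }
    }
}
```

A repeated term such as term1 adds a position to its existing posting, so the list length stays at one regardless of how often the term occurs.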

Case 2: The Document contains more fields.
Not supported in this release. Currently, our implementation adds a prefix to each term. That prefix is computed from the field name and its position in the tree of fields. For example:

Given the tree structure: [root_field, term1 term2 [child1, term11 term12] term1 [child2, term1 term2]],
Egothor produces these inverted lists:

Users are welcome to implement another format for storing data. Document implements two basic interfaces: BarrelReader and Barrel. The first allows a Document to be passed to a real Barrel (or rather, a BarrelWriter) via BarrelWriter.append(org.egothor.core.BarrelReader).

Author:
Leo Galambos

Field Summary
protected  java.util.TreeMap<java.lang.String,org.egothor.core.memory.MemoryProximities> ilists
          The inverted lists constructed from terms in this document.
 
Constructor Summary
Document()
          Constructor for the Document object
 
Method Summary
 void close()
          Not implemented but required by Barrel.
 void commit()
          Not implemented but required by Barrel.
 long deleted()
          Return the number of removed documents.
 void destroy()
          Not implemented.
 Sequence<IListMetadata> expand(java.lang.String expr)
          Not implemented.
 Bitmap getBitmap(java.lang.String label)
          Return the Bitmap of a given label.
 SequenceWithClose<DocumentData> getDocuments()
          Return a sequence of the documents.
 IListMetadata getIListMeta(java.lang.String term)
          Return a simple IListMetadata structure that computes its getLength() as size()-deleted().
 SequenceWithClose<IListReader> getILists()
          Return all inverted lists in A-Z order of terms.
 DocumentData getMeta(long id)
          Return the metadata of this document.
 void initialize(DocumentData meta, FTField root)
          Initializer for the Document object.
 boolean isWithoutTerms()
          Test whether this document is without terms, that is, whether it contains no term at all.
 long length()
          Return the length of this data structure.
 BarrelReader open()
          Return this object.
 IListReader openIList(java.lang.String term, boolean clean)
          Opens the IListReader of the given term.
 void query(Query q, ResultList result)
          Not implemented.
 boolean removeDoc(long id)
          Remove this document if and only if id is equal to root's field element UID.
 void rewind()
          Restart this BarrelReader so that the documents can be read again.
 void setBitmap(java.lang.String label, Bitmap bitmap)
          Try to set the Bitmap of a given label.
 long size()
          Return the size of this data structure.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ilists

protected java.util.TreeMap<java.lang.String,org.egothor.core.memory.MemoryProximities> ilists
The inverted lists constructed from terms in this document.

Constructor Detail

Document

public Document()
Constructor for the Document object

Method Detail

getBitmap

public Bitmap getBitmap(java.lang.String label)
Description copied from interface: Barrel
Return the Bitmap of a given label.

Specified by:
getBitmap in interface Barrel
Parameters:
label - the label of the requested Bitmap
Returns:
Bitmap of removed documents when label is null

getDocuments

public SequenceWithClose<DocumentData> getDocuments()
Return a sequence of the documents. As implemented, this data structure has one element. An empty sequence is returned when the document has been removed.

Specified by:
getDocuments in interface BarrelReader
Returns:
a Sequence

getIListMeta

public IListMetadata getIListMeta(java.lang.String term)
Return a simple IListMetadata structure that computes its getLength() as size()-deleted().

Specified by:
getIListMeta in interface Barrel
Parameters:
term - the term to look up in this Document's inverted lists
Returns:
an IListMetadata object, or null if the term does not occur in this Document

getILists

public SequenceWithClose<IListReader> getILists()
Return all inverted lists in A-Z order of terms. An empty sequence is returned when the document does not contain any terms.

Specified by:
getILists in interface BarrelReader
Returns:
a Sequence of the inverted lists held by this Document
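The ilists field is declared as a java.util.TreeMap keyed by term. Assuming the lists are read back by iterating that map, the "A-Z order" is the natural ordering of String keys, which compares UTF-16 code units, so uppercase letters sort before lowercase ones. The sketch below demonstrates only this TreeMap behavior; TermOrderSketch and termsInOrder are hypothetical names, and the Integer values merely stand in for the real list objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in, not part of Egothor.
final class TermOrderSketch {

    // Stores the terms in a TreeMap and returns the keys in iteration order.
    static List<String> termsInOrder(String... terms) {
        TreeMap<String, Integer> ilists = new TreeMap<>();
        for (String term : terms) {
            // Count occurrences; duplicate terms collapse into one key.
            ilists.merge(term, 1, Integer::sum);
        }
        return new ArrayList<>(ilists.keySet());
    }

    public static void main(String[] args) {
        // Natural String order places "Mango" before "apple".
        System.out.println(termsInOrder("zebra", "apple", "Mango", "apple"));
        // prints [Mango, apple, zebra]
    }
}
```

If case-insensitive ordering matters to a caller, terms would need to be normalized (for example lowercased) before they are stored; whether Egothor does so is not stated on this page.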

getMeta

public DocumentData getMeta(long id)
Return the metadata of this document.

Specified by:
getMeta in interface Barrel
Parameters:
id - the ID of the desired document
Returns:
a DocumentData containing the metadata of the desired document, or null if id is not equal to org.egothor.Constants.FIRSTUID
See Also:
Constants.FIRSTUID

isWithoutTerms

public boolean isWithoutTerms()
Test whether this document is without terms, that is, whether it contains no term at all.


initialize

public void initialize(DocumentData meta,
                       FTField root)
Initializer for the Document object. Stores meta and root in local variables and calls FTField.invertize() to prepare the inverted lists structure.

Parameters:
meta - contains the metadata of the Document
root - this Document's place in the tree (in this implementation the tree has one element making this Document the root)

expand

public Sequence<IListMetadata> expand(java.lang.String expr)
Not implemented.

Specified by:
expand in interface Barrel
Parameters:
expr - the expression
Returns:
null is returned for this implementation

query

public void query(Query q,
                  ResultList result)
Not implemented.

Specified by:
query in interface Barrel
Parameters:
q - the query
result - will calculate similarity to the query

open

public BarrelReader open()
Return this object. This is possible because this class implements BarrelReader.

Specified by:
open in interface Barrel
Returns:
this object

destroy

public void destroy()
Not implemented. This object is in memory so the job will be done by the garbage collector.

Specified by:
destroy in interface Barrel

size

public long size()
Return the size of this data structure.

Specified by:
size in interface Barrel
Returns:
1

length

public long length()
Return the length of this data structure.

Specified by:
length in interface BarrelReader
Returns:
1 if the document has not been deleted, 0 if it has

deleted

public long deleted()
Return the number of removed documents.

Specified by:
deleted in interface Barrel
Returns:
0 if the document was not removed or 1 if the document was removed
See Also:
removeDoc(long)

removeDoc

public boolean removeDoc(long id)
Remove this document if and only if id is equal to root's field element UID.

Specified by:
removeDoc in interface Barrel
Parameters:
id - the ID of the document to remove
Returns:
true if the root's ID is equal to the given id and the document has been deleted
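The return values documented for size(), length(), deleted() and removeDoc(long) fit together as a simple counter model, sketched below. SingleDocBarrelSketch is a hypothetical stand-in and not the Egothor class; in particular, what the real removeDoc returns on a second call with the same id is not stated on this page, so the choice made here is an assumption.

```java
// Hypothetical stand-in, not part of Egothor.
final class SingleDocBarrelSketch {
    private final long uid;      // UID of the single document
    private boolean removed;     // set once the document is removed

    SingleDocBarrelSketch(long uid) {
        this.uid = uid;
    }

    // Always 1: the structure holds exactly one document slot.
    long size() {
        return 1;
    }

    // 0 while the document is active, 1 after removal.
    long deleted() {
        return removed ? 1 : 0;
    }

    // Active documents only: size() - deleted().
    long length() {
        return size() - deleted();
    }

    // Removes the document only when id matches its UID.
    // Assumption: a repeated call with the matching id also returns true.
    boolean removeDoc(long id) {
        if (id != uid) {
            return false;
        }
        removed = true;
        return true;
    }

    public static void main(String[] args) {
        SingleDocBarrelSketch barrel = new SingleDocBarrelSketch(7L);
        System.out.println(barrel.size() + " " + barrel.deleted() + " " + barrel.length());
        // prints 1 0 1
        System.out.println(barrel.removeDoc(8L)); // prints false (id does not match)
        System.out.println(barrel.removeDoc(7L)); // prints true
        System.out.println(barrel.size() + " " + barrel.deleted() + " " + barrel.length());
        // prints 1 1 0
    }
}
```

The same difference, size() minus deleted(), is what getIListMeta(String) is documented to report as getLength().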

close

public void close()
Not implemented but required by Barrel.

Specified by:
close in interface Barrel
Specified by:
close in interface BarrelReader

rewind

public void rewind()
Restart this BarrelReader so that the documents can be read again. The operation does not need any action in this implementation.

Specified by:
rewind in interface BarrelReader

commit

public void commit()
Not implemented but required by Barrel.

Specified by:
commit in interface Barrel

openIList

public IListReader openIList(java.lang.String term,
                             boolean clean)
Opens the IListReader of the given term.

Specified by:
openIList in interface Barrel
Parameters:
term - the term to open an IListReader for
clean - whether to remove all the items denoted as deleted
Returns:
an IListReader, or null if the term does not occur in the document

setBitmap

public void setBitmap(java.lang.String label,
                      Bitmap bitmap)
Description copied from interface: Barrel
Try to set the Bitmap of a given label.

Specified by:
setBitmap in interface Barrel
Parameters:
label - the label of the Bitmap to set
bitmap - the Bitmap to set; the Bitmap of removed documents when label is null