Class DocumentsDB

java.lang.Object
  extended by org.egothor.repository.DocumentsDB
All Implemented Interfaces:
    DataRepository

public class DocumentsDB
extends java.lang.Object
implements DataRepository

DocumentsDB implements a document database structure on disk. The structure consists of two files: the first serves as an index, and the second as a data store. To read the element with key uid, the algorithm works as follows:

  1. Read the offset of the element: seek to position uid*SIZEOFOFFSET in the first file and read the offset stored there.
  2. Read the element: seek to that offset in the second file; object = secondfile.readTheObject();
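
The two steps above can be sketched with java.io.RandomAccessFile. This is only an illustration of the index-plus-data-file idea, not the actual DocumentsDB on-disk format: the class name TwoFileStoreSketch, the 8-byte offset slots, and the length-prefixed records are all assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of an index file plus a data file; not the real DocumentsDB layout. */
final class TwoFileStoreSketch {
    // Assumption: every index slot is one 8-byte long holding an offset into the data file.
    static final int SIZEOFOFFSET = Long.BYTES;

    /** Appends a record to the data file and stores its offset in slot uid of the index file. */
    static void write(RandomAccessFile index, RandomAccessFile data, long uid, byte[] record)
            throws IOException {
        long offset = data.length();
        data.seek(offset);
        data.writeInt(record.length); // assumed length prefix so the reader knows how much to read
        data.write(record);
        index.seek(uid * SIZEOFOFFSET);
        index.writeLong(offset);
    }

    /** Reads the record with the given uid; the uid must have been written before. */
    static byte[] read(RandomAccessFile index, RandomAccessFile data, long uid)
            throws IOException {
        index.seek(uid * SIZEOFOFFSET); // step 1: locate the slot and read the offset
        long offset = index.readLong();
        data.seek(offset);              // step 2: jump to the element and read it
        byte[] record = new byte[data.readInt()];
        data.readFully(record);
        return record;
    }

    public static void main(String[] args) throws IOException {
        File idx = File.createTempFile("sketch", ".idx");
        File dta = File.createTempFile("sketch", ".dta");
        try (RandomAccessFile index = new RandomAccessFile(idx, "rw");
             RandomAccessFile data = new RandomAccessFile(dta, "rw")) {
            write(index, data, 0, "first".getBytes(StandardCharsets.UTF_8));
            write(index, data, 1, "second".getBytes(StandardCharsets.UTF_8));
            System.out.println(new String(read(index, data, 1), StandardCharsets.UTF_8));
        } finally {
            idx.delete();
            dta.delete();
        }
    }
}
```

Because every slot has the same width, a lookup costs one seek in each file regardless of how many elements are stored.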

Author:
Leo Galambos

Nested Class Summary
Nested classes/interfaces inherited from interface org.egothor.repository.DataRepository
Constructor Summary
DocumentsDB(java.lang.String location, boolean compressed)
          Constructor for the DocumentsDB object.
Method Summary
 int addItem(long uid, byte[] document, int length)
          Adds another document into the repository.
 void close()
          Closes the structure.
 void destroy()
          Destroy this data structure.
 DataInputStream elementAt(long uid, int revision)
          Retrieves a data block.
 DataRepository.TupleSequence elements()
          The tuples are [long:uid;int:rev;Object:DataInputStream].
protected  void finalize()
          Close this structure and attempt garbage collection.
 void flush()
 boolean removeDoc(long uid)
          Removes an element of the given uid.
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


public DocumentsDB(java.lang.String location,
                   boolean compressed)
Constructor for the DocumentsDB object. Opens the structure in the given directory. Three files are created, all named doc, with an extension that depends on the file's function: information about removed elements is stored in doc.btm, the first file (offsets) in doc.idx, and the second file (element data) in doc.dta.

Parameters:
location - the location where the files will be created
compressed - true iff the DocumentData will be saved in gzip format
Throws: - if an I/O error occurs
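
To make the compressed flag more concrete, the following sketch shows one way a data block could be stored in gzip format and handed back as a DataInputStream, using only java.util.zip from the standard library. The class GzipBlockSketch and its methods are hypothetical; how DocumentsDB actually frames compressed data in doc.dta is not stated here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch of gzip-compressing a block; not the DocumentsDB implementation. */
final class GzipBlockSketch {
    /** Compresses the first length bytes of document into a gzip-formatted byte array. */
    static byte[] compress(byte[] document, int length) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(document, 0, length);
        } // closing the stream writes the gzip trailer
        return buffer.toByteArray();
    }

    /** Opens a stored gzip block as a stream that yields the original bytes. */
    static DataInputStream open(byte[] stored) throws IOException {
        return new DataInputStream(new GZIPInputStream(new ByteArrayInputStream(stored)));
    }
}
```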
Method Detail


public int addItem(long uid,
                   byte[] document,
                   int length)
Description copied from interface: DataRepository
Adds another document to the repository. If the implementation can recognize that the incoming data block is identical to the one already stored in the repository, it discards the insertion request and returns 0 to signal a no-op.

Specified by:
addItem in interface DataRepository
Returns:
revision number (0 iff unchanged), or -1 on failure
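
The return-value contract can be illustrated with a small in-memory stand-in. The class AddItemSketch below is hypothetical and does not reflect how DocumentsDB detects unchanged blocks; it also assumes that the first stored revision is numbered 1 and that invalid arguments count as a failure, neither of which is stated above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in illustrating the addItem return values. */
final class AddItemSketch {
    private final Map<Long, List<byte[]>> revisions = new HashMap<>();

    /** Returns the new revision number, 0 if the block is unchanged, or -1 on failure. */
    int addItem(long uid, byte[] document, int length) {
        if (document == null || length < 0 || length > document.length) {
            return -1; // assumption: invalid arguments are reported as a failure
        }
        byte[] block = Arrays.copyOf(document, length);
        List<byte[]> history = revisions.computeIfAbsent(uid, k -> new ArrayList<>());
        if (!history.isEmpty() && Arrays.equals(history.get(history.size() - 1), block)) {
            return 0; // same as the latest stored block: no-op
        }
        history.add(block);
        return history.size(); // assumption: revisions are numbered from 1
    }
}
```

A caller can therefore tell three outcomes apart: a positive value means a new revision was stored, 0 means nothing changed, and -1 means the request failed.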


public DataInputStream elementAt(long uid,
                                 int revision)
Description copied from interface: DataRepository
Retrieves a data block.

Specified by:
elementAt in interface DataRepository
Parameters:
uid - the key of the block
revision - revision number of the block; 0 is used for the latest (current) revision
Returns:
null if the revision is not available (or the key is unknown)
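
The lookup convention (0 selects the latest revision, null signals a missing key or revision) can be sketched as follows. ElementAtSketch and its store method are hypothetical helpers for illustration, and numbering stored revisions from 1 is an assumption of this example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in illustrating the elementAt lookup convention. */
final class ElementAtSketch {
    private final Map<Long, List<byte[]>> revisions = new HashMap<>();

    /** Helper for the example only: appends a block as the next revision of uid. */
    void store(long uid, byte[] block) {
        revisions.computeIfAbsent(uid, k -> new ArrayList<>()).add(block.clone());
    }

    /** Returns a stream over the requested revision, or null if it is not available. */
    DataInputStream elementAt(long uid, int revision) {
        List<byte[]> history = revisions.get(uid);
        if (history == null || revision < 0 || revision > history.size()) {
            return null; // unknown key or unavailable revision
        }
        // Revision 0 means "latest"; otherwise revisions are assumed to be numbered from 1.
        int index = (revision == 0) ? history.size() - 1 : revision - 1;
        return new DataInputStream(new ByteArrayInputStream(history.get(index)));
    }
}
```

Callers should check the result for null before reading from the stream.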


public void destroy()
Destroy this data structure. First, it calls close(). Then it removes these files from the directory where the structure is stored: bitmap, idocs, docs.

Specified by:
destroy in interface DataRepository


public boolean removeDoc(long uid)
Removes an element of the given uid.

Parameters:
uid - the element to remove
Returns:
true if successful, false otherwise
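
Since the constructor notes that information about removed elements is kept in a separate file, one plausible way to track removals is a bitmap with one bit per uid. The sketch below uses java.util.BitSet; the class RemovedBitmapSketch, the fixed capacity, and the choice to return false for an already removed or out-of-range uid are all assumptions, not documented behavior of DocumentsDB.

```java
import java.util.BitSet;

/** Hypothetical sketch of tracking removed elements with one bit per uid. */
final class RemovedBitmapSketch {
    private final BitSet removed = new BitSet();
    private final int size; // assumed number of uids the structure knows about

    RemovedBitmapSketch(int size) {
        this.size = size;
    }

    /** Marks uid as removed; returns false if it is out of range or already removed. */
    boolean removeDoc(long uid) {
        if (uid < 0 || uid >= size) {
            return false;
        }
        int bit = (int) uid; // safe: uid is below size, which is an int
        if (removed.get(bit)) {
            return false;
        }
        removed.set(bit);
        return true;
    }

    /** Reports whether uid has been marked as removed. */
    boolean isRemoved(long uid) {
        return uid >= 0 && uid < size && removed.get((int) uid);
    }
}
```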


public void close()
Closes the structure. It calls commit and then closes both data files (idocs and docs).

Specified by:
close in interface DataRepository


public DataRepository.TupleSequence elements()
The tuples are [long:uid;int:rev;Object:DataInputStream].

Specified by:
elements in interface DataRepository


public void flush()
Specified by:
flush in interface DataRepository


protected void finalize()
                 throws java.lang.Throwable
Close this structure and attempt garbage collection.

Overrides:
finalize in class java.lang.Object
Throws:
java.lang.Throwable - you never know what might happen!