org.egothor.repository
Class DocumentsDB

java.lang.Object
  extended by org.egothor.repository.DocumentsDB
All Implemented Interfaces:
DataRepository

public class DocumentsDB
extends java.lang.Object
implements DataRepository

DocumentsDB implements an on-disk document database. The structure consists of two files: the first is used as an index, and the second as a data store. To read the element with key uid, the algorithm works as follows:

  1. Read the offset of the element: firstfile.seek( uid*SIZEOFOFFSET ); offset = firstfile.read();
  2. Read the element: secondfile.seek( offset ); object = secondfile.readTheObject();
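
The two steps above can be sketched with java.io.RandomAccessFile. This is only an illustration of the index-plus-data-file idea, not the DocumentsDB implementation: the class name TwoFileLookupSketch, the 8-byte offset size, and the length-prefixed record layout are all assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the two-file layout; not taken from the Egothor sources.
public class TwoFileLookupSketch {
    // Assumption: every index slot holds one 8-byte offset (a Java long).
    static final int SIZEOF_OFFSET = Long.BYTES;

    /** Appends a record to the data file and stores its offset in index slot uid. */
    static void write(RandomAccessFile index, RandomAccessFile data, long uid, byte[] document)
            throws IOException {
        long offset = data.length();
        data.seek(offset);
        data.writeInt(document.length); // assumed record layout: length prefix, then payload
        data.write(document);
        index.seek(uid * SIZEOF_OFFSET);
        index.writeLong(offset);
    }

    /** Step 1: read the offset from the index. Step 2: read the element at that offset. */
    static byte[] read(RandomAccessFile index, RandomAccessFile data, long uid)
            throws IOException {
        index.seek(uid * SIZEOF_OFFSET);
        long offset = index.readLong();
        data.seek(offset);
        byte[] document = new byte[data.readInt()];
        data.readFully(document);
        return document;
    }

    public static void main(String[] args) throws IOException {
        File idx = File.createTempFile("doc", ".idx");
        File dta = File.createTempFile("doc", ".dta");
        try (RandomAccessFile index = new RandomAccessFile(idx, "rw");
             RandomAccessFile data = new RandomAccessFile(dta, "rw")) {
            write(index, data, 0, "first".getBytes(StandardCharsets.UTF_8));
            write(index, data, 1, "second".getBytes(StandardCharsets.UTF_8));
            System.out.println(new String(read(index, data, 1), StandardCharsets.UTF_8));
        } finally {
            idx.delete();
            dta.delete();
        }
    }
}
```

Because every index slot has the same size, a lookup costs one seek in each file regardless of how many elements are stored. The sketch does not distinguish an unwritten index slot from a valid offset of 0; a real structure needs some way to mark missing or removed elements.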

Author:
Leo Galambos

Nested Class Summary
 
Nested classes/interfaces inherited from interface org.egothor.repository.DataRepository
DataRepository.TupleSequence
 
Constructor Summary
DocumentsDB(java.lang.String location, boolean compressed)
          Constructor for the DocumentsDB object.
 
Method Summary
 int addItem(long uid, byte[] document, int length)
          Adds another document into the repository.
 void close()
          Closes the structure.
 void destroy()
          Destroys this data structure.
 DataInputStream elementAt(long uid, int revision)
          Retrieves a data block.
 DataRepository.TupleSequence elements()
          The tuples are [long:uid;int:rev;Object:DataInputStream].
protected  void finalize()
          Close this structure and attempt garbage collection.
 void flush()
           
 boolean removeDoc(long uid)
          Removes an element of the given uid.
 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DocumentsDB

public DocumentsDB(java.lang.String location,
                   boolean compressed)
Constructor for the DocumentsDB object. Opens the structure in the given directory. Three files are created, all named doc, with an extension that depends on the file's function: information about removed elements is stored in doc.btm, the first file (offsets) in doc.idx, and the second file (element data) in doc.dta.

Parameters:
location - the location where the files will be created
compressed - true iff the DocumentData will be saved in gzip format
Throws:
java.io.IOException - if an I/O error occurs
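
As an illustration of what the compressed flag implies for a stored block, the following sketch round-trips a byte array through gzip using the standard java.util.zip classes. How DocumentsDB applies compression internally is not documented here; the class GzipBlockSketch and its methods are hypothetical and only show the general technique.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; not part of org.egothor.repository.
public class GzipBlockSketch {
    /** Compresses a data block into gzip format. */
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so close before reading the buffer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        }
        return buffer.toByteArray();
    }

    /** Restores the original block from its gzip form (readAllBytes needs Java 9+). */
    static byte[] decompress(byte[] packed) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            return gzip.readAllBytes();
        }
    }
}
```

Compression trades CPU time on every read and write for a smaller data file, which tends to pay off for text documents and much less for content that is already compressed.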
Method Detail

addItem

public int addItem(long uid,
                   byte[] document,
                   int length)
Description copied from interface: DataRepository
Adds another document into the repository. If the implementation can recognize that the incoming data block is identical to the one already stored in the repository, it discards the insertion request and returns 0 to signal a no-op.

Specified by:
addItem in interface DataRepository
Returns:
revision number (0 iff unchanged), or -1 on failure

elementAt

public DataInputStream elementAt(long uid,
                                 int revision)
Description copied from interface: DataRepository
Retrieves a data block.

Specified by:
elementAt in interface DataRepository
Parameters:
uid - the key of the block
revision - revision number of the block, 0 is used for the latest (current) revision
Returns:
a stream over the data block, or null if the revision is not available (or the key is unknown)
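
The return-value conventions of addItem and elementAt can be illustrated with a small in-memory stand-in. This is not DocumentsDB and does not touch the disk; the class RevisionContractSketch, its 1-based revision numbering, and its notion of failure (an invalid length) are assumptions chosen to make the documented contract concrete.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in that mimics the documented return values.
public class RevisionContractSketch {
    private final Map<Long, List<byte[]>> revisions = new HashMap<>();

    /** Returns the new revision number, 0 if the block is unchanged, or -1 on failure. */
    int addItem(long uid, byte[] document, int length) {
        if (document == null || length < 0 || length > document.length) {
            return -1; // assumed failure case for this sketch
        }
        byte[] block = Arrays.copyOf(document, length);
        List<byte[]> history = revisions.computeIfAbsent(uid, k -> new ArrayList<>());
        if (!history.isEmpty() && Arrays.equals(history.get(history.size() - 1), block)) {
            return 0; // same as the latest stored block: no-op
        }
        history.add(block);
        return history.size(); // assumed numbering: the first stored revision is 1
    }

    /** Revision 0 selects the latest block; returns null for unknown keys or revisions. */
    DataInputStream elementAt(long uid, int revision) {
        List<byte[]> history = revisions.get(uid);
        if (history == null || history.isEmpty()) {
            return null;
        }
        int index = (revision == 0) ? history.size() - 1 : revision - 1;
        if (index < 0 || index >= history.size()) {
            return null;
        }
        return new DataInputStream(new ByteArrayInputStream(history.get(index)));
    }
}
```

Callers should therefore check for -1 after addItem and for null after elementAt before using the result.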

destroy

public void destroy()
Destroys this data structure. It first calls close() and then removes the structure's files (bitmap, idocs, docs) from the directory where the structure is stored.

Specified by:
destroy in interface DataRepository

removeDoc

public boolean removeDoc(long uid)
Removes an element of the given uid.

Parameters:
uid - the element to remove
Returns:
true if successful, false otherwise

close

public void close()
Closes the structure. It calls #commit and then it closes both data files (idocs and docs).

Specified by:
close in interface DataRepository
See Also:
#DocumentsDB(String)

elements

public DataRepository.TupleSequence elements()
The tuples are [long:uid;int:rev;Object:DataInputStream].

Specified by:
elements in interface DataRepository

flush

public void flush()
Specified by:
flush in interface DataRepository

finalize

protected void finalize()
                 throws java.lang.Throwable
Close this structure and attempt garbage collection.

Overrides:
finalize in class java.lang.Object
Throws:
java.lang.Throwable - you never know what might happen!