org.egothor.duplicity.algorithm
Class PermutatedMinsFiller

java.lang.Object
  extended by org.egothor.duplicity.algorithm.PermutatedMinsFiller
Direct Known Subclasses:
DocumentLevelPermutatedMinsFiller, ParagraphLevelPermutatedMinsFiller, SentenceLevelPermutatedMinsFiller

public abstract class PermutatedMinsFiller
extends java.lang.Object

Fills the DocumentPermutatedMins with values corresponding to the given sequence of tokens. The createNew factory methods always return the subclass appropriate for the configured duplicity checking level (Constants.CHECK_DUPLICITY_LEVEL).

Author:
Kateřina Dufková

Constructor Summary
protected PermutatedMinsFiller(long seed)
          Creates the permutations.
 
Method Summary
 void computeDocumentMins(DocumentPermutatedMins result, Sequence<Token> terms, long documentUID, int documentDBRevision)
          Computes the permutated mins values for the given sequence of tokens of a document and stores them in the result under the identifier documentUID.
static PermutatedMinsFiller createNew()
          Ensures that the right child is created according to the duplicity checking algorithm level set in Constants.CHECK_DUPLICITY_LEVEL.
static PermutatedMinsFiller createNew(long seed)
          Ensures that the right child is created according to the duplicity checking algorithm level set in Constants.CHECK_DUPLICITY_LEVEL.
 long getSeed()
           
protected abstract  void insertMins(DocumentPermutatedMins result, long documentUID, int documentDBRevision, short paragraph, short sentenceInParagraph, long[] mins)
          Inserts all mins to result.
protected abstract  boolean newUnit(short lastParagraph, int lastSentence, short paragraph, int sentence)
          Returns true if a new unit on the given duplicity checking algorithm level occurred.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PermutatedMinsFiller

protected PermutatedMinsFiller(long seed)
Creates the permutations.

Method Detail

createNew

public static PermutatedMinsFiller createNew(long seed)
Ensures that the right child is created according to the duplicity checking algorithm level set in Constants.CHECK_DUPLICITY_LEVEL.

Returns:
PermutatedMinsFiller

createNew

public static PermutatedMinsFiller createNew()
Ensures that the right child is created according to the duplicity checking algorithm level set in Constants.CHECK_DUPLICITY_LEVEL.

Returns:
PermutatedMinsFiller
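
Both createNew overloads act as a factory that hides which subclass gets instantiated. The following is a minimal, self-contained sketch of that kind of dispatch, not the library's actual code. The Level enum, the stand-in Filler classes, and passing the level as an argument are assumptions made for illustration; the real methods read Constants.CHECK_DUPLICITY_LEVEL, whose type is not shown on this page.

```java
class FillerFactorySketch {
    // Hypothetical stand-in for the value held in Constants.CHECK_DUPLICITY_LEVEL.
    enum Level { DOCUMENT, PARAGRAPH, SENTENCE }

    // Hypothetical stand-in for PermutatedMinsFiller; it only keeps the seed.
    static abstract class Filler {
        private final long seed;
        protected Filler(long seed) { this.seed = seed; }
        long getSeed() { return seed; }
    }

    static final class DocumentLevelFiller extends Filler {
        DocumentLevelFiller(long seed) { super(seed); }
    }

    static final class ParagraphLevelFiller extends Filler {
        ParagraphLevelFiller(long seed) { super(seed); }
    }

    static final class SentenceLevelFiller extends Filler {
        SentenceLevelFiller(long seed) { super(seed); }
    }

    // Picks the subclass that matches the requested checking level.
    static Filler createNew(Level level, long seed) {
        switch (level) {
            case DOCUMENT:  return new DocumentLevelFiller(seed);
            case PARAGRAPH: return new ParagraphLevelFiller(seed);
            case SENTENCE:  return new SentenceLevelFiller(seed);
            default: throw new IllegalArgumentException("Unknown level: " + level);
        }
    }

    public static void main(String[] args) {
        Filler filler = createNew(Level.PARAGRAPH, 42L);
        System.out.println(filler.getClass().getSimpleName() + " seed=" + filler.getSeed());
    }
}
```

Callers depend only on the abstract type, so switching the checking level does not require changes at the call site.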

getSeed

public long getSeed()

computeDocumentMins

public void computeDocumentMins(DocumentPermutatedMins result,
                                Sequence<Token> terms,
                                long documentUID,
                                int documentDBRevision)
                         throws DuplicityCheckingException
Computes the permutated mins values for the given sequence of tokens of a document and stores them in the result under the identifier documentUID.

Parameters:
result - instance of DocumentPermutatedMins to be filled with results
terms - sequence of tokens of the document
documentUID - identifier of the document
documentDBRevision - revision number of the document in the document database
Throws:
DuplicityCheckingException - if the created DocumentPermutatedMins is empty
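
The idea of "permutated mins" can be illustrated with a small min-wise hashing sketch: each token hash is sent through several seeded permutations, and only the minimum per permutation is kept. This page does not document how the class builds its permutations, so everything below is an assumption for illustration: the MinHashSketch class, the linear permutation a * h + b, the use of String.hashCode() for tokens, the Long.MAX_VALUE starting value, and the IllegalArgumentException standing in for DuplicityCheckingException.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class MinHashSketch {
    private final long[] a; // odd multipliers, one per permutation
    private final long[] b; // offsets, one per permutation

    // The seed fully determines the permutations, so equal seeds give comparable mins.
    MinHashSketch(long seed, int permutations) {
        Random rnd = new Random(seed);
        a = new long[permutations];
        b = new long[permutations];
        for (int i = 0; i < permutations; i++) {
            a[i] = rnd.nextLong() | 1L; // an odd multiplier is a bijection on 64-bit values
            b[i] = rnd.nextLong();
        }
    }

    // Returns, for every permutation, the smallest permuted token hash.
    long[] computeMins(List<String> tokens) {
        if (tokens.isEmpty()) {
            // Stand-in for the documented DuplicityCheckingException on an empty result.
            throw new IllegalArgumentException("no tokens, nothing to compute");
        }
        long[] mins = new long[a.length];
        Arrays.fill(mins, Long.MAX_VALUE); // starting value chosen for this sketch only
        for (String token : tokens) {
            long h = token.hashCode();
            for (int i = 0; i < a.length; i++) {
                long permuted = a[i] * h + b[i]; // wraps modulo 2^64
                if (permuted < mins[i]) {
                    mins[i] = permuted;
                }
            }
        }
        return mins;
    }

    // Fraction of positions where both signatures agree; estimates set resemblance.
    static double similarity(long[] x, long[] y) {
        int equal = 0;
        for (int i = 0; i < x.length; i++) {
            if (x[i] == y[i]) equal++;
        }
        return (double) equal / x.length;
    }

    public static void main(String[] args) {
        MinHashSketch sketch = new MinHashSketch(1234L, 64);
        long[] first = sketch.computeMins(Arrays.asList("the", "quick", "brown", "fox"));
        long[] second = sketch.computeMins(Arrays.asList("the", "quick", "red", "fox"));
        System.out.println("estimated resemblance: " + similarity(first, second));
    }
}
```

Two fillers can only produce comparable values if they share a seed, which may be why the class exposes getSeed() and a createNew(long seed) overload.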

insertMins

protected abstract void insertMins(DocumentPermutatedMins result,
                                   long documentUID,
                                   int documentDBRevision,
                                   short paragraph,
                                   short sentenceInParagraph,
                                   long[] mins)
Inserts all mins into result, then sets every element of mins to Integer.MAX_VALUE.

Parameters:
result - DocumentPermutatedMins to which the values will be added
documentUID - identifier of the document
documentDBRevision - revision number of the document in the document database
paragraph - ordinal number of paragraph
sentenceInParagraph - ordinal number of sentence in the paragraph
mins - minimums
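
The insert-then-reset contract described above implies that mins is a working buffer reused for the next text unit. A minimal sketch of that pattern follows. The Store class is a hypothetical stand-in for DocumentPermutatedMins, and the document and position parameters are left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InsertMinsSketch {
    // Hypothetical stand-in for DocumentPermutatedMins: one stored row per text unit.
    static final class Store {
        final List<long[]> rows = new ArrayList<>();
    }

    // Stores a copy of the current mins, then resets the buffer for the next unit.
    static void insertMins(Store result, long[] mins) {
        result.rows.add(mins.clone());               // copy, because the caller reuses mins
        Arrays.fill(mins, (long) Integer.MAX_VALUE); // reset value named in the description
    }

    public static void main(String[] args) {
        Store store = new Store();
        long[] mins = {5L, -3L, 9L};
        insertMins(store, mins);
        System.out.println("stored=" + Arrays.toString(store.rows.get(0))
                + " buffer=" + Arrays.toString(mins));
    }
}
```

Copying before the reset matters in this sketch: storing the array reference itself would let the reset overwrite the values that were just inserted.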

newUnit

protected abstract boolean newUnit(short lastParagraph,
                                   int lastSentence,
                                   short paragraph,
                                   int sentence)
Returns true if a new unit on the given duplicity checking algorithm level occurred.

Parameters:
lastParagraph -
lastSentence -
paragraph -
sentence -
Returns:
true on new text unit
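
How each subclass decides on a unit boundary is not documented on this page. The sketch below shows one plausible reading of the three levels, with the UnitRule interface and all three rules as assumptions: the document level never starts a new unit, the paragraph level reacts to a changed paragraph number, and the sentence level reacts to a changed paragraph or sentence number.

```java
class NewUnitSketch {
    // Hypothetical functional stand-in for the abstract newUnit method.
    interface UnitRule {
        boolean newUnit(short lastParagraph, int lastSentence, short paragraph, int sentence);
    }

    // Assumed: the whole document is a single unit, so no boundary ever occurs.
    static final UnitRule DOCUMENT = (lastP, lastS, p, s) -> false;

    // Assumed: a boundary occurs whenever the paragraph number changes.
    static final UnitRule PARAGRAPH = (lastP, lastS, p, s) -> p != lastP;

    // Assumed: a boundary occurs when either the paragraph or the sentence number changes.
    static final UnitRule SENTENCE = (lastP, lastS, p, s) -> p != lastP || s != lastS;

    public static void main(String[] args) {
        // Same paragraph, next sentence: only the sentence-level rule reports a new unit.
        System.out.println(DOCUMENT.newUnit((short) 1, 0, (short) 1, 1));
        System.out.println(PARAGRAPH.newUnit((short) 1, 0, (short) 1, 1));
        System.out.println(SENTENCE.newUnit((short) 1, 0, (short) 1, 1));
    }
}
```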