org.egothor.duplicity.file
Class SimilarUnitPairsFile

java.lang.Object
  extended by org.egothor.duplicity.file.DuplicityCheckingFile
      extended by org.egothor.duplicity.file.CommonSimilarUnitPairsFile
          extended by org.egothor.duplicity.file.MergeableSimilarUnitPairsFile
              extended by org.egothor.duplicity.file.SimilarUnitPairsFile

public class SimilarUnitPairsFile
extends MergeableSimilarUnitPairsFile

Represents the "similar unit pairs" file used in the duplicity checking algorithm. The file contains instances of the UnitPair class, that is, pairs {first, second}, where first and second are identifiers of the units being checked for duplicity (a unit can be a document, paragraph, or sentence). The file is sorted: the primary sort key is the first field, and ties are broken by the second field.
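
The sort order described above can be illustrated with a small, self-contained sketch. The Pair class below is a hypothetical stand-in for UnitPair, and the use of long identifiers is an assumption; the real class may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class PairOrderSketch {

    /** Hypothetical stand-in for UnitPair: two unit identifiers (long is assumed). */
    static final class Pair {
        final long first;
        final long second;

        Pair(long first, long second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public String toString() {
            return "{" + first + ", " + second + "}";
        }
    }

    /** Orders pairs by the first field, breaking ties on the second field. */
    static final Comparator<Pair> FILE_ORDER = new Comparator<Pair>() {
        @Override
        public int compare(Pair a, Pair b) {
            int byFirst = Long.compare(a.first, b.first);
            return byFirst != 0 ? byFirst : Long.compare(a.second, b.second);
        }
    };

    /** Returns a sorted copy of the given pairs, leaving the input untouched. */
    static List<Pair> sorted(List<Pair> pairs) {
        List<Pair> copy = new ArrayList<Pair>(pairs);
        Collections.sort(copy, FILE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<Pair> pairs = Arrays.asList(new Pair(7, 2), new Pair(3, 9), new Pair(3, 4));
        System.out.println(sorted(pairs)); // prints [{3, 4}, {3, 9}, {7, 2}]
    }
}
```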

The file should be used as follows.

  1. First it should be created using SimilarUnitPairsFileProducer or read from the filesystem.
  2. It can be merged with another file corresponding to the same permutation by calling the merge(org.egothor.duplicity.file.SimilarUnitPairsFile, org.egothor.duplicity.file.SimilarUnitPairsTempFile) method.

Author:
Kateřina Dufková

Nested Class Summary
 
Nested classes/interfaces inherited from class org.egothor.duplicity.file.DuplicityCheckingFile
DuplicityCheckingFile.TempFile
 
Field Summary
 
Fields inherited from class org.egothor.duplicity.file.MergeableSimilarUnitPairsFile
lastLoading
 
Fields inherited from class org.egothor.duplicity.file.CommonSimilarUnitPairsFile
permID
 
Fields inherited from class org.egothor.duplicity.file.DuplicityCheckingFile
location, out
 
Constructor Summary
  SimilarUnitPairsFile(long permID, java.lang.String location)
          Initializes a file already written to the filesystem.
protected SimilarUnitPairsFile(SimilarUnitPairsFileProducer p, java.lang.String location)
          Initializes the file by writing the producer's content to disk.
 
Method Summary
protected  void checkPermutation(long permID1, long permID2)
          Checks if files with given permutations can be merged.
protected  void createOut()
          Creates permanent file and sets the out field.
 java.lang.String getFilename()
          Returns the filename corresponding to this file.
 void merge(CommonSimilarUnitPairsFile supf)
          Merges two files externally, on filesystem.
 void merge(SimilarUnitPairsFile supfr, SimilarUnitPairsTempFile supft)
          Merges three files externally, on filesystem.
 
Methods inherited from class org.egothor.duplicity.file.MergeableSimilarUnitPairsFile
mergeAll, mergeAll, openInputs
 
Methods inherited from class org.egothor.duplicity.file.CommonSimilarUnitPairsFile
dump, getPermID, hasTheSameContent, remove, toString
 
Methods inherited from class org.egothor.duplicity.file.DuplicityCheckingFile
createPermOut, createTempOut, delete, dump, getLocation, getNewTempFile, getOut, hasTheSameContent, initFromProducer, openOut, remove
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

SimilarUnitPairsFile

protected SimilarUnitPairsFile(SimilarUnitPairsFileProducer p,
                               java.lang.String location)
                        throws java.io.IOException
Initializes the file by writing the producer's content to disk. Sets the permID field, creates the file on the filesystem, and fills it with data from the producer object. The filename has the form returned by getFilename(), and the file is created in the location directory, which must already exist. If the file already exists, an exception is thrown, because creating it would overwrite existing data.

Parameters:
p - producer object
location - path and name of the directory in which the file will be created; must end with "/"
Throws:
java.io.IOException - if the file could not be created, the file already exists or writing data to the file failed
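
The "fail if the file already exists" behaviour can be sketched with the standard java.nio.file API. This is a minimal illustration of the idea, not the library's implementation; the class name, method names, and the file name pairs.bin are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CreateNewSketch {

    /** Writes data to a new file; fails if the file already exists. */
    static void writeNew(Path file, byte[] data) throws IOException {
        // CREATE_NEW makes the open fail with FileAlreadyExistsException
        // (a subclass of IOException) instead of truncating an existing file.
        try (OutputStream out = Files.newOutputStream(
                file, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            out.write(data);
        }
    }

    /** Returns true if a second write to the same path is rejected. */
    static boolean secondWriteRejected() {
        try {
            Path dir = Files.createTempDirectory("supf-sketch");
            Path file = dir.resolve("pairs.bin"); // hypothetical file name
            try {
                writeNew(file, new byte[] {1, 2, 3});
                try {
                    writeNew(file, new byte[] {4, 5, 6});
                    return false; // existing data would have been overwritten
                } catch (FileAlreadyExistsException expected) {
                    return true;
                }
            } finally {
                // Clean up the temporary file and directory.
                Files.deleteIfExists(file);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(secondWriteRejected());
    }
}
```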

SimilarUnitPairsFile

public SimilarUnitPairsFile(long permID,
                            java.lang.String location)
                     throws java.io.FileNotFoundException
Initializes a file that has already been written to the filesystem. Sets the permID and location fields and tries to open the file on the filesystem. The filename has the form returned by getFilename(), and the file is looked up in the location directory.

Parameters:
permID - identification of the permutation to be assigned to this file
location - path and name of the directory in which the file is looked up; must end with "/"
Throws:
java.io.FileNotFoundException - if the file does not exist
Method Detail

merge

public void merge(SimilarUnitPairsFile supfr,
                  SimilarUnitPairsTempFile supft)
           throws java.io.IOException,
                  MergeException
Merges three files externally, on the filesystem. The files must correspond to the same permutation.

Parameters:
supfr - regular file to be merged into this (SimilarUnitPairsFile)
supft - temporary file to be merged into this (SimilarUnitPairsTempFile)
Throws:
MergeException - on attempt to merge files corresponding to different permutations or if temporary file could not be created
java.io.IOException

merge

public void merge(CommonSimilarUnitPairsFile supf)
           throws java.io.IOException,
                  MergeException
Merges two files externally, on the filesystem. The files must correspond to the same permutation.

Parameters:
supf - file to be merged into this. Can be regular (SimilarUnitPairsFile) or temporary (SimilarUnitPairsTempFile).
Throws:
MergeException - on attempt to merge files corresponding to different permutations or if temporary file could not be created
java.io.IOException
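
Both merge overloads combine files that are already sorted by (first, second). As a rough sketch of that idea, the code below merges any number of sorted inputs in a single pass using a priority queue. It works on in-memory lists for brevity, whereas the real methods operate on files; it also keeps duplicate pairs, and whether the library removes them is not stated here. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class SortedMergeSketch {

    /** Compares {first, second} pairs by first, then by second. */
    static final Comparator<long[]> PAIR_ORDER = new Comparator<long[]>() {
        @Override
        public int compare(long[] a, long[] b) {
            int byFirst = Long.compare(a[0], b[0]);
            return byFirst != 0 ? byFirst : Long.compare(a[1], b[1]);
        }
    };

    /** One input being merged: its current head pair and the remaining pairs. */
    static final class Source {
        long[] head;
        final Iterator<long[]> rest;

        Source(Iterator<long[]> it) {
            this.rest = it;
            this.head = it.next();
        }
    }

    /** Merges already-sorted inputs into one sorted list, reading each input once. */
    static List<long[]> merge(List<List<long[]>> inputs) {
        PriorityQueue<Source> queue = new PriorityQueue<Source>(
                Math.max(1, inputs.size()),
                new Comparator<Source>() {
                    @Override
                    public int compare(Source a, Source b) {
                        return PAIR_ORDER.compare(a.head, b.head);
                    }
                });
        for (List<long[]> input : inputs) {
            if (!input.isEmpty()) {
                queue.add(new Source(input.iterator()));
            }
        }
        List<long[]> out = new ArrayList<long[]>();
        while (!queue.isEmpty()) {
            Source s = queue.poll();       // input with the smallest head pair
            out.add(s.head);
            if (s.rest.hasNext()) {        // advance that input and re-queue it
                s.head = s.rest.next();
                queue.add(s);
            }
        }
        return out;
    }

    /** Renders pairs as text, for display only. */
    static String format(List<long[]> pairs) {
        StringBuilder sb = new StringBuilder();
        for (long[] p : pairs) {
            sb.append(Arrays.toString(p));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<long[]> a = Arrays.asList(new long[] {1, 5}, new long[] {4, 2});
        List<long[]> b = Arrays.asList(new long[] {1, 3}, new long[] {9, 9});
        List<long[]> c = Arrays.asList(new long[] {4, 1});
        // prints [1, 3][1, 5][4, 1][4, 2][9, 9]
        System.out.println(format(merge(Arrays.asList(a, b, c))));
    }
}
```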

getFilename

public java.lang.String getFilename()
Returns the filename corresponding to this file. The location and permID fields MUST already be set. The file resides in the directory given by the location field, and its name has the form Constants.SIMILAR_UNIT_PAIRS_FILE_PREFIX<permID>.

Specified by:
getFilename in class DuplicityCheckingFile
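
The naming scheme can be sketched as follows. The prefix value "pairs_" is a placeholder, since the actual value of Constants.SIMILAR_UNIT_PAIRS_FILE_PREFIX is not given here; the sketch also assumes that the returned string includes the directory and that a location without a trailing "/" is rejected, neither of which is confirmed by this page.

```java
public class FilenameSketch {

    // Placeholder only: the real Constants.SIMILAR_UNIT_PAIRS_FILE_PREFIX value is not shown here.
    static final String PREFIX = "pairs_";

    /** Builds <location><prefix><permID>; location is expected to end with "/". */
    static String filename(String location, long permID) {
        if (location == null || !location.endsWith("/")) {
            // Assumed check: the documentation only says the location must end with "/".
            throw new IllegalArgumentException("location must end with \"/\"");
        }
        return location + PREFIX + permID;
    }

    public static void main(String[] args) {
        System.out.println(filename("/tmp/duplicity/", 42L)); // prints /tmp/duplicity/pairs_42
    }
}
```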

createOut

protected void createOut()
                  throws java.io.IOException
Creates permanent file and sets the out field. Uses the DuplicityCheckingFile.createPermOut() method.

Specified by:
createOut in class DuplicityCheckingFile
Throws:
java.io.IOException - if the file already exists or could not be created
See Also:
DuplicityCheckingFile.createPermOut()

checkPermutation

protected void checkPermutation(long permID1,
                                long permID2)
                         throws MergeException
Checks if files with given permutations can be merged.

Specified by:
checkPermutation in class MergeableSimilarUnitPairsFile
Parameters:
permID1 - id of first permutation
permID2 - id of second permutation
Throws:
MergeException - on attempt to merge files corresponding to different permutations
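
A minimal sketch of such a check, assuming that two files can be merged exactly when their permutation IDs are equal (the documentation above only says that different permutations are rejected). MergeSketchException is a hypothetical stand-in for the library's MergeException.

```java
public class PermutationCheckSketch {

    /** Hypothetical stand-in for the library's MergeException. */
    static class MergeSketchException extends Exception {
        private static final long serialVersionUID = 1L;

        MergeSketchException(String message) {
            super(message);
        }
    }

    /** Throws if the two permutation IDs differ (assumed merge condition). */
    static void checkSamePermutation(long permID1, long permID2) throws MergeSketchException {
        if (permID1 != permID2) {
            throw new MergeSketchException(
                    "Cannot merge files of permutations " + permID1 + " and " + permID2);
        }
    }

    /** Convenience wrapper that reports the check result as a boolean. */
    static boolean canMerge(long permID1, long permID2) {
        try {
            checkSamePermutation(permID1, permID2);
            return true;
        } catch (MergeSketchException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canMerge(5L, 5L)); // prints true
        System.out.println(canMerge(5L, 6L)); // prints false
    }
}
```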