org.egothor.duplicity.file
Class PermutatedMinsFile

java.lang.Object
  extended by org.egothor.duplicity.file.DuplicityCheckingFile
      extended by org.egothor.duplicity.file.PermutatedMinsFile

public class PermutatedMinsFile
extends DuplicityCheckingFile

Represents the "minimums of permuted unit identifiers" file used in the duplicity checking algorithm. The file contains instances of the UnitPermutatedMin class, that is, pairs {min(pi(T(d))), d}, held in the min and unitID fields respectively.

The file is sorted: the primary criterion is the min field, and ties are broken by the unitID field.
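The sort order described above can be sketched as a two-level comparator. This is only an illustration: MinEntry is a hypothetical stand-in for UnitPermutatedMin, and both fields are assumed to be plain long values, which may not match the real classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for UnitPermutatedMin; the real class may differ.
final class MinEntry {
    final long min;     // minimum of the permuted identifiers
    final long unitID;  // identifier of the unit (assumed to be a long here)

    MinEntry(long min, long unitID) {
        this.min = min;
        this.unitID = unitID;
    }
}

final class MinEntryOrder {
    // Primary key: min; ties are broken by unitID.
    static final Comparator<MinEntry> BY_MIN_THEN_UNIT =
            Comparator.comparingLong((MinEntry e) -> e.min)
                      .thenComparingLong(e -> e.unitID);

    // Returns a sorted copy and leaves the input list untouched.
    static List<MinEntry> sorted(List<MinEntry> entries) {
        List<MinEntry> copy = new ArrayList<>(entries);
        copy.sort(BY_MIN_THEN_UNIT);
        return copy;
    }
}
```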

The file should be used as follows.
  1. First it should be created using PermutatedMinsFileProducer or read from the filesystem.
  2. It can be merged with another file corresponding to the same permutation by calling the merge(org.egothor.duplicity.file.PermutatedMinsFile) method.

Author:
Kateřina Dufková

Nested Class Summary
 
Nested classes/interfaces inherited from class org.egothor.duplicity.file.DuplicityCheckingFile
DuplicityCheckingFile.TempFile
 
Field Summary
 
Fields inherited from class org.egothor.duplicity.file.DuplicityCheckingFile
location, out
 
Constructor Summary
PermutatedMinsFile(long permID, java.lang.String location)
          Initializes a file already written to the filesystem.
 
Method Summary
protected  void createOut()
          Creates the permanent file and sets the out field.
 SimilarUnitPairsFile createSimilarUnitPairsFile()
          Creates the SimilarUnitPairsFile corresponding to this file.
 java.lang.String dump()
          Dumps the content of the file to string.
 java.lang.String getFilename()
          Returns the filename corresponding to this file.
 long getPermID()
           
 SimilarUnitPairsTempFile getSimilarities(PermutatedMinsFile mpuf)
          Merges two files externally, on the filesystem.
 boolean hasTheSameContent(DuplicityCheckingFile file)
          Checks whether two files have the same content.
 void merge(PermutatedMinsFile mpuf)
          Merges two files externally, on the filesystem.
 void remove(java.util.Set<DocumentUnitID> toRemove)
          Removes all occurrences of the documents given in the set from the file.
 java.lang.String toString()
           
 
Methods inherited from class org.egothor.duplicity.file.DuplicityCheckingFile
createPermOut, createTempOut, delete, dump, getLocation, getNewTempFile, getOut, hasTheSameContent, initFromProducer, openOut, remove
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

PermutatedMinsFile

public PermutatedMinsFile(long permID,
                          java.lang.String location)
                   throws java.io.FileNotFoundException
Initializes a file already written to the filesystem. Sets the permID and location fields and checks whether a corresponding file exists on the filesystem. The filename has the form returned by getFilename() and is looked up in the location directory.

Parameters:
permID - identification of the permutation to be assigned to this file
location - path and name of the directory in which the file is looked up; must end with "/"
Throws:
java.io.FileNotFoundException - if the file does not exist in the location directory
Method Detail

getFilename

public java.lang.String getFilename()
Returns the filename corresponding to this file. The location and permID fields MUST already be set. The file resides in the directory given by the location field, and its name has the form Constants.PERMUTATED_MINS_FILE_PREFIX<permID>.

Specified by:
getFilename in class DuplicityCheckingFile
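As a rough illustration of the naming scheme, the filename could be assembled as follows. The prefix value used here is an assumption, since the actual value of Constants.PERMUTATED_MINS_FILE_PREFIX is not given on this page, and FilenameSketch is a hypothetical helper rather than part of the library.

```java
final class FilenameSketch {
    // Assumed value; the real Constants.PERMUTATED_MINS_FILE_PREFIX is not shown here.
    static final String ASSUMED_PREFIX = "permutated_mins_";

    // Builds <location><prefix><permID>; location must end with "/".
    static String filename(String location, long permID) {
        if (!location.endsWith("/")) {
            throw new IllegalArgumentException(
                    "location must end with \"/\": " + location);
        }
        return location + ASSUMED_PREFIX + permID;
    }
}
```

With the assumed prefix, filename("/tmp/dup/", 7) yields "/tmp/dup/permutated_mins_7".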

getPermID

public long getPermID()

createOut

protected void createOut()
                  throws java.io.IOException
Creates the permanent file and sets the out field. Uses the DuplicityCheckingFile.createPermOut() method.

Specified by:
createOut in class DuplicityCheckingFile
Throws:
java.io.IOException - if the file already exists or could not be created
See Also:
DuplicityCheckingFile.createPermOut()

createSimilarUnitPairsFile

public SimilarUnitPairsFile createSimilarUnitPairsFile()
Creates the SimilarUnitPairsFile corresponding to this file. It will contain all pairs of units that have the same minimum.

Returns:
instance of SimilarUnitPairsFile written to the filesystem.
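The idea of collecting all unit pairs that share a minimum can be sketched in memory as below. This is not the library's implementation, which works on files: SameMinPairs is a hypothetical helper, entries are modelled as {min, unitID} rows of long values, and the input is assumed to be sorted by min and then unitID.

```java
import java.util.ArrayList;
import java.util.List;

final class SameMinPairs {
    // entries: rows of {min, unitID}, already sorted by min, then unitID.
    // Returns rows of {unitA, unitB} for every two entries sharing a min.
    static List<long[]> pairs(long[][] entries) {
        List<long[]> result = new ArrayList<>();
        int start = 0;
        while (start < entries.length) {
            // Find the end of the run of entries with the same min.
            int end = start;
            while (end < entries.length && entries[end][0] == entries[start][0]) {
                end++;
            }
            // Emit every unordered pair inside the run.
            for (int i = start; i < end; i++) {
                for (int j = i + 1; j < end; j++) {
                    result.add(new long[] {entries[i][1], entries[j][1]});
                }
            }
            start = end;
        }
        return result;
    }
}
```

Because the input is sorted, equal minimums are adjacent, so a single pass over the entries is enough to find every run.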

merge

public void merge(PermutatedMinsFile mpuf)
           throws java.io.IOException,
                  MergeException
Merges two files externally, on the filesystem. The files must correspond to the same permutation.

Parameters:
mpuf - file to be merged into this
Throws:
MergeException - on attempt to merge files corresponding to different permutations or if temporary file could not be created
java.io.IOException
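The merge of two sorted files can be pictured as a classic two-way merge. The sketch below works on in-memory arrays instead of files and is only an assumption about the general approach: SortedMergeSketch is hypothetical, entries are {min, unitID} rows of long values, and an IllegalArgumentException stands in for MergeException.

```java
final class SortedMergeSketch {
    // Orders rows of {min, unitID} by min, then by unitID.
    static int compare(long[] a, long[] b) {
        int c = Long.compare(a[0], b[0]);
        return c != 0 ? c : Long.compare(a[1], b[1]);
    }

    // Merges two sorted inputs that belong to the same permutation.
    static long[][] merge(long leftPermID, long[][] left,
                          long rightPermID, long[][] right) {
        if (leftPermID != rightPermID) {
            // The real method signals this case with MergeException.
            throw new IllegalArgumentException("different permutations");
        }
        long[][] out = new long[left.length + right.length][];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            out[k++] = compare(left[i], right[j]) <= 0 ? left[i++] : right[j++];
        }
        while (i < left.length) out[k++] = left[i++];   // rest of left
        while (j < right.length) out[k++] = right[j++]; // rest of right
        return out;
    }
}
```

Each input is read once from start to end, which is what makes the same pattern usable on files too large to hold in memory.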

getSimilarities

public SimilarUnitPairsTempFile getSimilarities(PermutatedMinsFile mpuf)
                                         throws java.io.IOException,
                                                MergeException
Merges two files externally, on the filesystem. The files must correspond to the same permutation.

Parameters:
mpuf - file to be merged into this
Returns:
SimilarUnitPairsFile with similarities between the duplicity checking units in the merged files, according to the permutation with identifier this.permID. Contains all pairs of unitIDs from different merged files whose mins are equal
Throws:
MergeException - on attempt to merge files corresponding to different permutations or if temporary file could not be created
java.io.IOException
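The returned similarities, pairs of units from different files with equal mins, correspond to a merge join over the two sorted inputs. The following in-memory sketch covers only that pair extraction, not the merging of the files themselves. CrossFilePairs is a hypothetical helper, and entries are again assumed to be {min, unitID} rows of long values sorted by min.

```java
import java.util.ArrayList;
import java.util.List;

final class CrossFilePairs {
    // a, b: rows of {min, unitID}, each sorted by min.
    // Returns rows of {unitFromA, unitFromB} whose mins are equal.
    static List<long[]> pairs(long[][] a, long[][] b) {
        List<long[]> result = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i][0] < b[j][0]) {
                i++;
            } else if (a[i][0] > b[j][0]) {
                j++;
            } else {
                long min = a[i][0];
                // Delimit the runs with this min in both inputs.
                int iEnd = i;
                while (iEnd < a.length && a[iEnd][0] == min) iEnd++;
                int jEnd = j;
                while (jEnd < b.length && b[jEnd][0] == min) jEnd++;
                // Pair every unit of a's run with every unit of b's run.
                for (int x = i; x < iEnd; x++) {
                    for (int y = j; y < jEnd; y++) {
                        result.add(new long[] {a[x][1], b[y][1]});
                    }
                }
                i = iEnd;
                j = jEnd;
            }
        }
        return result;
    }
}
```

Pairs of units that come from the same input are deliberately not reported here, matching the "from different merged files" wording above.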

toString

public java.lang.String toString()
Overrides:
toString in class DuplicityCheckingFile

dump

public java.lang.String dump()
Dumps the content of the file to string.

Returns:
string representation of the file including its content

remove

public void remove(java.util.Set<DocumentUnitID> toRemove)
            throws java.io.IOException
Removes all occurrences of the documents given in the set from the file.

Parameters:
toRemove - set of document ids to remove
Throws:
java.io.IOException
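Conceptually, removal is a filter that keeps the remaining entries in their original order, so the result stays sorted. The sketch below shows this in memory under stated assumptions: RemoveSketch is hypothetical, entries are {min, unitID} rows, and identifiers are modelled as Long values rather than DocumentUnitID instances.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class RemoveSketch {
    // entries: rows of {min, unitID}; toRemove: identifiers to drop.
    // Returns the surviving rows in their original (sorted) order.
    static long[][] without(long[][] entries, Set<Long> toRemove) {
        List<long[]> kept = new ArrayList<>();
        for (long[] entry : entries) {
            if (!toRemove.contains(entry[1])) {
                kept.add(entry);
            }
        }
        return kept.toArray(new long[0][]);
    }
}
```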

hasTheSameContent

public boolean hasTheSameContent(DuplicityCheckingFile file)
Checks whether two files have the same content.

Parameters:
file - the second file to be tested
Returns:
true, if the file contents are the same