java.lang.Object org.egothor.Constants
public class Constants
Major constant values of the core engine.
Nested Class Summary | |
---|---|
static class |
Constants.CheckDuplicityLevel
Enum of the possible levels of the duplicity checking algorithm. |
Field Summary | |
---|---|
static java.lang.String |
BITTOKEN
Tag used for bitmaps stored in the index. |
static boolean |
CHECK_DUPLICITY
Flag indicating whether the duplicity checking algorithm is used by default. |
static java.lang.String |
CHECK_DUPLICITY_DIR
The name of the directory for the duplicity checking algorithm files. |
static java.lang.String |
CHECK_DUPLICITY_INDEX_DIR
The name of the directory for the duplicity checking algorithm reports. |
static Constants.CheckDuplicityLevel |
CHECK_DUPLICITY_LEVEL
Current level of the duplicity checking algorithm. |
static boolean |
CHECK_DUPLICITY_ON_NGRAMS
Flag indicating whether the duplicity checking algorithm works with N-grams of words. |
static int |
CHECK_DUPLICITY_PERM_CHUNK_BITS
Number of bits of the permutation chunks used in the duplicity checking algorithm. |
static int |
CHECK_DUPLICITY_PERM_CHUNKS
Number of permutation chunks that together form one logical permutation used in the duplicity checking algorithm. |
static int |
CHECK_DUPLICITY_PERM_NUM
Number of permutations used in the duplicity checking algorithm. |
static java.lang.String |
CHECK_DUPLICITY_REPORT_DIR
The name of the directory for the duplicity checking algorithm reports. |
static java.lang.String |
CHECK_DUPLICITY_TEMP_DIR
The name of the directory for the duplicity checking algorithm temporary files. |
static boolean |
CHECK_PARAGRAPHS
Flag indicating whether ParagraphPunctFilter should be used. |
static java.lang.String |
CONST_FILE_BEGINNING_POSTFIX
Postfix of const files that specify the time of the request for constancy of the index. |
static java.lang.String |
CONST_FILE_PREFIX
|
static java.lang.String |
DEADBARRELS_FILENAME
Name of the file where directory numbers of dead barrels are saved. |
static int |
DEFAULTMODEL
What model is used for querying. |
static int |
DOCINVSIZE
How many terms we assume in a document. |
static int |
DOCSCACHE
How many documents are cached in each barrel during querying phase? |
static double |
DUPLICATE_TRESHOLD
The threshold for the mean value of the Jaccard coefficient (divided by the number of permutations, see CHECK_DUPLICITY_PERM_NUM)
for all textual units of the document. |
static int |
FIRSTPARAGRAPH
Number of the first paragraph in a document. |
static int |
FIRSTSENTENCE
Number of the first sentence in a document. |
static long |
FIRSTUID
Number of the first document in a collection (barrel/tanker). |
static java.lang.String |
FS
File separator. |
static int |
IOSIZE
Size of the I/O buffers. |
static int |
ITEM_LENGTH_IN_TRANSACTION_LISTENER
Length of a sorted item in transaction listener log. |
static java.lang.String |
JACCARD_COEFICIENTS_FILE_NAME
The filename for the JaccardCoeficientsFile. |
static java.lang.String |
LOCAL_TANKER_COMMIT_TO_GLOBAL_LOG_FILENAME
Prefix of the state file that signals that a local tanker is in its commit phase. |
static java.lang.String |
LOCAL_TANKER_DIRECTORY_PREFIX
Prefix of all local tankers. |
static long |
LOCK_RESERVATION_REFRESH_PERIOD
Refresh period of a lock reservation, i.e. the time after which the reservation can expire. |
static java.lang.String |
LOCK_SERVER_DEFAULT_CONFIG_FILENAME
Full filename to the lock server configuration file. |
static java.lang.String |
LS
Line separator. |
static double |
MINIDF
Minimal value of an inverse document frequency. |
static double |
MINVALIDIDF
All terms with idf that is lower are excluded automatically. |
static java.lang.String |
MODIFIER_STATE_FILENAME_PREFIX
Prefix of all modify active state filenames. |
static long |
MODIFY_STATE_REFRESH_PERIOD
Period of time after which modify state file of a modifier is refreshed. |
static long |
NO_RESERVATION_ID
ID of the lock that the lock server returns when no reservation was created. |
static int |
NORMFACTOR
Normalization of vectors to this... |
static int |
OCCURENCIESTOSCAN
Maximum number of positions that are scanned during phrase queries in each of the acting term occurrences. |
static char |
PARAGRAPH_SEPARATOR
Special character which determines paragraph separator. |
static int |
PARAGRAPH_SEPARATOR_WEIGHT
Special weight which determines paragraph separator. |
static java.lang.String |
PERMUTATED_MINS_FILE_PREFIX
The prefix of the filename for the PermutatedMinsFile. |
static int |
PRECOMPCACHE
How many values are precomputed for an inverted list during the search phase. |
static java.lang.String |
READ_LOCK_FILENAME_PREFIX
Prefix of all read lock filenames. |
static long |
READ_LOCK_PERIOD
Default read lock refresh period. |
static boolean |
REQUIREDMODEBYDEF
Whether required mode is on by default in queries (true = act like Google). |
static int |
SECOND
Period of time - 1 second. |
static java.lang.String |
SEPFILESEXT
What extension is used in ThickBarrel for separated inverted lists. |
static java.lang.String |
SEPTOKEN
Tag(s) used for separated inverted lists - defines the prefix. |
static java.lang.String |
SIMILAR_UNIT_PAIRS_FILE_PREFIX
The prefix of the filename for the SimilarUnitPairsFile. |
static double |
SIMILARITY_RELEVANT_TRESHOLD
The threshold for the Jaccard coefficient (divided by the number of permutations, see CHECK_DUPLICITY_PERM_NUM)
for a textual unit of the document to appear as a suspect in the duplicity checking report. |
static int |
SKIPFACTOR
Skip factor preferably used. |
static boolean |
SUPPORTHTDIG
Support HTdig in the HTML parser. |
static int |
TANKERCACHE
Size of a cache in the TankerImpl. |
static java.lang.String |
TEMPDIR
Temporary directory. |
static int |
TERMSCACHE
How many words are cached in each barrel during querying phase? |
static int |
TITLELEN
Title length. |
static java.lang.String |
TRANSACTION_LISTENER_FILENAME_PREFIX
Prefix of all transaction listeners' filenames. |
static java.lang.String |
UNKNOWNCONTENTTYPE
Unknown content type (used by robot or indexers when they cannot obtain a valid content-type). |
static int |
WORDNGRAMS_LENGHT
The length of N-grams produced by the WordNGrammer filter. |
static java.lang.String |
WRITE_LOCK_FILENAME_PREFIX
Prefix of all write lock filenames. |
static long |
WRITE_LOCK_PERIOD
Default write lock refresh period. |
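MINIDF and MINVALIDIDF in the summary above describe a floor on inverse document frequency and a cutoff below which terms are dropped. This page gives neither the idf formula nor the constant values, so the following is only a minimal sketch under assumptions: it uses the common `ln(N / df)` definition, and `IdfCutoffSketch`, `idf`, `keepValidTerms`, and the cutoff value `0.5` are hypothetical names and numbers, not part of the Egothor API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IdfCutoffSketch {
    // Hypothetical stand-in for Constants.MINVALIDIDF; the real value is not given on this page.
    static final double MIN_VALID_IDF = 0.5;

    /** Assumed idf definition: natural log of (total documents / documents containing the term). */
    static double idf(long totalDocs, long docFreq) {
        return Math.log((double) totalDocs / docFreq);
    }

    /** Keeps only the terms whose idf is not lower than the cutoff. */
    static Map<String, Double> keepValidTerms(Map<String, Long> docFreqs, long totalDocs) {
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : docFreqs.entrySet()) {
            double value = idf(totalDocs, e.getValue());
            if (value >= MIN_VALID_IDF) {
                kept.put(e.getKey(), value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Long> docFreqs = new LinkedHashMap<>();
        docFreqs.put("the", 990L);    // present in almost every document, idf close to 0
        docFreqs.put("barrel", 40L);  // rarer term, higher idf
        System.out.println(keepValidTerms(docFreqs, 1000L).keySet());
    }
}
```

With these illustrative numbers, a term found in 990 of 1000 documents falls below the cutoff and is dropped, while a term found in 40 documents is kept.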
Constructor Summary | |
---|---|
Constants()
|
Method Summary |
---|
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int NORMFACTOR
public static final int PRECOMPCACHE
public static final int DEFAULTMODEL
See Also: Query.MODEL_VECTOR, Query.MODEL_FUZZY_M, Query.MODEL_BOOLEAN, Query.setModel(int), Constant Field Values
public static final int DOCINVSIZE
public static final long FIRSTUID
public static final int FIRSTSENTENCE
public static final int FIRSTPARAGRAPH
public static final int TERMSCACHE
See Also: Terms, Constant Field Values
public static final int DOCSCACHE
See Also: Documents, Constant Field Values
public static final java.lang.String SEPFILESEXT
What extension is used in ThickBarrel for separated inverted lists. Nonfunctional.
public static final java.lang.String SEPTOKEN
public static final java.lang.String BITTOKEN
public static final int OCCURENCIESTOSCAN
See Also: PhraseScan, Constant Field Values
public static final double MINIDF
See Also: CWI, Constant Field Values
public static final java.lang.String FS
public static final java.lang.String LS
public static final boolean SUPPORTHTDIG
See Also: HTMLParser, Constant Field Values
public static final int TITLELEN
See Also: HTMLParser, Constant Field Values
public static final int IOSIZE
See Also: FileLocal, Constant Field Values
public static final boolean REQUIREDMODEBYDEF
public static final double MINVALIDIDF
See Also: QTerm.applyCWI(org.egothor.core.CWI), Constant Field Values
public static final int SKIPFACTOR
See Also: IListMetadataWrite, Constant Field Values
public static final int TANKERCACHE
See Also: TankerImpl, Constant Field Values
public static final java.lang.String TEMPDIR
See Also: Directory
public static final java.lang.String UNKNOWNCONTENTTYPE
public static final java.lang.String CONST_FILE_PREFIX
public static final java.lang.String CONST_FILE_BEGINNING_POSTFIX
public static final java.lang.String LOCAL_TANKER_DIRECTORY_PREFIX
public static final java.lang.String DEADBARRELS_FILENAME
public static final java.lang.String READ_LOCK_FILENAME_PREFIX
public static final java.lang.String WRITE_LOCK_FILENAME_PREFIX
public static final java.lang.String TRANSACTION_LISTENER_FILENAME_PREFIX
public static final java.lang.String MODIFIER_STATE_FILENAME_PREFIX
public static final java.lang.String LOCAL_TANKER_COMMIT_TO_GLOBAL_LOG_FILENAME
public static final int ITEM_LENGTH_IN_TRANSACTION_LISTENER
public static final java.lang.String LOCK_SERVER_DEFAULT_CONFIG_FILENAME
public static final long READ_LOCK_PERIOD
public static final long WRITE_LOCK_PERIOD
public static final long NO_RESERVATION_ID
public static final int SECOND
public static final long MODIFY_STATE_REFRESH_PERIOD
public static final long LOCK_RESERVATION_REFRESH_PERIOD
public static final char PARAGRAPH_SEPARATOR
public static final int PARAGRAPH_SEPARATOR_WEIGHT
public static final int WORDNGRAMS_LENGHT
The length of N-grams produced by the WordNGrammer filter.
public static final java.lang.String PERMUTATED_MINS_FILE_PREFIX
The prefix of the filename for the PermutatedMinsFile.
public static final java.lang.String SIMILAR_UNIT_PAIRS_FILE_PREFIX
The prefix of the filename for the SimilarUnitPairsFile.
public static final java.lang.String JACCARD_COEFICIENTS_FILE_NAME
The filename for the JaccardCoeficientsFile.
public static final java.lang.String CHECK_DUPLICITY_DIR
public static final java.lang.String CHECK_DUPLICITY_TEMP_DIR
public static final java.lang.String CHECK_DUPLICITY_REPORT_DIR
public static final java.lang.String CHECK_DUPLICITY_INDEX_DIR
public static final boolean CHECK_DUPLICITY_ON_NGRAMS
public static Constants.CheckDuplicityLevel CHECK_DUPLICITY_LEVEL
public static final int CHECK_DUPLICITY_PERM_CHUNK_BITS
public static final int CHECK_DUPLICITY_PERM_CHUNKS
public static int CHECK_DUPLICITY_PERM_NUM
public static final boolean CHECK_DUPLICITY
public static final boolean CHECK_PARAGRAPHS
Flag indicating whether ParagraphPunctFilter should be used.
public static double DUPLICATE_TRESHOLD
The threshold for the mean value of the Jaccard coefficient (divided by the number of permutations, see CHECK_DUPLICITY_PERM_NUM) for all textual units of the document. Above this value the document is considered a duplicate.
public static final double SIMILARITY_RELEVANT_TRESHOLD
The threshold for the Jaccard coefficient (divided by the number of permutations, see CHECK_DUPLICITY_PERM_NUM) for a textual unit of the document to appear as a suspect in the duplicity checking report. Above this value the diff algorithm is run for this textual unit against the most similar document from the index.
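The threshold fields above suggest a min-wise permutation scheme in which the Jaccard coefficient of two textual units is estimated as the number of permutations with matching minima, divided by the number of permutations. That reading is an assumption, and the sketch below is not Egothor's implementation: seeded hash mixes stand in for real permutations, word N-grams serve as the compared elements, and `MinHashSketch`, its methods, and all numeric values are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class MinHashSketch {
    // Illustrative stand-ins; the real values of CHECK_DUPLICITY_PERM_NUM,
    // WORDNGRAMS_LENGHT and DUPLICATE_TRESHOLD are not stated on this page.
    static final int PERM_NUM = 128;
    static final int NGRAM_LENGTH = 3;
    static final double DUPLICATE_THRESHOLD = 0.8;

    /** Splits text on whitespace and returns the set of word N-grams of length n. */
    static Set<String> wordNGrams(String text, int n) {
        String[] words = text.trim().toLowerCase().split("\\s+");
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= words.length; i++) {
            grams.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return grams;
    }

    /** 64-bit bijective mixer (SplitMix64-style finalizer) used to scramble hash values. */
    static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    /** One minimum per simulated permutation; each permutation is a differently seeded mix. */
    static long[] signature(Set<String> grams) {
        long[] mins = new long[PERM_NUM];
        Arrays.fill(mins, Long.MAX_VALUE);
        for (String gram : grams) {
            long h = gram.hashCode();
            for (int p = 0; p < PERM_NUM; p++) {
                long v = mix(h + 0x9E3779B97F4A7C15L * (p + 1));
                if (v < mins[p]) {
                    mins[p] = v;
                }
            }
        }
        return mins;
    }

    /** Number of matching minima divided by the number of permutations. */
    static double estimateJaccard(long[] a, long[] b) {
        int same = 0;
        for (int p = 0; p < a.length; p++) {
            if (a[p] == b[p]) {
                same++;
            }
        }
        return (double) same / a.length;
    }

    public static void main(String[] args) {
        long[] first = signature(wordNGrams("the quick brown fox jumps over the lazy dog", NGRAM_LENGTH));
        long[] second = signature(wordNGrams("the quick brown fox jumps over the lazy cat", NGRAM_LENGTH));
        double estimate = estimateJaccard(first, second);
        System.out.println(estimate + (estimate > DUPLICATE_THRESHOLD ? " duplicate" : " not duplicate"));
    }
}
```

The estimate is a random approximation of the true Jaccard coefficient, so more permutations give a tighter estimate at the cost of larger signatures. That trade-off may be why CHECK_DUPLICITY_PERM_NUM and DUPLICATE_TRESHOLD are declared non-final.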
Constructor Detail |
---|
public Constants()