Package org.egothor.parser.filter

This package defines objects that filter tokens.


Class Summary
DoubleMetaphone This module implements a "sounds like" algorithm developed by Lawrence Philips, who published it in the June 2000 issue of C/C++ Users Journal.
DupWithoutDiacritics This filter transforms all (Latin) words to their non-diacritical (ASCII) form, but still keeps the original tokens.
Grammer This class is a "gram-mer", not a grammar: it produces N-grams.
LowerCase This filter transforms all words to lower case.
Nysiis This module implements the New York State Identification and Intelligence System (NYSIIS) Phonetic Code.
ParagraphFilter This filter sets the sentence, paragraph and sentenceInParagraph fields in the Token class, just like ParagraphPunctFilter.
ParagraphPunctFilter This filter sets the sentence, paragraph and sentenceInParagraph fields in the Token class.
Phonetics  
PunctFilter  
RemoveDiacritics This filter transforms all (Latin) words to their non-diacritical (ASCII) form.
Stemmer The Stemmer object is a filter which transforms all words to their respective stems.
StopFilter This abstract class should be extended by any class that needs to ignore certain tokens while processing a token stream.
WordNGrammer This class produces N-grams of words.
 

Package org.egothor.parser.filter Description

This package defines objects that filter tokens. Use them to transform tokens, for example to reduce words to their stems or to convert them to lower case.
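
The idea of chaining token transformations can be illustrated with a minimal, self-contained sketch. It does not use the classes of this package: the class TokenFilterSketch and the methods lowerCase, removeDiacritics and apply are hypothetical names chosen for illustration, tokens are modeled as plain strings rather than Token objects, and only standard JDK calls are used. The actual filter classes may behave differently in detail.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; not part of org.egothor.parser.filter.
public class TokenFilterSketch {

    // Stand-in for a lower-casing filter (assumption: locale-independent).
    static String lowerCase(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // Stand-in for a diacritics-removing filter: decompose the token
    // to NFD, then drop all combining marks.
    static String removeDiacritics(String token) {
        String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Applies every filter, in order, to every token.
    static List<String> apply(List<String> tokens,
                              List<UnaryOperator<String>> filters) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String current = token;
            for (UnaryOperator<String> filter : filters) {
                current = filter.apply(current);
            }
            out.add(current);
        }
        return out;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> chain = new ArrayList<>();
        chain.add(TokenFilterSketch::lowerCase);
        chain.add(TokenFilterSketch::removeDiacritics);

        // "Caf\u00e9" is "Cafe" with an acute accent on the final letter.
        List<String> tokens = Arrays.asList("Caf\u00e9", "Token");
        System.out.println(apply(tokens, chain)); // prints [cafe, token]
    }
}
```

In this sketch the order of the filters does not affect the result, but in general it can: a stop-word filter placed before a lower-casing step, for instance, would need a case-aware word list.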