org.egothor.parser.filter
Class Stemmer

java.lang.Object
  extended by org.egothor.core.Filter
      extended by org.egothor.parser.filter.Stemmer
All Implemented Interfaces:
Sequence<Token>

public final class Stemmer
extends Filter

The Stemmer object is a filter that transforms word tokens to their respective stems.

Author:
Leo Galambos

Field Summary
 
Fields inherited from class org.egothor.core.Filter
prev
 
Constructor Summary
Stemmer(Sequence<Token> prev, Trie stemmer)
          Constructs a Stemmer object using the given stemmer table.
 
Method Summary
 Token action(Token t)
          Applies a simple stemming algorithm to the given token.
 void setStemmer(Trie stemmer)
          Sets the stemmer attribute of the Stemmer object.
 
Methods inherited from class org.egothor.core.Filter
getPrevTokenizer, next, setPrevTokenizer
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Stemmer

public Stemmer(Sequence<Token> prev,
               Trie stemmer)
Constructs a Stemmer object using the given stemmer table.

Parameters:
prev - this filter's Tokenizer
stemmer - the stemmer table

Method Detail

setStemmer

public void setStemmer(Trie stemmer)
Sets the stemmer attribute of the Stemmer object.

Parameters:
stemmer - The new stemmer value

action

public Token action(Token t)
A simple stemming algorithm which works as follows:

If the token is a <WORD> token and its text is longer than 4 characters, the token's text is replaced by its stem. The replacement is skipped when the resulting stem is shorter than 3 characters.

Overrides:
action in class Filter
Parameters:
t - the Token
Returns:
the transformed Token
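
The rule described for action can be illustrated with a minimal, self-contained sketch. It does not use the Egothor Token or Trie classes: the class StemRuleSketch, the method applyRule, and the lookup function standing in for the stemmer table are all hypothetical names introduced for illustration. Leaving the text unchanged when the lookup finds no stem is also an assumption, since the description above does not cover that case.

```java
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration of the rule described for action(Token);
// this is not the Egothor implementation.
final class StemRuleSketch {
    static final String WORD = "<WORD>";   // token type named in the description
    static final int MIN_STEM_LENGTH = 3;  // stems shorter than this are rejected

    // Returns the stem of text when the rule applies, otherwise text unchanged.
    // 'lookup' stands in for the stemmer table and may return null (assumption).
    static String applyRule(String type, String text, UnaryOperator<String> lookup) {
        // Only <WORD> tokens whose text is longer than 4 characters are stemmed.
        if (!WORD.equals(type) || text.length() <= 4) {
            return text;
        }
        String stem = lookup.apply(text);
        // Keep the original text when no stem is found or the stem is too short.
        if (stem == null || stem.length() < MIN_STEM_LENGTH) {
            return text;
        }
        return stem;
    }

    public static void main(String[] args) {
        // Toy stemmer table; a real table would be far larger.
        Map<String, String> table = Map.of("running", "run", "going", "go");
        UnaryOperator<String> lookup = table::get;

        System.out.println(applyRule(WORD, "running", lookup)); // run
        System.out.println(applyRule(WORD, "going", lookup));   // going (stem "go" is too short)
        System.out.println(applyRule(WORD, "runs", lookup));    // runs (not longer than 4 characters)
    }
}
```

In this sketch both length checks leave the text untouched rather than signalling an error, which matches the description's wording that the transformation "will not be done".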