org.egothor.parser.filter
Class RemoveDiacritics

java.lang.Object
  extended by org.egothor.core.Filter
      extended by org.egothor.parser.filter.RemoveDiacritics
All Implemented Interfaces:
Sequence<Token>

public final class RemoveDiacritics
extends Filter

This filter strips diacritics from (Latin-script) words, converting them to their plain ASCII equivalents.
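The page does not show how the conversion is performed. As an illustration only, and not Egothor's actual implementation, the general technique can be sketched with the standard java.text.Normalizer class. The class name DiacriticsSketch and the method stripDiacritics are hypothetical names introduced for this example.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of Egothor: illustrates one common way
// to reduce accented Latin text to its unaccented form.
final class DiacriticsSketch {

    // Decompose each character into a base letter plus combining marks (NFD),
    // then delete the combining marks (Unicode category M).
    static String stripDiacritics(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripDiacritics("Příliš žluťoučký kůň"));
        // prints "Prilis zlutoucky kun"
    }
}
```

This approach only handles letters that have a canonical decomposition. Letters such as "ø" or "ł" are not decomposed by NFD and would need an explicit mapping table to reach pure ASCII.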

Author:
Leo Galambos

Field Summary
 
Fields inherited from class org.egothor.core.Filter
prev
 
Constructor Summary
RemoveDiacritics(Sequence<Token> prev)
          Constructs a RemoveDiacritics filter.
 
Method Summary
 Token next()
          If the name/type of the token is <WORD>, then removes the diacritics from the text of the token.
 
Methods inherited from class org.egothor.core.Filter
action, getPrevTokenizer, setPrevTokenizer
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RemoveDiacritics

public RemoveDiacritics(Sequence<Token> prev)
Constructs a RemoveDiacritics filter.

Parameters:
prev - the preceding token sequence (tokenizer or filter) that supplies tokens to this filter
Method Detail

next

public Token next()
If the name/type of the token is <WORD>, then removes the diacritics from the text of the token. Tokens of any other type are returned untouched.

Specified by:
next in interface Sequence<Token>
Overrides:
next in class Filter
Returns:
the Token with diacritics removed from its text