Class Generator

java.lang.Object
  extended by org.egothor.text.Generator
All Implemented Interfaces:
Sequence<Token>

public class Generator
extends java.lang.Object
implements Sequence<Token>

This class generates Tokenizers (documents) that follow Zipf's law. Words are represented as numbers. A document may repeat a word k times, where k is a random number from 1 to 9, so a word appears about 5 times on average. To generate documents with an average length of L words, the Tokenizer is prepared as follows: 1) L/5 unique words are drawn according to Zipf's law; 2) duplicates are generated and the words are shuffled.
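The two preparation steps described above can be sketched roughly as follows. This is an illustrative sketch only, not the source of org.egothor.text.Generator: the class name ZipfDocumentSketch, its helper methods, the plain 1/rank weighting, the uniform choice of k, and the reading of the constructor's words argument as a vocabulary size are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch; not the actual implementation of org.egothor.text.Generator.
class ZipfDocumentSketch {

    // Running sums of 1/rank for rank = 1..vocabulary (unnormalized Zipf weights).
    static double[] cumulativeWeights(int vocabulary) {
        double[] cumulative = new double[vocabulary];
        double sum = 0.0;
        for (int rank = 1; rank <= vocabulary; rank++) {
            sum += 1.0 / rank;
            cumulative[rank - 1] = sum;
        }
        return cumulative;
    }

    // Draws a rank in 1..vocabulary with probability proportional to 1/rank,
    // using binary search for the first cumulative weight greater than u.
    static int sampleRank(double[] cumulative, Random rnd) {
        double u = rnd.nextDouble() * cumulative[cumulative.length - 1];
        int lo = 0;
        int hi = cumulative.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cumulative[mid] > u) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo + 1;
    }

    // Builds one document (a list of word numbers) of roughly avgLength words.
    static List<Integer> generate(int vocabulary, int avgLength, Random rnd) {
        double[] cumulative = cumulativeWeights(vocabulary);

        // Step 1: draw about L/5 distinct words, capped by the vocabulary size.
        int unique = Math.max(1, Math.min(vocabulary, avgLength / 5));
        Set<Integer> words = new LinkedHashSet<>();
        while (words.size() < unique) {
            words.add(sampleRank(cumulative, rnd));
        }

        // Step 2: repeat each word k times (k assumed uniform in 1..9), then shuffle.
        List<Integer> document = new ArrayList<>();
        for (int word : words) {
            int k = 1 + rnd.nextInt(9);
            for (int i = 0; i < k; i++) {
                document.add(word);
            }
        }
        Collections.shuffle(document, rnd);
        return document;
    }

    public static void main(String[] args) {
        List<Integer> document = generate(1000, 50, new Random(42));
        System.out.println(document.size() + " tokens: " + document);
    }
}
```

With avgLength = 50 the sketch draws 10 distinct words, so a document holds between 10 and 90 tokens and about 50 on average. Drawing distinct words by rejection is simple but slows down when L/5 approaches the vocabulary size; a real implementation may handle that case differently.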

Constructor Summary
Generator(int words, int L)
Method Summary
static void main(java.lang.String[] args)
 Token next()
          Return the next item in the iteration.
 void refresh()
Constructor Detail


public Generator(int words,
                 int L)
Method Detail


public void refresh()


public Token next()
Description copied from interface: Sequence
Return the next item in the iteration.

Specified by:
next in interface Sequence<Token>
Returns:
the item


public static void main(java.lang.String[] args)