What is Egothor?

Egothor is an open-source, high-performance, full-featured text search engine written entirely in Java. It suits nearly any application that requires full-text search, especially cross-platform ones. It can be configured as a standalone engine, a metasearcher, or a peer-to-peer hub, and it can also be used as a library by an application that needs full-text search.

Key features of Egothor

  • Written in Java for cross-platform compatibility.
  • Able to recognize many of the most familiar file formats: HTML, PDF, PS, and Microsoft's DOC and XLS.
  • High-capacity robot that supports the robots.txt recommendation.
  • Uses efficient compression methods: Golomb, Elias-gamma, and block coding.
  • Based on the extended Boolean model, which can operate as either the vector or the Boolean model.
  • Universal stemmer that processes any language.
  • New dynamization algorithm for fast index updating.
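
To make the compression item above more concrete, here is a minimal sketch of Elias-gamma coding applied to the gaps between sorted document ids. It is an illustration only, not Egothor's actual code: the class name EliasGammaSketch and its methods are hypothetical, and bits are held in a String for readability where a real index would pack them into bytes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of Elias-gamma coding; not part of Egothor's API.
class EliasGammaSketch {

    // Encode n >= 1 as floor(log2 n) zero bits followed by n in binary.
    static String encode(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("Elias-gamma needs n >= 1");
        }
        String binary = Integer.toBinaryString(n);
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < binary.length(); i++) {
            out.append('0'); // unary length prefix
        }
        return out.append(binary).toString();
    }

    // Decode a concatenation of Elias-gamma codes (assumes well-formed input).
    static List<Integer> decodeAll(String bits) {
        List<Integer> values = new ArrayList<>();
        int pos = 0;
        while (pos < bits.length()) {
            int zeros = 0;
            while (bits.charAt(pos) == '0') { // count the length prefix
                zeros++;
                pos++;
            }
            // The value occupies the next zeros + 1 bits, starting with a 1.
            values.add(Integer.parseInt(bits.substring(pos, pos + zeros + 1), 2));
            pos += zeros + 1;
        }
        return values;
    }

    // Encode strictly increasing document ids (all >= 1) as gaps.
    static String encodePostings(int[] sortedDocIds) {
        StringBuilder out = new StringBuilder();
        int previous = 0;
        for (int id : sortedDocIds) {
            out.append(encode(id - previous));
            previous = id;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String bits = encodePostings(new int[] {3, 7, 8});
        System.out.println(bits);            // gaps 3, 4, 1
        System.out.println(decodeAll(bits)); // the gaps, recovered
    }
}
```

Small gaps get short codes (a gap of 1 costs a single bit), which is why this family of codes fits posting lists where document ids of frequent terms lie close together.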

Design

  • ipc (not used yet)
  • web application (JSP)
  • robot (Capek)
  • kernel (search engine)
  • msft (RTF, DOC, XLS)
  • pdf (PDF, PS)
  • xml (XML)
  • parser