ABSTRACT
An Internet information retrieval system (search engine) is designed to
provide a platform for information retrieval services. It collects a large
volume of web page data from the Internet onto a server and processes it
into an information database and an index database, which are then used
to respond to users' information retrieval requests.
The system was developed mainly with Microsoft Visual Studio 2005 and
runs on the Windows Server 2003 operating system. Its main functions
are web data crawling, web data storage, data indexing, data retrieval,
and log management, among others.
This paper studies several key technologies in the design and
implementation of an Internet information retrieval system. The theory
behind these key technologies is discussed in detail, and an Internet
information retrieval system based on Lucene.net is implemented. The
paper covers the following aspects:
First, the paper describes the market demand for search engines and the
current state of research. This part discusses the historical background
of search engines, the objective requirements of users, the
characteristics of search engines, and the growing attention paid to them.
Second, the paper discusses the basic structure of search engines, along
with the theoretical basis and methods for implementing them. This part
brings together the key technologies of a search engine (Chinese word
segmentation, data acquisition, and data indexing) and analyzes the
full-text search engine Lucene.net.
Finally, the paper gives a detailed description of the design and
implementation of an Internet information retrieval system based on
Lucene.net.
Keywords: Search Engine; Lucene.net; Data Storage; Information Retrieval