NWebCrawler: A Stupid C# Web Crawler
Overview
Implementation of NWebCrawler V1.0.2.
[Figure: flowchart of the crawling loop. Steps: initialize frontier with seed URLs; check for termination (branches: done / not done); pick URL from frontier (branches: URL / no URL); fetch page; parse page; add URLs to frontier.]
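The crawling loop above can be sketched in C# roughly as follows. This is a minimal, single-threaded illustration and not NWebCrawler's actual code: the class name `CrawlLoopSketch`, the `fetchPage` delegate, the `maxPages` limit used as the termination check, and the regex-based link extraction are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sketch of the crawling loop; names are illustrative only.
public static class CrawlLoopSketch
{
    // Matches absolute http/https links inside href="..." attributes.
    private static readonly Regex HrefPattern =
        new Regex("href\\s*=\\s*\"(https?://[^\"]+)\"", RegexOptions.IgnoreCase);

    // fetchPage returns the page body, or null when the fetch fails.
    // maxPages is an assumed termination criterion (a simple page budget).
    public static List<string> Crawl(
        IEnumerable<string> seedUrls, Func<string, string> fetchPage, int maxPages)
    {
        // Initialize frontier with seed URLs.
        var frontier = new Queue<string>();
        var seen = new HashSet<string>();
        foreach (var seed in seedUrls)
        {
            if (seen.Add(seed)) frontier.Enqueue(seed);
        }

        var visited = new List<string>();

        // Check for termination: stop when the budget is used up
        // or the frontier has no URL left.
        while (visited.Count < maxPages && frontier.Count > 0)
        {
            // Pick URL from frontier.
            string url = frontier.Dequeue();

            // Fetch page; skip URLs that could not be fetched.
            string html = fetchPage(url);
            if (html == null) continue;
            visited.Add(url);

            // Parse page and add previously unseen URLs to frontier.
            foreach (Match m in HrefPattern.Matches(html))
            {
                string link = m.Groups[1].Value;
                if (seen.Add(link)) frontier.Enqueue(link);
            }
        }
        return visited;
    }
}
```

Passing the fetcher as a delegate keeps the sketch runnable without network access: a caller can supply a dictionary lookup for testing, or wrap a real HTTP client for actual crawling. A multi-threaded crawler would additionally need a thread-safe frontier, which this sketch does not attempt.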
Design
Version 1.0.2 fixes three problems in the old version (V1.0.0):
Design
NWebCrawler V1.0.0 implements a basic multi-threaded crawler. The crawler maintains a list of
unvisited URLs called the frontier. The list is initialized with seed URLs provided by a user. Each crawling