Table of Contents
Chapter 1 Introduction ..... 1
1.1 Research Background and Significance ..... 1
1.2 Research Content ..... 1
1.3 Feasibility Analysis ..... 1
1.3.1 Technical Feasibility ..... 1
1.3.2 Social Feasibility ..... 1
1.4 Thesis Structure ..... 1
Chapter 2 Related Theory and Technologies ..... 2
2.1 Web Crawlers ..... 2
2.2 Distributed Web Crawlers ..... 2
2.2.1 Distributed Crawler System Architecture ..... 2
2.3 A Study of the Scrapy Framework ..... 4
2.3.1 Structure of the Scrapy Framework ..... 4
2.3.2 Limitations of the Scrapy Framework ..... 6
2.3.3 Scrapy-Redis ..... 6
2.4 Overview of Related Technologies ..... 7
2.4.1 scrapyd ..... 7
2.4.2 spiderkeeper ..... 8
2.4.3 redis Database ..... 8
Chapter 3 Requirements Analysis ..... 8
3.1 Functional Requirements Analysis ..... 8
3.1.1 Data Collection ..... 8
3.1.2 Data Storage ..... 9
3.2 Non-Functional Requirements Analysis ..... 9
3.2.1 Robustness ..... 9
3.2.2 Respectfulness ..... 9
3.2.3 Flexibility ..... 9
Chapter 4 Overall Design of the Crawler System ..... 9
4.1 Design Goals ..... 9
4.2 Overall Design of the Crawler System ..... 10
Chapter 5 Detailed Design and Implementation of the Crawler System ..... 11
5.1 Collecting Data ..... 11
5.2 Design and Implementation of the Epidemic Data Crawler ..... 11
5.2.1 Data Source ..... 11
5.2.2 Defining the item Container ..... 11
5.2.3 Writing the spider ..... 12
5.2.4 Writing the pipelines File ..... 13
5.3 Design and Implementation of the Rumor Data Crawler ..... 14
5.3.1 Data Source ..... 14