Contents
Developing Web Crawlers in Java
Chapter 1 Java Basics
1.1 A First Program
1.2 Setting Up the Development Environment
1.2.1 JDK
1.2.2 Eclipse
1.3 Classes and Objects
1.4 Constants
1.5 Naming Conventions
1.6 Basic Syntax
1.7 Conditionals
1.8 Loops
1.9 Arrays
1.10 Bitwise Operations
1.11 Enumerated Types
1.12 Comparators
1.13 Methods
1.14 Collection Classes
1.14.1 Dynamic Arrays
1.14.2 Hash Tables
1.15 Generics
1.16 Multithreading
1.16.1 Basic Multithreading
1.16.2 Thread Pools
1.17 Processing Images
1.18 Chapter Summary
Chapter 2 Introduction to Web Crawlers
2.1 Acquiring Information
2.2 Types of Web Crawlers
2.2.1 Information Collectors
2.2.2 Breadth-First Traversal
2.2.3 Distributed Crawlers
2.3 Crawler-Related Protocols
2.3.1 Sitemaps
2.3.2 The Robots Protocol
2.4 Crawler Architecture
2.4.1 Basic Architecture
2.4.2 Distributed Crawler Architecture
2.4.3 Vertical Crawler Architecture
2.5 Writing Your Own Web Crawler
2.6 URL Deduplication
2.6.1 Embedded Databases
2.6.2 Bloom Filters
2.6.3 Implementing a Bloom Filter