Hadoop Platform Technology Research
Contents
1 Objectives
2 Introduction to Spark
2.1 Spark Components
2.2 Spark Run Modes
2.3 Programming Languages Supported by Spark
2.4 Spark Architecture
2.5 Spark Job Submission Modes and Their Workflows
2.5.1 client Mode
2.5.2 cluster Mode
2.5.3 Summary
3 Spark Terminology
3.1 RDD (Resilient Distributed Dataset)
3.2 Broadcast Variables
3.3 Accumulators
3.4 Narrow and Wide Dependencies
3.5 Stage Division
3.6 Shuffle
4 Spark Data Skew
4.1 Symptoms
4.2 Root Cause Analysis
4.3 Solutions
5 Spark Tuning Approaches
6 Cluster HBase Migration Plan
7 Appendix 1: Common Spark Tuning Parameters
5.1 spark.shuffle.file.buffer
5.2 spark.reducer.maxSizeInFlight
5.3 spark.shuffle.io.maxRetries
5.4 spark.shuffle.io.retryWait
5.5 spark.shuffle.memoryFraction