Hadoop: The Definitive Guide, Third Edition (English edition) - CSDN Library

Rated 5 stars (higher than 95% of resources) · Points required: 10 · 198 views · Uploaded 2012-08-01 10:41:41 · Comments: 12 · 15.93 MB PDF

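Release notes for Hadoop: The Definitive Guide, Third Edition: the third edition is due to be published in May 2012. You can pre-order an electronic copy now, or buy the "Early Release" version, which comes with the final version at no extra charge. (This is of little use to readers in China, heh!) Below is a rough summary of what has changed in the book.

What is new in the third edition? The third edition covers the Hadoop 1.x release series (formerly 0.20), as well as 0.22 and 0.23. All the examples in the book have been run against these versions, with a few exceptions that are noted in the text. The features of each release series are described in "Hadoop Releases" in Chapter 1. Most examples in this edition use the new API; because the old API is still widely used, it continues to be discussed in side notes, and the old-API versions of the code can be found on the book's website.

The major change in Hadoop 0.23 is the new MapReduce runtime, MapReduce 2, which is built on a new distributed resource management system called YARN. Chapter 6 covers how it works and Chapter 9 covers how to run it. The book includes more MapReduce material, such as packaging MapReduce jobs with Maven, setting the user's Java classpath, and writing tests with MRUnit (all in Chapter 5), along with more depth on features such as output committers and the distributed cache (Chapter 8) and task memory monitoring (Chapter 9). Chapter 4 adds a section on processing Avro data with MapReduce jobs, and Chapter 5 introduces running a simple workflow in Oozie. (Unfortunately, the Oozie coordinator is not covered.) In its treatment of HDFS, Chapter 3 introduces High Availability, Federation, and the new WebHDFS and HttpFS filesystems. The chapters on Pig, Hive, Sqoop, and ZooKeeper have been expanded to cover the features and changes in their latest releases. The book also contains many other corrections and improvements.

Original text:

Third Edition

The third edition is due to be published in May 2012. You can pre-order a copy, or buy the “Early Release” ebook today (you will receive the final ebook version when it is available for no extra charge). The following section is from the book’s preface, and outlines the changes in the third edition.

What’s New in the Third Edition?

The third edition covers the 1.x (formerly 0.20) release series of Apache Hadoop, as well as the newer 0.22 and 0.23 series. With a few exceptions, which are noted in the text, all the examples in this book run against these versions. The features in each release series are described at a high-level in "Hadoop Releases" in Chapter 1. This edition uses the new MapReduce API for most of the examples. Since the old API is still in widespread use, it continues to be discussed in the text alongside the new API, and the equivalent code using the old API can be found on the book’s website. The major change in Hadoop 0.23 is the new MapReduce runtime, MapReduce 2, which is built on a new distributed resource management system called YARN. This edition includes new sections covering MapReduce on YARN: how it works (Chapter 6) and how to run it (Chapter 9).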
There is more MapReduce material too, including development practices like packaging MapReduce jobs with Maven, setting the user’s Java classpath, and writing tests with MRUnit (all in Chapter 5); and more depth on features such as output committers, the distributed cache (both in Chapter 8), and task memory monitoring (Chapter 9). There is a new section on writing MapReduce jobs to process Avro data (Chapter 4), and on running a simple MapReduce workflow in Oozie (Chapter 5). The chapter on HDFS (Chapter 3) now has introductions to High Availability, Federation, and the new WebHDFS and HttpFS filesystems. The chapters on Pig, Hive, Sqoop, and ZooKeeper have all been expanded to cover the new features and changes in their latest releases. In addition, numerous corrections and improvements have been made throughout the book.

Comments

ZYQ2006

2012-10-26

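The resource description is good. I am saving this for later, to check the API updates. That said, the Definitive Guide is best read after you have some development experience: it certainly goes deeper than most books, yet it is also fairly coarse in its coverage. I find that the Definitive Guide books in general are not suited to beginners.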
栩晨

2013-05-22

A genuine ebook edition, very clear. This book is absolutely the best material for learning Hadoop.
Data+Science+Insight

2013-10-20

In English, authoritative, Hadoop study material.
zde123z123

2013-03-06

Very good, and it is the latest edition.
ylxsyf

2013-11-20

Very good, very clear, well suited for study! Thanks to the uploader for sharing!