

Hadoop_in_Action.pdf
Hadoop in Action
Chuck Lam
Manning, Greenwich (74° w. long.)

For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact: Special Sales Department, Manning Publications Co., 180 Broad St., Suite 1323, Stamford, CT 06901. Email: orders@manning.com

©2011 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

Development editor: Cynthia Kane
Copyeditor: Composure Graphics
Proofreader: Katie Tennant
Composition: Composure Graphics
Cover designer: Marija Tudor

ISBN: 9781935182191
Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 - MAL - 15 14 13 12 11 10

Brief contents

PART 1  HADOOP: A DISTRIBUTED PROGRAMMING FRAMEWORK
  1  Introducing Hadoop
  2  Starting Hadoop
  3  Components of Hadoop
PART 2  HADOOP IN ACTION
  4  Writing basic MapReduce programs
  5  Advanced MapReduce
  6  Programming Practices
  7  Cookbook
  8  Managing Hadoop
PART 3  HADOOP GONE WILD
  9  Running Hadoop in the cloud
  10  Programming with Pig
  11  Hive and the Hadoop herd
  12  Case studies

Contents

preface
acknowledgments
about this book
Author Online
About the author
About the cover illustration

PART 1  HADOOP: A DISTRIBUTED PROGRAMMING FRAMEWORK

1  Introducing Hadoop
  1.1  Why "Hadoop in Action"?
  1.2  What is Hadoop?
  1.3  Understanding distributed systems and Hadoop
  1.4  Comparing SQL databases and Hadoop
  1.5  Understanding MapReduce
       Scaling a simple program manually
       Scaling the same program in MapReduce
  1.6  Counting words with Hadoop: running your first program
  1.7  History of Hadoop
  1.8  Summary
  1.9  Resources

2  Starting Hadoop
  2.1  The building blocks of Hadoop
       NameNode
       DataNode
       Secondary NameNode
       JobTracker
       TaskTracker
  2.2  Setting up SSH for a Hadoop cluster
       Define a common account
       Verify SSH installation
       Generate SSH key pair
       Distribute public key and validate logins
  2.3  Running Hadoop
       Local (standalone) mode
       Pseudo-distributed mode
       Fully distributed mode
  2.4  Web-based cluster UI
  2.5  Summary

3  Components of Hadoop
  3.1  Working with files in HDFS
       Basic file commands
       Reading and writing to HDFS programmatically
  3.2  Anatomy of a MapReduce program
       Hadoop data types
       Mapper
       Reducer
       Partitioner: redirecting output from Mapper
       Combiner: local reduce
       Word counting with predefined mapper and reducer classes
  3.3  Reading and writing
       InputFormat
       OutputFormat
  3.4  Summary

PART 2  HADOOP IN ACTION

4  Writing basic MapReduce programs
  4.1  Getting the patent data set
       The patent citation data
       The patent description data
  4.2  Constructing the basic template of a MapReduce program
  4.3  Counting things
  4.4  Adapting for Hadoop's API changes
  4.5  Streaming in Hadoop
       Streaming with Unix commands
       Streaming with scripts
       Streaming with key/value pairs
       Streaming with the Aggregate package
  4.6  Improving performance with combiners
  4.7  Exercising what you've learned
  4.8  Summary
  4.9  Further resources

5  Advanced MapReduce
  5.1  Chaining MapReduce jobs
       Chaining MapReduce jobs in a sequence
       Chaining MapReduce jobs with complex dependency
       Chaining preprocessing and postprocessing steps
  5.2  Joining data from different sources
       Reduce-side joining
       Replicated joins using DistributedCache
       Semijoin: reduce-side join with map-side filtering
  5.3  Creating a Bloom filter
       What does a Bloom filter do?
       Implementing a Bloom filter
       Bloom filter in Hadoop version 0.20+
  5.4  Exercising what you've learned
  5.5  Summary
  5.6  Further resources

6  Programming Practices
  6.1  Developing MapReduce programs
       Local mode
       Pseudo-distributed mode
  6.2  Monitoring and debugging on a production cluster
       Counters
       Skipping bad records
       Rerunning failed tasks with IsolationRunner
  6.3  Tuning for performance
       Reducing network traffic with combiner
       Reducing the amount of input data
       Using compression
       Reusing the JVM
       Running with speculative execution
       Refactoring code and rewriting algorithms
  6.4  Summary

7  Cookbook
  7.1  Passing job-specific parameters to your tasks
  7.2  Probing for task-specific information
  7.3  Partitioning into multiple output files
  7.4  Inputting from and outputting to a database
  7.5  Keeping all output in sorted order
  7.6  Summary

Points/C-coins required: 10 · Uploaded: 2014-11-30 · File size: 2.5 MB

Comments (1)

fuzhoudejzh: Very good, thank you!
2019-01-17
[Hadoop实战].(Hadoop.in.Action).Chuck.Lam.文字版.pdf

Hadoop in Action (Hadoop实战) by Chuck Lam, text edition.

Manning.Hadoop.in.Action.Dec.2010

Manning.Hadoop.in.Action.Dec.2010.pdf: a detailed introduction to Hadoop.

Hadoop in action中文版.pdf

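From personal experience, the biggest obstacle to learning these technologies comes in the middle of the learning process. At the start, it is easy to find introductory blogs and presentations that teach you how to build a "Hello World" example. Once you are familiar enough, you know how to ask questions on mailing lists, meet experts at meetups and conferences, and even read the source code yourself. In between, though, there is a huge knowledge gap: your appetite has grown, but you are not quite sure what question to ask next. This is especially true of a technology as new as Hadoop. What is needed is an organized exposition that leads you from the initial "Hello World" to applying Hadoop comfortably in practice. That is what I hope this book achieves. Fortunately, I discovered Manning's In Action series, which matches this goal exactly, and the publisher...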

Hadoop实战.pdf Hadoop in action

Hadoop实战.pdf: Chinese edition of Hadoop in Action.

Manning Hadoop In Action 2010.pdf

Covers Hadoop, MapReduce program development, the use of Pig and Hive, and more.

Spark.GraphX.in.Action.2016.6.pdf

What can graphs—the things with edges and vertices, not the things with axes and tick marks—do and how can they be used with Spark? These are the questions we try to answer in this book. Frequently it is said, “Graphs can do anything,” or at least, “There are a bunch of different things you can do w

Hadoop实战 韩继忠 译.pdf

Chinese edition of Hadoop in Action, translated by Han Jizhong.

Hadoop in Action (Hadoop实战)

[Hadoop实战].(Hadoop.in.Action).Chuck.Lam.文字版.pdf

Hadoop实战.pdf

Hadoop实战.pdf: Chinese edition (English title: Hadoop.in.Action).

Hadoop in Action

Hadoop in Action.pdf

hadoop pdf

Four books, all in English: Hadoop in Action.pdf; Hadoop Real World Solutions Cookbook.pdf; mapreduce_design_patterns.pdf; The.Definitive.Guide.3rd.Edition.pdf.

Hadoop in Action (Hadoop实战)

Hadoop in Action (Hadoop实战). Clear English PDF edition. Lifts the veil on cloud computing: a distributed framework for processing massive data sets.

Hadoop In Action2

Hadoop In Action, Chinese second edition, volume 2 (RAR archive).

Hadoop technical materials in English (PDF): classics

Contains three PDFs: Apress+-+Pro+Hadoop.pdf, Hadoop_in_Action.pdf, The+Definitive+Guide.pdf

[Mahout.in.Action(2011)].Sean.Owen.文字版.pdf

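Mahout in Action, English edition, 415 pages in full. Mahout is an open-source machine learning project from Apache. Its algorithms fall under the broad umbrella of "machine learning" or "collective intelligence". That can mean many things, but for now the parts of Mahout we care about most are collaborative filtering (CF) and recommender engines, clustering, and classification. It is also highly scalable: Mahout aims to be the machine learning tool of choice when the data to be processed is very large, perhaps too large for a single machine. In its current incarnation, these scalable implementations are written in Java, and some portions are built on the Apache Hadoop distributed...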

Hadoop materials

Hadoop FAQ.doc; hadoop2.0.ppt; hadoop_conf.rar; Hadoop实战.Hadoop.in.Action.Chuck.Lam.文字版.pdf; Hadoop实战.源代码.zip; hadoop权威指南_中文版_带目录索引.pdf; Hadoop权威指南_原版.pdf; Hadoop权威资料 源代码.rar; Hadoop源代码eclipse编译教程.pdf; HBase:权威指南.docx; HDFS.ppt. 11 files, 86,726,174 bytes.

Hadoop in Action (Hadoop实战)

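[Hadoop实战].(Hadoop.in.Action).pdf: very practical Hadoop study material, in the original, unaltered English edition. I detest downloads that require points, so all of my resources are free.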

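Mahout in Action (pdf)

Mahout in Action, bought abroad for more than 200; the author has written about half of it so far. Worth downloading if you do data analysis or data mining on the Hadoop platform.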

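ModbusTCP/RTU gateway design

Implements Modbus networking on top of the uIP protocol stack. This document can serve as a reference and includes an introduction to the Modbus protocol.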

An animated New Year greeting card built with HTML + CSS + JS

This is the code from the blog post at http://blog.csdn.net/qq_29656961/article/details/78155792; it includes the image and music resources the code uses.
