
Hadoop In Action

Hadoop In Action / Hadoop实战 / MapReduce programming
Hadoop in Action
CHUCK LAM

MANNING
Greenwich (74° w. long.)

For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact:

Special Sales Department
Manning Publications Co.
180 Broad St., Suite 1323
Stamford, CT 06901
Email: orders@manning.com

©2011 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

Development editor: Cynthia Kane
Copyeditor: Composure Graphics
Proofreader: Katie Tennant
Composition: Composure Graphics
Cover designer: Marija Tudor

ISBN: 9781935182191
Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 - MAL - 15 14 13 12 11 10

brief contents

PART 1 HADOOP - A DISTRIBUTED PROGRAMMING FRAMEWORK
  1 Introducing Hadoop
  2 Starting Hadoop
  3 Components of Hadoop

PART 2 HADOOP IN ACTION
  4 Writing basic MapReduce programs
  5 Advanced MapReduce
  6 Programming Practices
  7 Cookbook
  8 Managing Hadoop

PART 3 HADOOP GONE WILD
  9 Running Hadoop in the cloud
  10 Programming with Pig
  11 Hive and the Hadoop herd
  12 Case studies

contents

preface
acknowledgments
about this book
Author Online
About the author
About the cover illustration

PART 1 HADOOP - A DISTRIBUTED PROGRAMMING FRAMEWORK

1 Introducing Hadoop
  1.1 Why "Hadoop in Action"?
  1.2 What is Hadoop?
  1.3 Understanding distributed systems and Hadoop
  1.4 Comparing SQL databases and Hadoop
  1.5 Understanding MapReduce
      Scaling a simple program manually
      Scaling the same program in MapReduce
  1.6 Counting words with Hadoop - running your first program
  1.7 History of Hadoop
  1.8 Summary
  1.9 Resources

2 Starting Hadoop
  2.1 The building blocks of Hadoop
      NameNode
      DataNode
      Secondary NameNode
      JobTracker
      TaskTracker
  2.2 Setting up SSH for a Hadoop cluster
      Define a common account
      Verify SSH installation
      Generate SSH key pair
      Distribute public key and validate logins
  2.3 Running Hadoop
      Local (standalone) mode
      Pseudo-distributed mode
      Fully distributed mode
  2.4 Web-based cluster UI
  2.5 Summary

3 Components of Hadoop
  3.1 Working with files in HDFS
      Basic file commands
      Reading and writing to HDFS programmatically
  3.2 Anatomy of a MapReduce program
      Hadoop data types
      Mapper
      Reducer
      Partitioner - redirecting output from Mapper
      Combiner - local reduce
      Word counting with predefined mapper and reducer classes
  3.3 Reading and writing
      InputFormat
      OutputFormat
  3.4 Summary

PART 2 HADOOP IN ACTION

4 Writing basic MapReduce programs
  4.1 Getting the patent data set
      The patent citation data
      The patent description data
  4.2 Constructing the basic template of a MapReduce program
  4.3 Counting things
  4.4 Adapting for Hadoop's API changes
  4.5 Streaming in Hadoop
      Streaming with Unix commands
      Streaming with scripts
      Streaming with key/value pairs
      Streaming with the Aggregate package
  4.6 Improving performance with combiners
  4.7 Exercising what you've learned
  4.8 Summary
  4.9 Further resources

5 Advanced MapReduce
  5.1 Chaining MapReduce jobs
      Chaining MapReduce jobs in a sequence
      Chaining MapReduce jobs with complex dependency
      Chaining preprocessing and postprocessing steps
  5.2 Joining data from different sources
      Reduce-side joining
      Replicated joins using DistributedCache
      Semijoin: reduce-side join with map-side filtering
  5.3 Creating a Bloom filter
      What does a Bloom filter do?
      Implementing a Bloom filter
      Bloom filter in Hadoop version 0.20+
  5.4 Exercising what you've learned
  5.5 Summary
  5.6 Further resources

6 Programming Practices
  6.1 Developing MapReduce programs
      Local mode
      Pseudo-distributed mode
  6.2 Monitoring and debugging on a production cluster
      Counters
      Skipping bad records
      Rerunning failed tasks with IsolationRunner
  6.3 Tuning for performance
      Reducing network traffic with combiner
      Reducing the amount of input data
      Using compression
      Reusing the JVM
      Running with speculative execution
      Refactoring code and rewriting algorithms
  6.4 Summary

7 Cookbook
  7.1 Passing job-specific parameters to your tasks
  7.2 Probing for task-specific information
  7.3 Partitioning into multiple output files
  7.4 Inputting from and outputting to a database
  7.5 Keeping all output in sorted order
  7.6 Summary

Points/C-coins required: 9   Uploaded: 2013-03-15   File size: 15.01 MB
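hadoop in action

A detailed introduction to Hadoop and its applications, from Hadoop in Action, including programming practices and case studies.

Hadoop in Action, Chinese e-book edition

Hadoop in Action, Chinese edition, e-book.

Hadoop实战: Hadoop in Action

Hadoop实战: Hadoop in Action

Hadoop in Action, Chinese photo-reprint edition

Hadoop in Action, Chinese photo-reprint edition.

Hadoop实战 (Hadoop in Action)

A personally curated collection. Books included: Hadoop in Action.pdf, Hadoop实战 第2版 (2nd edition), and the source code for Hadoop实战 第2版.

Hadoop In Action2

Hadoop In Action, Chinese 2nd edition, volume 2 (RAR archive).

Hadoop In ACTION, Chinese edition

This book is a systematic, highly practical Hadoop handbook and reference. It covers the whole Hadoop technology stack: not only the two core components, HDFS and MapReduce, but also Hadoop-related subprojects such as Hive, HBase, Mahout, Pig, ZooKeeper, Avro, and Chukwa.

Hadoop in Action (English edition)

English-language e-book: Hadoop in Action.

Hadoop In Action (Hadoop实战), Chinese edition

Titles in the In Action series are reliably excellent; this one takes you deep into the world of Hadoop.

Hadoop in Action

Hadoop in Action.pdf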