Learning.Hadoop.2
Title: Learning Hadoop 2
Authors: Gabriele Modena, Garry Turkington
Length: 316 pages
Edition: 1
Language: English
Publisher: Packt Publishing
Publication Date: 2014-12-29
ISBN-10: 1783285516
ISBN-13: 9781783285518

Design and implement data processing, lifecycle management, and analytic workflows with the cutting-edge toolbox of Hadoop 2

About This Book
- Construct state-of-the-art applications using higher-level interfaces and tools beyond the traditional MapReduce approach
- Use the unique features of Hadoop 2 to model and analyze Twitter's global stream of user-generated data
- Develop a prototype on a local cluster and deploy to the cloud (Amazon Web Services)

Who This Book Is For
If you are a system or application developer interested in learning how to solve practical problems using the Hadoop framework, then this book is ideal for you. You are expected to be familiar with the Unix/Linux command-line interface and have some experience with the Java programming language. Familiarity with Hadoop would be a plus.

In Detail
This book introduces you to the world of building data-processing applications with the wide variety of tools supported by Hadoop 2. Starting with the core components of the framework, HDFS and YARN, this book will guide you through how to build applications using a variety of approaches. You will learn how YARN completely changes the relationship between MapReduce and Hadoop and allows the latter to support more varied processing approaches and a broader array of applications. These include real-time processing with Apache Samza and iterative computation with Apache Spark. Next up, we discuss Apache Pig and the dataflow data model it provides. You will discover how to use Pig to analyze a Twitter dataset. With this book, you will be able to make your life easier by using tools such as Apache Hive, Apache Oozie, Hadoop Streaming, Apache Crunch, and Kite SDK. The last part of this book discusses the likely future direction of major Hadoop components and how to get involved with the Hadoop community.

Table of Contents
- Chapter 1. Introduction
- Chapter 2. Storage
- Chapter 3. Processing – MapReduce and Beyond
- Chapter 4. Real-Time Computation with Samza
- Chapter 5. Iterative Computation with Spark
- Chapter 6. Data Analysis with Apache Pig
- Chapter 7. Hadoop and SQL
- Chapter 8. Data Lifecycle Management
- Chapter 9. Making Development Easier
- Chapter 10. Running a Hadoop Cluster
- Chapter 11. Where to Go Next
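- Duibaba (2015-05-16): Good book. Still working through it; thanks for sharing.
- XiYangWuYu (2015-06-20): From late 2014, a new edition. Recommended.
- erikaeriga (2015-05-18): This book is very recent (late 2014) and describes Hadoop's architecture and low-level details at length. Recommended!
- DViewer (2017-02-25): Covers YARN. A good book.
- tangtong-xj (2015-10-29): A new book for getting to know the recent changes in Hadoop.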