Apache Oozie.pdf
DATA
Apache Oozie
ISBN: 978-1-449-36992-7
US $39.99 CAN $45.99
“In this book, the authors have striven for practicality, focusing on the
concepts, principles, tips, and tricks that developers need to get the most
out of Oozie. A volume such as this is long overdue. Developers will get a
lot more out of the Hadoop ecosystem by reading it.”
—Raymie Stata
CEO, Altiscale
“Oozie simplifies the managing and automating of complex Hadoop workloads.
This greatly benefits both developers and operators alike.”
—Alejandro Abdelnur
Creator of Apache Oozie
Twitter: @oreillymedia
facebook.com/oreilly
Get a solid grounding in Apache Oozie, the workflow scheduler system for
managing Hadoop jobs. In this hands-on guide, two experienced Hadoop
practitioners walk you through the intricacies of this powerful and flexible
platform, with numerous examples and real-world use cases.
Once you set up your Oozie server, you’ll dive into techniques for writing
and coordinating workflows, and learn how to write complex data pipelines.
Advanced topics show you how to handle shared libraries in Oozie, as well
as how to implement and manage Oozie’s security capabilities.
■ Install and configure an Oozie server, and get an overview of basic concepts
■ Journey through the world of writing and configuring workflows
■ Learn how the Oozie coordinator schedules and executes workflows based on triggers
■ Understand how Oozie manages data dependencies
■ Use Oozie bundles to package several coordinator apps into a data pipeline
■ Learn about security features and shared library management
■ Implement custom extensions and write your own EL functions and actions
■ Debug workflows and manage Oozie’s operational details
Mohammad Kamrul Islam works as a Staff Software Engineer in the data
engineering team at Uber. He’s been involved with the Hadoop ecosystem
since 2009, and is a PMC member and a respected voice in the Oozie community.
He has worked in the Hadoop teams at LinkedIn and Yahoo.
Aravind Srinivasan is a Lead Application Architect at Altiscale, a Hadoop-
as-a-service company, where he helps customers with Hadoop application
design and architecture. He has been involved with Hadoop in general and
Oozie in particular since 2008.
Mohammad Kamrul Islam & Aravind Srinivasan
Apache Oozie
THE WORKFLOW SCHEDULER FOR HADOOP
www.it-ebooks.info
978-1-449-36992-7
[LSI]
Apache Oozie
by Mohammad Kamrul Islam and Aravind Srinivasan
Copyright © 2015 Mohammad Islam and Aravindakshan Srinivasan. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are
also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/
institutional sales department: 800-998-9938 or corporate@oreilly.com.
Editors: Mike Loukides and Marie Beaugureau
Production Editor: Colleen Lobner
Copyeditor: Gillian McGarvey
Proofreader: Jasmine Kwityn
Indexer: Lucie Haskins
Interior Designer: David Futato
Cover Designer: Ellie Volckhausen
Illustrator: Rebecca Demarest
May 2015: First Edition
Revision History for the First Edition
2015-05-08: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781449369927 for release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Apache Oozie, the cover image of a
binturong, and related trade dress are trademarks of O’Reilly Media, Inc.
While the publisher and the authors have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility
for errors or omissions, including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this work is at your own
risk. If any code samples or other technology this work contains or describes is subject to open source
licenses or the intellectual property rights of others, it is your responsibility to ensure that your use
thereof complies with such licenses and/or rights.
Table of Contents
Foreword
Preface
1. Introduction to Oozie
  Big Data Processing
  A Recurrent Problem
  A Common Solution: Oozie
  A Simple Oozie Job
  Oozie Releases
  Some Oozie Usage Numbers
2. Oozie Concepts
  Oozie Applications
  Oozie Workflows
  Oozie Coordinators
  Oozie Bundles
  Parameters, Variables, and Functions
  Application Deployment Model
  Oozie Architecture
3. Setting Up Oozie
  Oozie Deployment
  Basic Installations
  Requirements
  Build Oozie
  Install Oozie Server
  Hadoop Cluster
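Resource reviews
- 和蔼的玉兔君 (2017-09-28): Reading an introduction to Oozie is still no substitute for working through it hands-on.
- cangkutou (2021-05-14): This is the English edition, and the listing does not say so.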