Required points/C-coins: 19 2016-04-15 15:17:39 7.92MB PDF

Summary

Storm Applied is a practical guide to using Apache Storm for the real-world tasks associated with processing and analyzing real-time data streams. This immediately useful book starts by building a solid foundation of Storm essentials so that you learn how to think about designing Storm solutions the right way from day one. But it quickly dives into real-world case studies that will bring the novice up to speed with productionizing Storm.

About the Technology

It's hard to make sense out of data when it's coming at you fast. Like Hadoop, Storm processes large amounts of data, but it does it reliably and in real time, guaranteeing that every message will be processed. Storm allows you to scale with your data as it grows, making it an excellent platform to solve your big data problems.

About the Book

Storm Applied is an example-driven guide to processing and analyzing real-time data streams. This immediately useful book starts by teaching you how to design Storm solutions the right way. Then, it quickly dives into real-world case studies that show you how to scale a high-throughput stream processor, ensure smooth operation within a production cluster, and more. Along the way, you'll learn to use Trident for stateful stream processing, along with other tools from the Storm ecosystem.

This book moves through the basics quickly. While prior experience with Storm is not assumed, some experience with big data and real-time systems is helpful.

Table of Contents

Chapter 1 Introducing Storm
Chapter 2 Core Storm concepts
Chapter 3 Topology design
Chapter 4 Creating robust topologies
Chapter 5 Moving from local to remote topologies
Chapter 6 Tuning in Storm
Chapter 7 Resource contention
Chapter 8 Storm internals
Storm Applied
Strategies for real-time event processing

SEAN T. ALLEN
MATTHEW JANKOWSKI
PETER PATHIRANA

MANNING
SHELTER ISLAND

For online information and ordering of this and other Manning books, please visit
For more information, please contact

Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964

©2015 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964

Development editor: Dan Maharry
Technical development editor: Aaron Colcord
Copyeditor: Elizabeth Welch
Proofreader: Melody Dolab
Technical proofreader: Michael Rose
Typesetter: Dennis Dalinnik
Cover designer: Marija Tudor

ISBN: 9781617291890
Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 - EBM - 20 19 18 17 16 15

brief contents

1 Introducing Storm
2 Core Storm concepts
3 Topology design
4 Creating robust topologies
5 Moving from local to remote topologies
6 Tuning in Storm
7 Resource contention
8 Storm internals
9 Trident

contents

foreword
preface
acknowledgments
about this book
about the cover illustration

1 Introducing Storm
  1.1 What is big data?
      The four Vs of big data · Big data tools
  1.2 How Storm fits into the big data picture
      Storm vs. the usual suspects
  1.3 Why you'd want to use Storm
  1.4 Summary

2 Core Storm concepts
  2.1 Problem definition: GitHub commit count dashboard
      Data: starting and ending points · Breaking down the problem
  2.2 Basic Storm concepts
      Topology · Tuple · Stream · Spout · Bolt · Stream grouping
  2.3 Implementing a GitHub commit count dashboard in Storm
      Setting up a Storm project · Implementing the spout · Implementing the bolts · Wiring everything together to form the topology
  2.4 Summary

3 Topology design
  3.1 Approaching topology design
  3.2 Problem definition: a social heat map
      Formation of a conceptual solution
  3.3 Precepts for mapping the solution to Storm
      Consider the requirements imposed by the data stream · Represent data points as tuples · Steps for determining the topology composition
  3.4 Initial implementation of the design
      Spout: read data from a source · Bolt: connect to an external service · Bolt: collect data in-memory · Bolt: persisting to a data store · Defining stream groupings between the components · Building a topology for running in local cluster mode
  3.5 Scaling the topology
      Understanding parallelism in Storm · Adjusting the topology to address bottlenecks inherent within design · Adjusting the topology to address bottlenecks inherent within a data stream
  3.6 Topology design paradigms
      Design by breakdown into functional components · Design by breakdown into components at points of repartition · Simplest functional components vs. lowest number of repartitions
  3.7 Summary

4 Creating robust topologies
  4.1 Requirements for reliability
      Pieces of the puzzle for supporting reliability
  4.2 Problem definition: a credit card authorization system
      A conceptual solution with retry characteristics · Defining the data points · Mapping the solution to Storm with retry characteristics
  4.3 Basic implementation of the bolts
      The AuthorizeCreditCard implementation · The ProcessedOrderNotification implementation
  4.4 Guaranteed message processing
      Tuple states: fully processed vs. failed · Anchoring, acking, and failing tuples in our bolts · A spout's role in guaranteed message processing
  4.5 Replay semantics
      Degrees of reliability in Storm · Examining exactly once processing in a Storm topology · Examining the reliability guarantees in our topology
  4.6 Summary

5 Moving from local to remote topologies
  5.1 The Storm cluster
      The anatomy of a worker node · Presenting a worker node within the context of the credit card authorization topology
  5.2 Fail-fast philosophy for fault tolerance within a Storm cluster
  5.3 Installing a Storm cluster
      Setting up a Zookeeper cluster · Installing the required Storm dependencies to master and worker nodes · Installing Storm to master and worker nodes · Configuring the master and worker nodes via storm.yaml · Launching Nimbus and Supervisors under supervision
  5.4 Getting your topology to run on a Storm cluster
      Put together the topology components · Running topologies in local mode · Running topologies on a remote Storm cluster · Deploying a topology to a remote Storm cluster
  5.5 The Storm UI and its role in the Storm cluster
      Storm UI: the Storm cluster summary · Storm UI: individual Topology summary · Storm UI: individual spout/bolt summary
  5.6 Summary

Preview (127 pages): Storm.Applied.Strategies.for.real-time.event.processing
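xuxinhai: Very good, really nice.
zliyll: Books on Storm are very scarce, and this is one of the better ones. Thanks to the uploader for sharing it.
学习先生: It's the original edition, and very clear.
wufengWHU: The Trident examples in it are very detailed and solved problems I had at work.
花花小Boy: It's the original edition, and very clear.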