1. Introducing Spark Streaming
1. Large-scale data analytics and Apache Spark
2. More than MapReduce: how the model came about and how Spark extends it
1. A Fault-tolerant MapReduce cluster
2. A distributed file system
3. Two higher-order functions
3. Optimizations in a reduce operation
1. Associativity: a necessary condition
2. Shuffling
3. Map-side combiner
4. To learn more about MapReduce
1. The Spark ecosystem, approach and polyglot APIs
2. Multiple frameworks, and a framework scheduler
3. A Data Processing engine
4. A polyglot API
5. A MapReduce extension
6. A SQL interface, expanding into a DataFrame interface
7. A Real Time processing engine
8. In-memory computing, with impact on processing speed and latency
9. MapReduce and memory legacy
10. Spark’s Memory Usage
11. A customizable cache
12. Operation Latency
5. How Spark Streaming fits in the Big Picture
1. Micro-batching
2. A strong Streaming characteristic
3. A minimal delay
4. Throughput-oriented tasks
6. Why you would want to use Spark Streaming
1. Building a pipeline
2. Productive deployment of pipelines
3. Productive implementation of data analysis
7. To learn more about Spark
8. Conclusion
9. Bibliography
2. Core Spark Streaming concepts
1. Apache Spark RDDs
1. Resilient Distributed Datasets
2. Transformations and Actions
3. The Shuffle
4. Partitions
5. Debugging RDDs
6. Witnessing caching
2. Spark Streaming Clusters
1. The Standalone Spark cluster
2. Yet Another Resource Negotiator (YARN)
3. Apache Mesos
4. Spark Streaming: a delicate deployment
3. To learn more about running Spark on a cluster
4. Fundamentals of a DStream
1. A Bulk-synchronous model
2. The Spark Streaming Context