Spark: The Definitive Guide: Big Data Processing Made Simple (English, high-definition PDF edition)

Required points/C-coins: 49 | Uploaded 2017-06-16 20:03:22 | 4.46 MB | PDF

Spark: The Definitive Guide: Big Data Processing Made Simple
English | Oct. 2017 | ISBN-10: 1491912219 | 450 pages | PDF | 4.46 MB

Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of this open-source cluster-computing framework. With an emph
   Spark Applications
   3. Using Spark from Scala, Java, SQL, Python, or R
      1. Key Concepts
   4. Starting Spark
   5. SparkSession
   6. DataFrames
      1. Partitions
   7. Transformations
      1. Lazy Evaluation
   8. Actions
   9. Spark UI
   10. A Basic Transformation Data Flow
   11. DataFrames and SQL
2. Structured API Overview
   1. Spark's Structured APIs
   2. DataFrames and Datasets
   3. Schemas
   4. Overview of Structured Spark Types
      1. Columns
      2. Rows
      3. Spark Value Types
      4. Encoders
   5. Overview of Spark Execution
      1. Logical Planning
      2. Physical Planning
      3. Execution
3. Basic Structured Operations
   1. Chapter Overview
   2. Schemas
   3. Columns and Expressions
      1. Columns
      2. Expressions
   4. Records and Rows
      1. Creating Rows
   5. DataFrame Transformations
      1. Creating DataFrames
      2. Select and selectExpr
      3. Converting to Spark Types (Literals)
      4. Adding Columns
      5. Renaming Columns
      6. Reserved Characters and Keywords in Column Names
      7. Removing Columns
      8. Changing a Column's Type (cast)
      9. Filtering Rows
      10. Getting Unique Rows
      11. Random Samples
      12. Random Splits
      13. Concatenating and Appending Rows to a DataFrame
      14. Sorting Rows
      15. Limit
      16. Repartition and Coalesce
      17. Collecting Rows to the Driver
4. Working with Different Types of Data
   1. Chapter Overview
      1. Where to Look for APIs
   2. Working with Booleans
   3. Working with Numbers
   4. Working with Strings
      1. Regular Expressions
   5. Working with Dates and Timestamps
   6. Working with Nulls in Data
      1. Drop
      2. Fill
      3. Replace
   7. Working with Complex Types
      1. Structs
      2. Arrays
      3. split
      4. Array Contains
      5. Explode
      6. Maps
   8. Working with JSON
   9. User-Defined Functions
5. Aggregations
   1. What Are Aggregations?
   2. Aggregation Functions
      1. count
      2. Count Distinct
      3. Approximate Count Distinct
      4. First and Last
      5. Min and Max
      6. Sum
      7. sumDistinct
      8. Average
      9. Variance and Standard Deviation
      10. Skewness and Kurtosis
      11. Covariance and Correlation
      12. Aggregating to Complex Types
   3. Grouping
      1. Grouping with Expressions
      2. Grouping with Maps
   4. Window Functions
      1. Rollups
      2. Cube
      3. Pivot
   5. User-Defined Aggregation Functions
6. Joins
   1. What Is a Join?
      1. Join Expressions
      2. Join Types
   2. Inner Joins
   3. Outer Joins
   4. Left Outer Joins
   5. Right Outer Joins
   6. Left Semi Joins
   7. Left Anti Joins
   8. Cross (Cartesian) Joins
   9. Challenges with Joins
      1. Joins on Complex Types
      2. Handling Duplicate Column Names
   10. How Spark Performs Joins
      1. Node-to-Node Communication Strategies
7. Data Sources
   1. The Data Source APIs
      1. Basics of Reading Data
      2. Basics of Writing Data
      3. Options
   2. CSV Files
      1. CSV Options
      2. Reading CSV Files
      3. Writing CSV Files
   3. JSON Files
      1. JSON Options
      2. Reading JSON Files
      3. Writing JSON Files
   4. Parquet Files
      1. Reading Parquet Files
      2. Writing Parquet Files
   5. ORC Files
      1. Reading ORC Files
      2. Writing ORC Files
   6. SQL Databases
      1. Reading from SQL Databases
      2. Query Pushdown
      3. Writing to SQL Databases
   7. Text Files
      1. Reading Text Files
      2. Writing Out Text Files
   8. Advanced IO Concepts
      1. Reading Data in Parallel
      2. Writing Data in Parallel
      3. Writing Complex Types
8. Spark SQL
   1. Spark SQL Concepts
      1. What Is SQL?
      2. Big Data and SQL: Hive
      3. Big Data and SQL: Spark SQL
   2. How to Run Spark SQL Queries
      1. Spark SQL Thrift JDBC/ODBC Server
      2. Spark SQL CLI
      3. Spark's Programmatic SQL Interface
   3. Tables
      1. Creating Tables
      2. Inserting Into Tables
      3. Describing Table Metadata
      4. Refreshing Table Metadata
      5. Dropping Tables
   4. Views
      1. Creating Views
      2. Dropping Views
   5. Databases
      1. Creating Databases
      2. Setting the Database
      3. Dropping Databases
   6. Select Statements
      1. Case When Then Statements
   7. Advanced Topics
      1. Complex Types
      2. Functions
      3. Spark Managed Tables
      4. Subqueries
      5. Correlated Predicated Subqueries
   8. Conclusion
9. Datasets
   1. What Are Datasets?
      1. Encoders
   2. Creating Datasets
      1. Case Classes
   3. Actions
   4. Transformations
      1. Filtering
      2. Mapping
   5. Joins
   6. Grouping and Aggregations
      1. When to Use Datasets
10. Low Level API Overview
   1. The Low Level APIs
      1. When to Use the Low Level APIs?
   2. The SparkConf
   3. The SparkContext
   4. Resilient Distributed Datasets
   5. Broadcast Variables
   6. Accumulators
11. Basic RDD Operations
   1. RDD Overview
      1. Python vs Scala/Java
   2. Creating RDDs
      1. From a Collection
      2. From Data Sources
   3. Manipulating RDDs
   4. Transformations
      1. Distinct
      2. Filter
      3. Map
      4. Sorting
      5. Random Splits
   5. Actions
      1. Reduce
      2. Count
      3. first
      4. Max and Min
      5. Take
   6. Saving Files
      1. saveAsTextFile
      2. SequenceFiles
      3. Hadoop Files
   7. Caching
   8. Interoperating between DataFrames, Datasets, and RDDs
   9. When to Use RDDs
      1. Performance Considerations: Scala vs Python
      2. RDD of Case Class vs Dataset
12. Advanced RDDs Operations
   1. Advanced "Single RDD" Operations
      1. Pipe RDDs to System Commands
      2. mapPartitions
      3. foreachPartition
      4. glom
   2. Key Value Basics (Key-Value RDDs)
      1. keyBy
      2. Mapping over Values
      3. Extracting Keys and Values
      4. Lookup
   3. Aggregations
      1. countByKey
      2. Understanding Aggregation Implementations
      3. aggregate
      4. aggregateByKey
      5. combineByKey
      6. foldByKey
      7. sampleByKey
   4. CoGroups
   5. Joins
      1. Inner Join
      2. zips
   6. Controlling Partitions
      1. coalesce
   7. repartitionAndSortWithinPartitions
      1. Custom Partitioning
   8. repartitionAndSortWithinPartitions
   9. Serialization
13. Distributed Variables
   1. Chapter Overview
   2. Broadcast Variables
   3. Accumulators
      1. Basic Example
      2. Custom Accumulators
14. Advanced Analytics and Machine Learning
   1. The Advanced Analytics Workflow
   2. Different Advanced Analytics Tasks
      1. Supervised Learning
      2. Recommendation
      3. Unsupervised Learning
      4. Graph Analysis
   3. Spark's Packages for Advanced Analytics
      1. What Is MLlib?
   4. High Level MLlib Concepts
   5. MLlib in Action
      1. Transformers
      2. Estimators
      3. Pipelining Our Workflow
      4. Evaluators
      5. Persisting and Applying Models
   6. Deployment Patterns
15. Preprocessing and Feature Engineering
   1. Formatting Your Models According to Your Use Case
   2. Properties of Transformers
   3. Different Transformer Types
   4. High Level Transformers
      1. RFormula
      2. SQLTransformers
      3. VectorAssembler
   5. Text Data Transformers
      1. Tokenizing Text
      2. Removing Common Words
      3. Creating Word Combinations
      4. Converting Words into Numbers
   6. Working with Continuous Features
      1. Bucketing
      2. Scaling and Normalization
      3. StandardScaler
   7. Working with Categorical Features
      1. StringIndexer
      2. Converting Indexed Values Back to Text
      3. Indexing in Vectors
      4. One Hot Encoding
   8. Feature Generation

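    linfang010: A very authoritative guide!
    2017-11-13
    oldunix: Downloaded it; this is not the final release!
    2017-10-10
    guyu1: The typesetting is poor, but it is fine for skimming the content.
    2017-09-14
    dddd111ss: The typesetting is very poor; not worth it. Better to wait for the final release.
    2017-08-31
    nlslzf: The final release is coming soon anyway.
    2017-08-27
    henryatzzz: This is the early release, not the official published edition. Disappointing.
    2017-08-18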