Chapter 1, Getting Up and Running with Spark, shows how to install and set up a local development environment for the Spark framework as well as how to create a Spark cluster in the cloud using Amazon EC2. The Spark programming model and API will be introduced, and a simple Spark application will be created using each of Scala, Java, and Python.

Chapter 2, Designing a Machine Learning System, presents an example of a real-world use case for a machine learning system. We will design a high-level architecture for an intelligent system in Spark based on this illustrative use case.

Chapter 3, Obtaining, Processing, and Preparing Data with Spark, details how to go about obtaining data for use in a machine learning system, in particular from various freely and publicly available sources. We will learn how to process, clean, and transform the raw data into features that may be used in machine learning models, using available tools, libraries, and Spark's functionality.

Chapter 4, Building a Recommendation Engine with Spark, deals with creating a recommendation model based on the collaborative filtering approach. This model will be used to recommend items to a given user as well as create lists of items that are similar to a given item. Standard metrics to evaluate the performance of a recommendation model will be covered here.

Chapter 5, Building a Classification Model with Spark, details how to create a model for binary classification as well as how to utilize standard performance-evaluation metrics for classification tasks.

Chapter 6, Building a Regression Model with Spark, shows how to create a model for regression, extending the classification model created in Chapter 5, Building a Classification Model with Spark. Evaluation metrics for the performance of regression models will be detailed here.

Chapter 7, Building a Clustering Model with Spark, explores how to create a clustering model as well as how to use related evaluation methodologies. You will learn how to analyze and visualize the clusters generated.

Chapter 8, Dimensionality Reduction with Spark, takes us through methods to extract the underlying structure from and reduce the dimensionality of our data. You will learn some common dimensionality-reduction techniques and how to apply and analyze them, as well as how to use the resulting data representation as input to another machine learning model.

Chapter 9, Advanced Text Processing with Spark, introduces approaches to deal with large-scale text data, including techniques for feature extraction from text and dealing with the very high-dimensional features typical in text data.

Chapter 10, Real-time Machine Learning with Spark Streaming, provides an overview of Spark Streaming and how it fits in with the online and incremental learning approaches to apply machine learning on data streams.