Scala: Guide for Data Science Professionals

Points/C coins required: 28 | 2017-04-22 19:57:36 | 14.71 MB | PDF

"Scala: Guide for Data Science Professionals (Learning Path)" | ASIN: B06XCJVY21 | eISBN: 1787282856 | 2017 | True PDF | 1100 pages | 15 MB

Scala will be a valuable tool to have on hand during your data science journey for everything from data cleaning to cutting-edge machine learning.

Scala: Guide for Data Science Professionals

Copyright © 2017 Packt Publishing

All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this course to ensure the accuracy of the information presented. However, the information contained in this course is sold without warranty, either express or implied. Neither the authors nor Packt Publishing and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this course.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this course by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Published on: January 2017

Published by Packt Publishing Ltd.
35 Livery Street
Birmingham B3 2PB, UK

ISBN 978-1-78728-285-8

Credits

Authors: Pascal Bugnion, Arun Manivannan, Patrick R. Nicolas
Reviewers: Umanga Bista, Radek Ostrowski, Yuanhang Wang, Amir Hajian, Shams Mahmood Imam, Gerald Loeffler, Subhajit Datta, Rui Goncalves, Patricia Hoffman, PhD, Md Zahidul Islam
Content Development Editor: Trusha Shriyan
Graphics: Kirk D'Penha
Production Coordinator: Shantanu N. Zagade

Preface

Scala is a popular language for data science. By emphasizing immutability and functional constructs, Scala lends itself well to the construction of robust libraries for concurrency and big data analysis. A rich ecosystem of tools for data science has therefore developed around Scala, including libraries for accessing SQL and NoSQL databases, frameworks for building distributed applications like Apache Spark, and libraries for linear algebra and numerical algorithms.
We will explore this rich and growing ecosystem in this learning path.

What this learning path covers

Module 1, Scala for Data Science, will introduce you to the libraries for ingesting, storing, manipulating, processing, and visualizing data in Scala. Packed with real-world examples and interesting data sets, this module will teach you to ingest data from flat files and web APIs and store it in a SQL or NoSQL database. It will show you how to design scalable architectures to process and model your data, starting from simple concurrency constructs such as parallel collections and futures through to actor systems and Apache Spark. Alongside Scala's emphasis on functional structures and immutability, you will learn how to use the right parallel construct for the job at hand, minimizing development time without compromising scalability. Finally, you will learn how to build beautiful interactive visualizations using web frameworks. This module gives tutorials on some of the most common Scala libraries for data science, allowing you to quickly get up to speed with building data science and data engineering solutions.

Module 2, Scala Data Analysis Cookbook, will introduce you to the most popular Scala tools, libraries, and frameworks through practical recipes around loading, manipulating, and preparing your data. It will also help you explore and make sense of your data using stunning and insightful visualizations and machine learning toolkits. Starting with introductory recipes on utilizing the Breeze and Spark libraries, get to grips with how to import data from a host of possible sources and how to pre-process numerical, string, and date data. Next, you'll get an understanding of concepts that will help you visualize data using the Apache Zeppelin and Bokeh bindings in Scala, enabling exploratory data analysis.
Discover how to program quintessential machine learning algorithms using the Spark ML library. Work through steps to scale your machine learning models and deploy them into a standalone cluster, EC2, YARN, and Mesos. Finally, dip into the powerful options presented by Spark Streaming and machine learning for streaming data, as well as utilizing Spark GraphX.

Module 3, Scala for Machine Learning, will introduce you to the functional capabilities of the Scala programming language, such as dependency injection and implicits, that are critical to the creation of machine learning algorithms. Your learning journey starts with data pre-processing and filtering techniques, then moves on to clustering and dimension reduction, Naive Bayes, regression models, sequential data, regularization and kernelization, support vector machines, neural networks, genetic algorithms, and reinforcement learning. A review of the Akka framework and Apache Spark clusters concludes the tutorial. Techniques throughout the module are applied to the analysis, recommendation, classification, and prediction of financial markets. This module will guide you through the process of building AI applications with diagrams, formal mathematical notation, source code snippets, and useful tips.

What you need for this learning path

The examples provided in this learning path require that you have a working Scala installation and SBT, the Simple Build Tool, a command-line utility for compiling and running Scala code. We will walk you through how to install these in the next sections. We do not require a specific IDE. The code examples can be written in your favorite text editor or IDE.

Who this learning path is for

This learning path is perfect for those who are comfortable with Scala programming and now want to enter the field of data science. Some knowledge of statistics is [...]

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this course: what you liked or disliked.
Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, [...] and mention the course's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing [...]

Customer support

Now that you are the proud owner of a Packt course, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this course from your account at http://www.packtpub.com. If you purchased this course elsewhere, you can visit [...] to you.

You can download the code files by following these steps:

1. Log in or register to our website using your e-mail address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the course in the Search box.
5. Select the course for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this course from.
7. Click on Code Download.

You can also download the code files by clicking on the Code Files button on the course's webpage at the Packt Publishing website. This page can be accessed by entering the course's name in the Search box. Please note that you need to be logged in to your Packt account.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux

The code bundle for the course is also hosted on GitHub at https://github.com/PacktPublishing/Scala-Guide-for-Data-Science-Professionals. We also have other code bundles from our rich catalog of books, videos, and [...]

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our courses, whether in the text or the code, we would be grateful if you could report this to us.
By doing so, you can save other readers from frustration and help us improve subsequent versions of this course. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your course, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to [...]content/support and enter the name of the course in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at copyright@packtpub.com with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this course, you can contact us at [...], and we will do our best to address the problem.


Comments (1)

Skyssik: A really good resource.
