
Data-Science-from-Scratch-First-Principles-with-Python.pdf.pdf

Data-Science-from-Scratch-First-Principles-with-Python.pdf
Data Science from Scratch
Joel Grus

O'Reilly: Beijing, Cambridge, Farnham, Köln, Sebastopol, Tokyo

Data Science from Scratch
by Joel Grus

Copyright © 2015 O'Reilly Media. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editor: Marie Beaugureau
Production Editor: Melanie Yarbrough
Copyeditor: Nan Reinhardt
Proofreader: Eileen Cohen
Indexer: Ellen Troutman-Zaig
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

April 2015: First Edition

Revision History for the First Edition
2015-04-10: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781491901427 for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Data Science from Scratch, the cover image of a Rock Ptarmigan, and related trade dress are trademarks of O'Reilly Media, Inc.

While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-90142-7
[LSI]

Table of Contents

Preface

1. Introduction
  The Ascendance of Data
  What Is Data Science?
  Motivating Hypothetical: DataSciencester
  Finding Key Connectors
  Data Scientists You May Know
  Salaries and Experience
  Paid Accounts
  Topics of Interest
  Onward

2. A Crash Course in Python
  The Basics
  Getting Python
  The Zen of Python
  Whitespace Formatting
  Arithmetic
  Functions
  Strings
  Exceptions
  Lists
  Tuples
  Dictionaries
  Sets
  Control Flow
  Truthiness
  The Not-So-Basics
  Sorting
  List Comprehensions
  Generators and Iterators
  Randomness
  Regular Expressions
  Object-Oriented Programming
  Functional Tools
  enumerate
  zip and Argument Unpacking
  args and kwargs
  Welcome to DataSciencester
  For Further Exploration

3. Visualizing Data
  matplotlib
  Bar Charts
  Line Charts
  Scatterplots
  For Further Exploration

4. Linear Algebra
  Vectors
  Matrices
  For Further Exploration

5. Statistics
  Describing a Single Set of Data
  Central Tendencies
  Dispersion
  Correlation
  Simpson's Paradox
  Some Other Correlational Caveats
  Correlation and Causation
  For Further Exploration

6. Probability
  Dependence and Independence
  Conditional Probability
  Bayes's Theorem
  Random Variables
  Continuous Distributions
  The Normal Distribution
  The Central Limit Theorem
  For Further Exploration

7. Hypothesis and Inference
  Statistical Hypothesis Testing
  Example: Flipping a Coin
  Confidence Intervals
  P-hacking
  Example: Running an A/B Test
  Bayesian Inference
  For Further Exploration

8. Gradient Descent
  The Idea Behind Gradient Descent
  Estimating the Gradient
  Using the Gradient
  Choosing the Right Step Size
  Putting It All Together
  Stochastic Gradient Descent
  For Further Exploration

9. Getting Data
  stdin and stdout
  Reading Files
  The Basics of Text Files
  Delimited Files
  Scraping the Web
  HTML and the Parsing Thereof
  Example: O'Reilly Books About Data
  Using APIs
  JSON (and XML)
  Using an Unauthenticated API
  Finding APIs
  Example: Using the Twitter APIs
  Getting Credentials
  For Further Exploration

10. Working with Data
  Exploring Your Data
  Exploring One-Dimensional Data
  Two Dimensions
  Many Dimensions
  Cleaning and Munging
  Manipulating Data
  Rescaling
  Dimensionality Reduction
  For Further Exploration

11. Machine Learning
  Modeling
  What Is Machine Learning?
  Overfitting and Underfitting
  Correctness
  The Bias-Variance Trade-off
  Feature Extraction and Selection
  For Further Exploration

12. k-Nearest Neighbors
  The Model
  Example: Favorite Languages
  The Curse of Dimensionality
  For Further Exploration

13. Naive Bayes
  A Really Dumb Spam Filter
  A More Sophisticated Spam Filter
  Implementation
  Testing Our Model
  For Further Exploration

14. Simple Linear Regression
  The Model
  Using Gradient Descent
  Maximum Likelihood Estimation
  For Further Exploration

15. Multiple Regression
  The Model
  Further Assumptions of the Least Squares Model
  Fitting the Model
  Interpreting the Model
  Goodness of Fit
  Digression: The Bootstrap
  Standard Errors of Regression Coefficients
  Regularization
  For Further Exploration

16. Logistic Regression
  The Problem
  The Logistic Function
  Applying the Model
  Goodness of Fit
  Support Vector Machines
  For Further Investigation

17. Decision Trees
  What Is a Decision Tree?
  Entropy
  The Entropy of a Partition
  Creating a Decision Tree
  Putting It All Together
  Random Forests
  For Further Exploration

18. Neural Networks
  Perceptrons
  Feed-Forward Neural Networks
  Backpropagation
  Example: Defeating a CAPTCHA
  For Further Exploration

19. Clustering
  The Idea
  The Model
  Example: Meetups
  Choosing k
  Example: Clustering Colors
  Bottom-up Hierarchical Clustering
  For Further Exploration

20. Natural Language Processing
  Word Clouds
  n-gram Models
  Grammars
  An Aside: Gibbs Sampling
  Topic Modeling
  For Further Exploration

21. Network Analysis
  Betweenness Centrality
  Eigenvector Centrality
  Matrix Multiplication
  Centrality
  Directed Graphs and PageRank
  For Further Exploration

22. Recommender Systems
  Manual Curation
  Recommending What's Popular
  User-Based Collaborative Filtering
  Item-Based Collaborative Filtering
  For Further Exploration

23. Databases and SQL
  CREATE TABLE and INSERT
  UPDATE
  DELETE
  SELECT
  GROUP BY
  ORDER BY
  JOIN
  Subqueries
  Indexes
  Query Optimization
  NoSQL
  For Further Exploration

24. MapReduce
  Example: Word Count
  Why MapReduce?
  MapReduce More Generally
  Example: Analyzing Status Updates
  Example: Matrix Multiplication
  An Aside: Combiners
  For Further Exploration

Uploaded 2019-09-14. Size: 5.06 MB

Comments (1)

jaying6: This is the first edition. If you want the second edition, don't bother downloading it.
2019-11-30
Data Science from Scratch First Principles with Python (watermark-free PDF)

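Data Science from Scratch First Principles with Python, English edition, watermark-free PDF. All pages were tested and open correctly in FoxitReader and PDF-XChangeViewer. This resource was reposted from the web; in case of infringement, contact the uploader or CSDN to have it removed.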

数据科学入门 (Data Science from Scratch, Chinese edition).pdf

数据科学入门_P286_2016.03.pdf [OReilly].Data.Science.from.Scratch.First.Principles.with.Python.2015, Chinese edition

Data Science from Scratch First Principles with Python

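数据科学入门 (Data Science from Scratch), second edition: a substantial introduction to the fundamentals of data science, written by a Google data scientist. Data science is a fast-growing field with broad prospects, and some have called data scientist "the sexiest job of the 21st century." This book explains data science work from scratch, teaches the hacking skills that the work requires, and introduces readers to the core knowledge of data science: mathematics and statistics. The author chose Python, a powerful and easy-to-learn language, builds the tools and implements the algorithms by hand, and has selected well-commented, concise, and readable example implementations. All code and data covered in the book can be downloaded from GitHub. By reading this book, you can: take a crash course in Python; learn the basic methods of linear algebra, statistics, and probability, and understand how they are applied in data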

Data Science from Scratch - First Principles with Python.2015

Joel Grus
- Get a crash course in Python
- Learn the basics of linear algebra, statistics, and probability, and understand how and when they're used in data science
- Collect, explore, clean, munge, and manipulate data
- Dive into the fundamentals of machine learning
- Implement models such as

Data.Science.from.Scratch.First.Principles.with.Python

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by imple

Data Science from Scratch- First Principles with Python(O'Reilly,2015)

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by imple

The Enterprise Big Data Lake

The Enterprise Big Data Lake: Delivering the Promise of Big Data and Data Science. Author: Alex Gorelik. ISBN-10: 1491931558. ISBN-13: 9781491931554. Edition: 1. Publication date: 2019-03-24. Pages: 224. Price: $69.99. Enterprises are experimenting with using Hadoop to build Big Data Lakes, but many projects ar

Java: High-Performance Apps with Java 9

Java: High-Performance Apps with Java 9: Boost your application's performance with the new features of Java 9. Authors: Mayur Ramgir, Nick Samoylov. ISBN-10: 1789130514. ISBN-13: 9781789130515. Publication date: 2018-03-13. Pages: 257. Contents 1: LEARNING JAVA 9 UNDERLYING PERFORMANCE IMPROVEMENTS 2: TOOLS

2017 Machine Learning for OpenCV Intelligent image processing with Python

Machine Learning for OpenCV: Intelligent image processing with Python by Michael Beyeler (https://www.amazon.com/Machine-Learning-OpenCV-Intelligent-processing/dp/1783980281/ref=sr_1_1?s=amazon-devices&ie=UTF8&qid=1517710318&sr=8-1&keywords=opencv+machine+learning&dpID=41CKBKW8y4L&preST=_SX258_BO1

Software.Application.Development.A.Visual.Cplusplus.MFC.and.STL

Title: Software Application Development: A Visual C++, MFC, and STL Tutorial Author: Bud Fox Ph.D., Tan May Ling M.Sc., Zhang Wenzu Ph.D. Length: 1216 pages Edition: 1 Language: English Publisher: Chapman and Hall/CRC Publication Date: 2012-08-08 ISBN-10: 1466511001 ISBN-13: 9781466511002 Software
