Alice Zheng and Amanda Casari
Feature Engineering for Machine Learning
Principles and Techniques for Data Scientists
Beijing, Boston, Farnham, Sebastopol, Tokyo
978-1-491-95324-2
[LSI]
Feature Engineering for Machine Learning
by Alice Zheng and Amanda Casari
Copyright © 2018 Alice Zheng, Amanda Casari. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are
also available for most titles (http://oreilly.com/safari). For more information, contact our
corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
Editors: Rachel Roumeliotis and Jeff Bleiel
Production Editor: Kristen Brown
Copyeditor: Rachel Head
Proofreader: Sonia Saruba
Indexer: Ellen Troutman
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest
April 2018: First Edition
Revision History for the First Edition
2018-03-23: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781491953242 for release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Feature Engineering for Machine
Learning, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.
While the publisher and the authors have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility
for errors or omissions, including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this work is at your own
risk. If any code samples or other technology this work contains or describes is subject to open source
licenses or the intellectual property rights of others, it is your responsibility to ensure that your use
thereof complies with such licenses and/or rights.
Table of Contents
Preface
1. The Machine Learning Pipeline
    Data
    Tasks
    Models
    Features
    Model Evaluation
2. Fancy Tricks with Simple Numbers
    Scalars, Vectors, and Spaces
    Dealing with Counts
    Binarization
    Quantization or Binning
    Log Transformation
    Log Transform in Action
    Power Transforms: Generalization of the Log Transform
    Feature Scaling or Normalization
    Min-Max Scaling
    Standardization (Variance Scaling)
    ℓ2 Normalization
    Interaction Features
    Feature Selection
    Summary
    Bibliography
3. Text Data: Flattening, Filtering, and Chunking
    Bag-of-X: Turning Natural Text into Flat Vectors