Feature Engineering for Machine Learning
Principles and Techniques for Data Scientists
Alice Zheng and Amanda Casari
Beijing Boston Farnham Sebastopol Tokyo
978-1-491-95324-2
[LSI]
Feature Engineering for Machine Learning
by Alice Zheng and Amanda Casari
Copyright © 2018 Alice Zheng, Amanda Casari. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are
also available for most titles (http://oreilly.com/safari). For more information, contact our
corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
Editors: Rachel Roumeliotis and Jeff Bleiel
Production Editor: Kristen Brown
Copyeditor: Rachel Head
Proofreader: Sonia Saruba
Indexer: Ellen Troutman
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest
April 2018: First Edition
Revision History for the First Edition
2018-03-23: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781491953242 for release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Feature Engineering for Machine
Learning, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.
While the publisher and the authors have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility
for errors or omissions, including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this work is at your own
risk. If any code samples or other technology this work contains or describes is subject to open source
licenses or the intellectual property rights of others, it is your responsibility to ensure that your use
thereof complies with such licenses and/or rights.
Table of Contents
Preface
1. The Machine Learning Pipeline
  Data
  Tasks
  Models
  Features
  Model Evaluation
2. Fancy Tricks with Simple Numbers
  Scalars, Vectors, and Spaces
  Dealing with Counts
    Binarization
    Quantization or Binning
  Log Transformation
    Log Transform in Action
    Power Transforms: Generalization of the Log Transform
  Feature Scaling or Normalization
    Min-Max Scaling
    Standardization (Variance Scaling)
    ℓ2 Normalization
  Interaction Features
  Feature Selection
  Summary
  Bibliography
3. Text Data: Flattening, Filtering, and Chunking
  Bag-of-X: Turning Natural Text into Flat Vectors