# Practical-Machine-Learning
This book is for professional and aspiring data scientists who want to learn the fundamentals of machine learning techniques and the most efficient ways of applying and implementing them on large datasets, hands-on, using the most relevant machine learning frameworks and tools on or off the Hadoop platform, given a problem definition. Readers are expected to have basic programming skills in Java; knowledge of a scripting language is a bonus.
This book explores the important machine learning techniques and the behavioral differences and implementation intricacies that arise with parallel or distributed processing. For each technique, alongside a deep dive into the internals of the algorithm, it explains example implementations using leading and evolving machine learning frameworks and tools such as R, SPSS, Apache Mahout, Python, Julia, and Spark. It helps readers master machine learning techniques and gain the ability to identify and apply the appropriate technique in a given problem context. For large datasets, it covers multi-core and cluster-based learning, distributed learning, and parallel computation tools and libraries. Readers are introduced to a list of machine learning frameworks; for each one, implementation aspects such as function libraries, syntax, installation or set-up, and integration with Hadoop (wherever applicable) are covered.
Until recently, the machine learning community assumed sequential algorithms running on data that fits in memory. This assumption is no longer realistic in many current scenarios, and it has brought some interesting perspectives to advanced machine learning. Despite this growing interest, there have not been many publications on how these solutions integrate with data management systems.
The success of data-driven solutions to complex problems, together with falling infrastructure and storage costs, has brought large-scale machine learning into focus. Below is a list of topics covered in this book:
1. Learn and master platforms, algorithms, and applications for machine learning techniques classified as supervised, unsupervised, semi-supervised, reinforcement, and deep learning.
2. Analyze and prepare large datasets and design your own machine learning system.
3. Take a deep dive into each machine learning algorithm and learn how to implement it in more than one way, given the problem context (explore alternative implementation platforms and learn how to decide which one to choose).
4. For each of the identified platforms, learn how to set up the environment, load large-scale data, explore the syntax, and understand the implementation nuances.
5. Understand how machine learning links with Hadoop, and understand Hadoop as a platform for the distributed and parallel processing paradigm.
6. For each machine learning technique, take a deep dive into the internals of the concept and implement it using one or more of the identified tools or libraries, which include Mahout, R, Python, SPSS, and Spark. For each library or framework:
a. Learn to set up the environment.
b. Develop machine learning programs for real-world examples.
c. Deploy and execute these programs on large datasets in Hadoop (wherever applicable) to identify precise patterns and predict outcomes.
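The distributed and parallel processing paradigm mentioned in item 5 can be illustrated with a small map/reduce-style word count. This is only a single-process sketch in plain Python, not Hadoop code and not taken from the book or this repository; the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit (word, 1) pairs for one input line, as a mapper would."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, mimicking the shuffle step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts for each key, as a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    # On a cluster, different blocks of lines would be mapped on different
    # nodes; here everything runs sequentially in one process (assumption
    # made to keep the sketch self-contained).
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    sample = ["spam ham spam", "ham eggs"]
    print(word_count(sample))
```

Because each call to `map_phase` depends only on its own line, and each key is reduced independently, both phases could in principle be spread across machines, which is the property that makes this style of computation suit large datasets.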
This book covers the important machine learning techniques, including:
1. Chapter 5: Decision Tree based learning methods - Decision trees using C4.5, C5.0 and Random Forests
2. Chapter 6: Association rule based learning methods - Apriori and FP-growth
3. Chapter 7: Instance based learning methods - K-Nearest Neighbors
4. Chapter 7: Kernel based learning methods - Support Vector Machines
5. Chapter 8: Clustering based learning methods - K-means clustering
6. Chapter 9: Bayesian learning methods - Naive Bayes
7. Chapter 10: Regression learning methods - Linear and Logistic regression
8. Chapter 11: Deep learning methods
9. Chapter 12: Reinforcement learning methods - Q-learning
10. Chapter 13: Ensemble methods - Boosting (AdaBoost, Gradient Boosting), Random Forests
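To give a flavour of the instance-based methods listed above, here is a minimal K-Nearest Neighbors classifier in plain Python. It is a sketch under stated assumptions, not the implementation shipped in this repository: the helper names `euclidean` and `knn_predict`, the choice of Euclidean distance, simple majority voting, and the toy data are all illustrative.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among the k closest points.

    `train` is a list of (features, label) pairs. The distance metric and
    the voting rule are assumptions of this sketch.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: two well-separated groups labelled "a" and "b".
    train = [
        ((1.0, 1.0), "a"),
        ((1.2, 0.8), "a"),
        ((0.9, 1.1), "a"),
        ((5.0, 5.0), "b"),
        ((5.2, 4.9), "b"),
        ((4.8, 5.1), "b"),
    ]
    print(knn_predict(train, (1.1, 1.0)))
    print(knn_predict(train, (5.1, 5.0)))
```

Sorting the whole training set for every query is fine at this scale; on large datasets a spatial index or a distributed implementation would normally replace it.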
For each of the learning methods, implementation source code is provided in the following programming languages and frameworks:
1. Apache Mahout
2. R
3. Spark - MLlib
4. Python (scikit-learn)
5. Julia (Java & Scala based)
The project is organized by programming language, then by chapter, and then by algorithm.
Practical-Machine-Learning-master (425 files)
AnIndex 34B
ua.base 1.71MB
u1.base 1.51MB
libsvm_wrapper.c 14KB
svm.cpp 61KB
R.css 1KB
snsdata.csv 2.51MB
mushrooms.csv 1.16MB
letterdata.csv 696KB
groceries.csv 489KB
sms.csv 466KB
sms_spam.csv 464KB
winequality-white.csv 258KB
wisc_bc_data.csv 123KB
titanic.csv 113KB
titanic.csv 113KB
credit.csv 91KB
credit.csv 91KB
winequality-red.csv 82KB
train.csv 59KB
insurance.csv 53KB
sampledata.csv 45KB
groceryrules.csv 43KB
training.csv 42KB
INTEGRATED-DATASET.csv 41KB
concrete.csv 39KB
test.csv 28KB
predict1.csv 4KB
predict2.csv 4KB
iris.csv 4KB
tsk.csv 58B
numeric.csv 42B
ad.data 9.8MB
u.data 1.89MB
spambase.data 682KB
iris.data 4KB
DESCRIPTION 751B
spambase.DOCUMENTATION 6KB
example-run 227B
.gitignore 50B
svm.h 3KB
00Index.html 1KB
INDEX 220B
FFTConvolutionTest.java 16KB
AutoencoderGradient3.java 10KB
FeatureExtractionTest.java 10KB
AutoencoderLinAlgebra.java 9KB
RandomForest.java 9KB
TwoLayersTest.java 8KB
KNearestNeighbor.java 8KB
AutoencoderLineSearch.java 7KB
FrequentPatternMiningJava.java 7KB
MahoutClusteringExample.java 6KB
AutoencoderFct.java 6KB
LoadSaveModelTest.java 5KB
FrequentPatternMetrics.java 5KB
OneLayerTest.java 5KB
InputDriver.java 5KB
PreProcessTest.java 4KB
ThreeLayerTest.java 4KB
Autoencoder.java 4KB
LinAlgebraIOUtilsTest.java 4KB
AutoencoderConfig.java 4KB
MaxPoolerTest.java 3KB
AutoencoderTest.java 3KB
ExtractPatchesTuplesTest.java 3KB
Hadoop.java 3KB
Recommenders.java 3KB
ExtractPatchesTest.java 3KB
RecommenderEvaluator.java 3KB
NaiveBayes.java 3KB
LogisticRegreesionBase.java 3KB
LogisticRegressionBase.java 3KB
RankTest.java 2KB
LogisticRegressionApp.java 2KB
ItemRecommender.java 2KB
SlopeOneBasedRecommender.java 1KB
DataPreprocessing.java 1KB
AutoencoderFctGrd.java 1KB
WeightedMatrixTest.java 1KB
Utilities.java 1KB
AutoencoderParams.java 979B
AutoencoderSigmoid.java 800B
LogisticRegressionTest.java 794B
RandomForestTest.java 788B
FPgrowthTest.java 774B
KMeansTest.java 769B
AutoencoderComputedParams.java 734B
NaiveBayesTest.java 687B
AutoencoderLearner.java 620B
DecisionTree.jl 13KB
measures.jl 8KB
split.jl 7KB
autoencoder.jl 7KB
tree.jl 6KB
fpgrowth.jl 6KB
knn.jl 5KB
neural-network.jl 5KB
util.jl 4KB
logisticregression.jl 4KB