Apache Mahout Cookbook
Table of Contents
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Mahout is Not So Difficult!
Introduction
Installing Java and Hadoop
Getting ready
How to do it...
Setting up a Maven and NetBeans development environment
Getting ready
How to do it...
How it works...
There's more...
Coding a basic recommender
Getting ready
How to do it...
How it works...
See also
2. Using Sequence Files – When and Why?
Introduction
Creating sequence files from the command line
Getting ready
How to do it...
How it works...
Generating sequence files from code
Getting ready
How to do it...
How it works...
Reading sequence files from code
Getting ready
How to do it...
How it works...
3. Integrating Mahout with an External Datasource
Introduction
Importing an external datasource into HDFS
Getting ready
How to do it...
How it works...
There's more...
Exporting data from HDFS to RDBMS
How to do it...
How it works...
Creating a Sqoop job to deal with RDBMS
How to do it...
How it works...
There's more...
Importing data using Sqoop API
Getting ready
How to do it...
How it works...
4. Implementing the Naïve Bayes classifier in Mahout
Introduction
Using the Mahout text classifier to demonstrate the basic use case
Getting ready
How to do it...
How it works...
There's more...
Using the Naïve Bayes classifier from code
Getting ready
How to do it...
How it works...
There's more...
Using Complementary Naïve Bayes from the command line
Getting ready
How to do it...
How it works...
See also
Coding the Complementary Naïve Bayes classifier
Getting ready
How to do it...
How it works...
5. Stock Market Forecasting with Mahout
Introduction
Preparing data for logistic regression
Getting ready
How to do it...
How it works...
Predicting GOOG movements using logistic regression
Getting ready
How to do it...
How it works...
The confusion matrix
Using adaptive logistic regression in Java code
Getting ready
How to do it...
How it works...
Using logistic regression on large-scale datasets
Getting ready
How to do it...
How it works...
See also
Using Random Forest to forecast market movements
Getting ready
How to do it...
How it works...
See also
6. Canopy Clustering in Mahout
Introduction
Command-line-based Canopy clustering
Getting ready
How to do it...
How it works...
Command-line-based Canopy clustering with parameters
Getting ready
How to do it...
How it works...
Using Canopy clustering from Java code
Getting ready
How to do it...
How it works...
Coding your own cluster distance evaluation
Getting ready
How to do it...
How it works...