When teaching a technical topic, it is important to show the application of the concepts
discussed to real-life problems. For this reason, we present machine learning within the
context of predictive data analytics, an important and growing industry application
of machine learning. The link between machine learning and data analytics runs through
every chapter in the book. In Chapter 1 we introduce machine learning and explain the
role it has within a standard data analytics project lifecycle. In Chapter 2 we provide a
framework for designing and constructing a predictive analytics solution, based on
machine learning, that meets a business need. All machine-learning algorithms assume a
dataset is available for training, and in Chapter 3 we explain how to design, construct, and
quality check a dataset before using it to build a prediction model.
Chapters 4, 5, 6, and 7 are the main machine learning chapters in the book. Each of
these chapters presents a different approach to machine learning: Chapter 4, learning
through information gathering; Chapter 5, learning through analogy; Chapter 6, learning
by predicting probable outcomes; and Chapter 7, learning by searching for solutions that
minimize error. All of these chapters follow the same two-part structure:
Part 1 presents an informal introduction to the material in the chapter, followed by a
detailed explanation of the fundamental technical concepts required to understand the
material, and then a standard machine learning algorithm used in that learning
approach, along with a detailed worked example.
Part 2 of each chapter explains different ways that the standard algorithm can be
extended and presents well-known variations on the algorithm.
The motivation for structuring these technical chapters in two parts is that it provides a
natural break in the chapter material. As a result, a topic can be included in a course by
just covering Part 1 of a chapter (‘Big Idea’, fundamentals, standard algorithm and worked
example); and then—time permitting—the coverage of the topic can be extended to some
or all of the material in Part 2. Chapter 8 explains how to evaluate the performance of
prediction models, and presents a range of different evaluation metrics. This chapter also
adopts the two-part structure of a standard approach followed by extensions and variations.
Throughout these technical chapters the link to the broader predictive analytics context is
maintained through detailed and complete real-world examples, along with references to
the datasets and/or papers that the examples are based on.
The link between the broader business context and machine learning is most clearly
seen in the case studies presented in Chapters 9 (predicting customer churn) and 10
(galaxy classification). In particular, these case studies highlight how a range of issues and
tasks beyond model building—such as business understanding, problem definition, data
gathering and preparation, and communication of insight—are crucial to the success of a
predictive analytics project. Finally, Chapter 11 discusses a range of fundamental topics in
machine learning and also highlights that the selection of an appropriate machine learning
approach for a given task involves factors beyond model accuracy—we must also match
the characteristics of the model to the needs of the business.