Chapter 2
Data Analysis
2013/9/18
3
In God we trust, all others must bring data.
WILLIAM EDWARDS DEMING
Iris flower data set
◦ First introduced by R. A. Fisher
◦ 150 objects, each with sepal length/width, petal length/width, and species (3 species)
◦ Task: predict the species of a flower from its four measurements
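To make the layout of one object concrete, here is a minimal sketch in Python. The `IrisRecord` type, the `SAMPLE` list, and the `split_features_labels` helper are hypothetical names introduced for illustration. The six rows are a small sample in the style of the data set, not the full 150 objects.

```python
from collections import Counter
from typing import NamedTuple


class IrisRecord(NamedTuple):
    """One object: four numeric measurements plus the species label."""
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float
    species: str


# Illustrative sample only (two rows per species), not the full data set.
SAMPLE = [
    IrisRecord(5.1, 3.5, 1.4, 0.2, "Iris-setosa"),
    IrisRecord(4.9, 3.0, 1.4, 0.2, "Iris-setosa"),
    IrisRecord(7.0, 3.2, 4.7, 1.4, "Iris-versicolor"),
    IrisRecord(6.4, 3.2, 4.5, 1.5, "Iris-versicolor"),
    IrisRecord(6.3, 3.3, 6.0, 2.5, "Iris-virginica"),
    IrisRecord(5.8, 2.7, 5.1, 1.9, "Iris-virginica"),
]


def split_features_labels(records):
    """Separate the four measurements (inputs) from the species (target)."""
    features = [list(r[:4]) for r in records]
    labels = [r.species for r in records]
    return features, labels


X, y = split_features_labels(SAMPLE)
species_counts = Counter(y)  # number of objects per species in the sample
```

Splitting the record this way mirrors the prediction task: the four measurements are the inputs and the species is the value to predict.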
Iris-setosa, Iris-versicolor, Iris-virginica
Chapter 2. Data Analysis
1. Data Exploration
2. Dimensionality Reduction
Data exploration
◦ Preliminary investigation of the data to better understand its specific characteristics
◦ Can aid in selecting appropriate preprocessing and data analysis techniques
◦ Can be used to understand, interpret, and justify data analysis results
◦ Exploratory data analysis (EDA) in statistics: proposed by J. W. Tukey
◦ Includes: summary statistics, visualization, online analytical processing (OLAP), …
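As a small illustration of summary statistics, the sketch below computes a few per-species summaries of petal length using only the Python standard library. The `petal_lengths` values are an illustrative sample (four per species), and the `summarize` helper is a hypothetical name, not part of any library.

```python
import statistics

# Illustrative petal lengths, four values per species (not the full data set).
petal_lengths = {
    "Iris-setosa": [1.4, 1.4, 1.3, 1.5],
    "Iris-versicolor": [4.7, 4.5, 4.9, 4.0],
    "Iris-virginica": [6.0, 5.1, 5.9, 5.6],
}


def summarize(values):
    """Return basic summary statistics for one list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


summary = {species: summarize(vals) for species, vals in petal_lengths.items()}

for species, stats in summary.items():
    print(f"{species}: mean={stats['mean']:.2f}, "
          f"median={stats['median']:.2f}, stdev={stats['stdev']:.2f}")
```

Even on this tiny sample the per-species means are clearly separated, which is the kind of observation exploration is meant to surface before choosing an analysis technique.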