Pattern Classification

5 stars (rated above 95% of resources)
Points/C-coins required: 9 · Uploaded 2015-11-26 15:04:31 · 11.28 MB · PDF

CONTENTS

PREFACE

1 INTRODUCTION
  1.1 Machine Perception
  1.2 An Example
    1.2.1 Related Fields
  1.3 Pattern Recognition Systems
    1.3.1 Sensing
    1.3.2 Segmentation and Grouping
    1.3.3 Feature Extraction
    1.3.4 Classification
    1.3.5 Post Processing
  1.4 The Design Cycle
    1.4.1 Data Collection
    1.4.2 Feature Choice
    1.4.3 Model Choice
    1.4.4 Training
    1.4.5 Evaluation
    1.4.6 Computational Complexity
  1.5 Learning and Adaptation
    1.5.1 Supervised Learning
    1.5.2 Unsupervised Learning
    1.5.3 Reinforcement Learning
  1.6 Conclusion
  Summary by Chapters
  Bibliographical and Historical Remarks
  Bibliography

2 BAYESIAN DECISION THEORY
  2.1 Introduction
  2.2 Bayesian Decision Theory - Continuous Features
    2.2.1 Two-Category Classification
  2.3 Minimum-Error-Rate Classification
    2.3.1 Minimax Criterion
    2.3.2 Neyman-Pearson Criterion
  2.4 Classifiers, Discriminant Functions, and Decision Surfaces
    2.4.1 The Multicategory Case
    2.4.2 The Two-Category Case
  2.5 The Normal Density
    2.5.1 Univariate Density
    2.5.2 Multivariate Density
  2.6 Discriminant Functions for the Normal Density
    2.6.1 Case 1: Σi = σ²I
    2.6.2 Case 2: Σi = Σ
    2.6.3 Case 3: Σi = arbitrary
    Example 1 Decision Regions for Two-Dimensional Gaussian Data
  2.7 Error Probabilities and Integrals
  2.8 Error Bounds for Normal Densities
    2.8.1 Chernoff Bound
    2.8.2 Bhattacharyya Bound
    Example 2 Error Bounds for Gaussian Distributions
    2.8.3 Signal Detection Theory and Operating Characteristics
  2.9 Bayes Decision Theory - Discrete Features
    2.9.1 Independent Binary Features
    Example 3 Bayesian Decisions for Three-Dimensional Binary Data
  2.10 Missing and Noisy Features
    2.10.1 Missing Features
    2.10.2 Noisy Features
  2.11 Bayesian Belief Networks
    Example 4 Belief Network for Fish
  2.12 Compound Bayesian Decision Theory and Context
  Summary
  Bibliographical and Historical Remarks
  Problems
  Computer exercises
  Bibliography

3 MAXIMUM-LIKELIHOOD AND BAYESIAN PARAMETER ESTIMATION
  3.1 Introduction
  3.2 Maximum-Likelihood Estimation
    3.2.1 The General Principle
    3.2.2 The Gaussian Case: Unknown μ
    3.2.3 The Gaussian Case: Unknown μ and Σ
    3.2.4 Bias
  3.3 Bayesian Estimation
    3.3.1 The Class-Conditional Densities
    3.3.2 The Parameter Distribution
  3.4 Bayesian Parameter Estimation: Gaussian Case
    3.4.1 The Univariate Case: p(μ | D)
    3.4.2 The Univariate Case: p(x | D)
    3.4.3 The Multivariate Case
  3.5 Bayesian Parameter Estimation: General Theory
    Example 1 Recursive Bayes Learning
    3.5.1 When Do Maximum-Likelihood and Bayes Methods Differ?
    3.5.2 Noninformative Priors and Invariance
    3.5.3 Gibbs Algorithm
  3.6 Sufficient Statistics
    3.6.1 Sufficient Statistics and the Exponential Family
  3.7 Problems of Dimensionality
    3.7.1 Accuracy, Dimension, and Training Sample Size
    3.7.2 Computational Complexity
    3.7.3 Overfitting
  3.8 Component Analysis and Discriminants
    3.8.1 Principal Component Analysis (PCA)
    3.8.2 Fisher Linear Discriminant
    3.8.3 Multiple Discriminant Analysis
  3.9 Expectation-Maximization (EM)
    Example 2 Expectation-Maximization for a 2D Normal Model
  3.10 Hidden Markov Models
    3.10.1 First-Order Markov Models
    3.10.2 First-Order Hidden Markov Models
    3.10.3 Hidden Markov Model Computation
    3.10.4 Evaluation
    Example 3 Hidden Markov Model
    3.10.5 Decoding
    Example 4 HMM Decoding
    3.10.6 Learning
  Summary
  Bibliographical and Historical Remarks
  Problems
  Computer exercises
  Bibliography

4 NONPARAMETRIC TECHNIQUES
  4.1 Introduction
  4.2 Density Estimation
  4.3 Parzen Windows
    4.3.1 Convergence of the Mean
    4.3.2 Convergence of the Variance
    4.3.3 Illustrations
    4.3.4 Classification Example
    4.3.5 Probabilistic Neural Networks (PNNs)
    4.3.6 Choosing the Window Function
  4.4 kn-Nearest-Neighbor Estimation
    4.4.1 kn-Nearest-Neighbor and Parzen-Window Estimation
    4.4.2 Estimation of A Posteriori Probabilities
  4.5 The Nearest-Neighbor Rule
    4.5.1 Convergence of the Nearest Neighbor
    4.5.2 Error Rate for the Nearest-Neighbor Rule
    4.5.3 Error Bounds
    4.5.4 The k-Nearest-Neighbor Rule
    4.5.5 Computational Complexity of the k-Nearest-Neighbor Rule
  4.6 Metrics and Nearest-Neighbor Classification
    4.6.1 Properties of Metrics
    4.6.2 Tangent Distance
  4.7 Fuzzy Classification
  4.8 Reduced Coulomb Energy Networks
  4.9 Approximations by Series Expansions
  Summary
  Bibliographical and Historical Remarks
  Problems
  Computer exercises
  Bibliography

5 LINEAR DISCRIMINANT FUNCTIONS
  5.1 Introduction
  5.2 Linear Discriminant Functions and Decision Surfaces
    5.2.1 The Two-Category Case
    5.2.2 The Multicategory Case
  5.3 Generalized Linear Discriminant Functions
  5.4 The Two-Category Linearly Separable Case
    5.4.1 Geometry and Terminology
    5.4.2 Gradient Descent Procedures
  5.5 Minimizing the Perceptron Criterion Function
    5.5.1 The Perceptron Criterion Function
    5.5.2 Convergence Proof for Single-Sample Correction
    5.5.3 Some Direct Generalizations
  5.6 Relaxation Procedures
    5.6.1 The Descent Algorithm
    5.6.2 Convergence Proof
  5.7 Nonseparable Behavior
  5.8 Minimum Squared-Error Procedures
    5.8.1 Minimum Squared-Error and the Pseudoinverse
    Example 1 Constructing a Linear Classifier by Matrix Pseudoinverse
    5.8.2 Relation to Fisher's Linear Discriminant
    5.8.3 Asymptotic Approximation to an Optimal Discriminant
    5.8.4 The Widrow-Hoff or LMS Procedure
    5.8.5 Stochastic Approximation Methods
  5.9 The Ho-Kashyap Procedures
    5.9.1 The Descent Procedure
    5.9.2 Convergence Proof
    5.9.3 Nonseparable Behavior
    5.9.4 Some Related Procedures
  5.10 Linear Programming Algorithms
    5.10.1 Linear Programming
    5.10.2 The Linearly Separable Case
    5.10.3 Minimizing the Perceptron Criterion Function
  5.11 Support Vector Machines
    5.11.1 SVM Training
    Example 2 SVM for the XOR Problem
  5.12 Multicategory Generalizations
    5.12.1 Kesler's Construction
    5.12.2 Convergence of the Fixed-Increment Rule
    5.12.3 Generalizations for MSE Procedures
  Summary
  Bibliographical and Historical Remarks
  Problems
  Computer exercises
  Bibliography

6 MULTILAYER NEURAL NETWORKS
  6.1 Introduction
  6.2 Feedforward Operation and Classification
    6.2.1 General Feedforward Operation
    6.2.2 Expressive Power of Multilayer Networks
  6.3 Backpropagation Algorithm
    6.3.1 Network Learning
    6.3.2 Training Protocols
    6.3.3 Learning Curves
  6.4 Error Surfaces
    6.4.1 Some Small Networks
    6.4.2 The Exclusive-OR (XOR)
    6.4.3 Larger Networks
    6.4.4 How Important Are Multiple Minima?
  6.5 Backpropagation as Feature Mapping
    6.5.1 Representations at the Hidden Layer - Weights
  6.6 Backpropagation, Bayes Theory and Probability
    6.6.1 Bayes Discriminants and Neural Networks
    6.6.2 Outputs as Probabilities
  6.7 Related Statistical Techniques
  6.8 Practical Techniques for Improving Backpropagation
    6.8.1 Activation Function
    6.8.2 Parameters for the Sigmoid
    6.8.3 Scaling Input
    6.8.4 Target Values
    6.8.5 Training with Noise
    6.8.6 Manufacturing Data
    6.8.7 Number of Hidden Units
    6.8.8 Initializing Weights
    6.8.9 Learning Rates
    6.8.10 Momentum
    6.8.11 Weight Decay
    6.8.12 Hints
    6.8.13 On-Line, Stochastic or Batch Training?
    6.8.14 Stopped Training
    6.8.15 Number of Hidden Layers
    6.8.16 Criterion Function
  6.9 Second-Order Methods
    6.9.1 Hessian Matrix
    6.9.2 Newton's Method
    6.9.3 Quickprop
    6.9.4 Conjugate Gradient Descent
    Example 1 Conjugate Gradient Descent
  6.10 Additional Networks and Training Methods
    6.10.1 Radial Basis Function Networks (RBFs)
    6.10.2 Special Bases
    6.10.3 Matched Filters
    6.10.4 Convolutional Networks
    6.10.5 Recurrent Networks
    6.10.6 Cascade-Correlation
  6.11 Regularization, Complexity Adjustment and Pruning
  Summary
  Bibliographical and Historical Remarks
  Problems
  Computer exercises
  Bibliography
7 STOCHASTIC METHODS
  7.1 Introduction
  7.2 Stochastic Search
    7.2.1 Simulated Annealing
    7.2.2 The Boltzmann Factor
    7.2.3 Deterministic Simulated Annealing
  7.3 Boltzmann Learning
    7.3.1 Stochastic Boltzmann Learning of Visible States
    7.3.2 Missing Features and Category Constraints
    7.3.3 Deterministic Boltzmann Learning
    7.3.4 Initialization and Setting Parameters
  7.4 Boltzmann Networks and Graphical Models
    7.4.1 Other Graphical Models
  7.5 Evolutionary Methods
    7.5.1 Genetic Algorithms
    7.5.2 Further Heuristics
    7.5.3 Why Do They Work?
  7.6 Genetic Programming
  Summary
  Bibliographical and Historical Remarks
  Problems
  Computer exercises
  Bibliography

8 NONMETRIC METHODS
  8.1 Introduction
  8.2 Decision Trees
  8.3 CART
    8.3.1 Number of Splits
    8.3.2 Query Selection and Node Impurity
    8.3.3 When to Stop Splitting
    8.3.4 Pruning
    8.3.5 Assignment of Leaf Node Labels
    Example 1 A Simple Tree
    8.3.6 Computational Complexity
    8.3.7 Feature Choice
    8.3.8 Multivariate Decision Trees
    8.3.9 Priors and Costs
    8.3.10 Missing Attributes
    Example 2 Surrogate Splits and Missing Attributes
  8.4 Other Tree Methods
    8.4.1 ID3
    8.4.2 C4.5
    8.4.3 Which Tree Classifier Is Best?
  8.5 Recognition with Strings
    8.5.1 String Matching
    8.5.2 Edit Distance
    8.5.3 Computational Complexity
    8.5.4 String Matching with Errors
    8.5.5 String Matching with the "Don't-Care" Symbol
  8.6 Grammatical Methods
    8.6.1 Grammars
    8.6.2 Types of String Grammars
    Example 3 A Grammar for Pronouncing Numbers
    8.6.3 Recognition Using Grammars
  8.7 Grammatical Inference
    Example 4 Grammatical Inference
  8.8 Rule-Based Methods
    8.8.1 Learning Rules
  Summary
  Bibliographical and Historical Remarks
  Problems
  Computer exercises
  Bibliography
9 ALGORITHM-INDEPENDENT MACHINE LEARNING
  9.1 Introduction
  9.2 Lack of Inherent Superiority of Any Classifier
    9.2.1 No Free Lunch Theorem
    Example 1 No Free Lunch for Binary Data
    9.2.2 Ugly Duckling Theorem
    9.2.3 Minimum Description Length (MDL)
    9.2.4 Minimum Description Length Principle
    9.2.5 Overfitting Avoidance and Occam's Razor
  9.3 Bias and Variance
    9.3.1 Bias and Variance for Regression
    9.3.2 Bias and Variance for Classification
  9.4 Resampling for Estimating Statistics
    9.4.1 Jackknife
    Example 2 Jackknife Estimate of Bias and Variance of the Mode
    9.4.2 Bootstrap
  9.5 Resampling for Classifier Design
    9.5.1 Bagging
    9.5.2 Boosting
    9.5.3 Learning with Queries
    9.5.4 Arcing, Learning with Queries, Bias and Variance
  9.6 Estimating and Comparing Classifiers
    9.6.1 Parametric Models
    9.6.2 Cross-Validation
    9.6.3 Jackknife and Bootstrap Estimation of Classification Accuracy
    9.6.4 Maximum-Likelihood Model Comparison
    9.6.5 Bayesian Model Comparison
    9.6.6 The Problem-Average Error Rate
    9.6.7 Predicting Final Performance from Learning Curves
    9.6.8 The Capacity of a Separating Plane
  9.7 Combining Classifiers
    9.7.1 Component Classifiers with Discriminant Functions
    9.7.2 Component Classifiers without Discriminant Functions
  Summary
  Bibliographical and Historical Remarks
  Problems
  Computer exercises
  Bibliography

10 UNSUPERVISED LEARNING AND CLUSTERING
  10.1 Introduction
  10.2 Mixture Densities and Identifiability
  10.3 Maximum-Likelihood Estimates
  10.4 Application to Normal Mixtures
    10.4.1 Case 1: Unknown Mean Vectors
    10.4.2 Case 2: All Parameters Unknown
    10.4.3 k-Means Clustering
    10.4.4 Fuzzy k-Means Clustering
  10.5 Unsupervised Bayesian Learning
    10.5.1 The Bayes Classifier
    10.5.2 Learning the Parameter Vector
    Example 1 Unsupervised Learning of Gaussian Data
    10.5.3 Decision-Directed Approximation
  10.6 Data Description and Clustering
    10.6.1 Similarity Measures
  10.7 Criterion Functions for Clustering
    10.7.1 The Sum-of-Squared-Error Criterion
    10.7.2 Related Minimum Variance Criteria
    10.7.3 Scatter Criteria
    Example 2 Clustering Criteria
  10.8 Iterative Optimization
  10.9 Hierarchical Clustering
    10.9.1 Definitions
    10.9.2 Agglomerative Hierarchical Clustering
    10.9.3 Stepwise-Optimal Hierarchical Clustering
    10.9.4 Hierarchical Clustering and Induced Metrics
  10.10 The Problem of Validity
  ...
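The table of contents above (truncated in the original post) names several classic algorithms that are compact enough to sketch in a few lines. For example, the k-means clustering procedure of Section 10.4.3 alternates between assigning each sample to its nearest cluster center and recomputing every center as the mean of its assigned samples, which locally reduces the sum-of-squared-error criterion of Section 10.7.1. The NumPy sketch below is a generic illustration of that idea, not code from the book; the function name, random initialization, and toy data are illustrative choices.

```python
import numpy as np

def k_means(X, k, n_iter=100, seed=0):
    """Plain k-means: alternate nearest-center assignment and mean update."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct samples chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label each sample with its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of the samples assigned to it.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # converged: the centers no longer move
        centers = new_centers
    return centers, labels

# Toy usage: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
centers, labels = k_means(X, k=2)
print(centers)
```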

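Similarly, the nearest-neighbor rule of Section 4.5 labels a test point with the category of its closest training sample, and the k-nearest-neighbor rule of Section 4.5.4 takes a majority vote among the k closest. A minimal brute-force sketch, again an illustrative reconstruction rather than the book's code, with made-up toy data:

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x, k=1):
    """k-nearest-neighbor rule: majority vote among the k closest training samples."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distance to every training point
    nearest = np.argsort(dists)[:k]               # indices of the k closest samples
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage with two labeled clusters.
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.8]])
y_train = np.array([0, 0, 1, 1])
print(knn_classify(X_train, y_train, np.array([0.1, 0.2]), k=3))  # expected: 0
```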
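Chapter 5's minimum squared-error procedure (Section 5.8.1 and its Example 1, "Constructing a Linear Classifier by Matrix Pseudoinverse") obtains a linear discriminant by solving Ya = b in the least-squares sense with the pseudoinverse, a = Y⁺b, after negating the augmented samples of the second category. The sketch below follows that general recipe on made-up two-dimensional data; the variable names and the all-ones margin vector are standard illustrative choices, not details quoted from the book.

```python
import numpy as np

def mse_linear_classifier(X, y):
    """Two-category minimum squared-error discriminant via the pseudoinverse.

    Rows of X are samples; y holds labels +1 / -1. We solve Y a = b in the
    least-squares sense with b = vector of ones, after negating the augmented
    rows that belong to the second category.
    """
    Y = np.hstack([np.ones((len(X), 1)), X])   # augment each sample with a bias component
    Y = Y * y[:, None]                         # "normalize": negate the rows of class -1
    b = np.ones(len(X))                        # margin vector of all ones
    a = np.linalg.pinv(Y) @ b                  # a = Y^+ b, the MSE solution
    return a                                   # decision rule: sign(a . [1, x])

# Toy usage: two linearly separable point clouds.
X = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 1.0],
              [-1.0, -1.0], [-2.0, 0.5], [-3.0, -2.0]])
y = np.array([1, 1, 1, -1, -1, -1])
a = mse_linear_classifier(X, y)
x_test = np.array([2.0, 1.0])
print(np.sign(a @ np.hstack([1.0, x_test])))   # expected: 1.0
```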
gEmfP2b: Pattern Classification is decent and worth a look, but the download is too slow; the connection speed isn't great, same as just now.
2015-12-12

qq_33061035: Very good for beginners.
2015-12-02