Pattern Recognition in Python (Fisher Linear Discriminant) [100012070] - Fisher binary classifier Python code resource - CSDN Library

Uploaded 2023-04-28 14:01:45 · 1.02 MB ZIP


100012070-基于Python进行模式识别（Fisher线性判别）.zip (14 files)

fisher

sonar

sonar.rocks 42KB

sonar.all-data 86KB

sonar.mines 48KB

LDA_Fisher.py 5KB

Index (1) 178B

LICENSE 1KB

17170120015-任俊杰-模式识别大作业（1）.pdf 667KB

iris

bezdekIris.data 4KB

Index 105B

iris.data 4KB

LDA_Fisher.py 5KB

iris.names 3KB

README.md 19KB

Fisher线性判别.docx 531KB

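# Fisher Linear Discriminant

## 1. Introduction to the Fisher linear discriminant algorithm

### 1.1 Introduction

The two-class Fisher discrimination problem can be viewed as projecting all samples onto a single direction and then choosing a classification threshold in the resulting one-dimensional space. The hyperplane that passes through this threshold point and is perpendicular to the projection direction is the decision surface between the two classes. The question is how to find, for the data at hand, the projection line that makes classification easiest; this is the problem the Fisher linear discriminant solves.

The idea behind the Fisher linear discriminant is to choose the projection direction so that, after projection, the two classes lie as far apart as possible while the samples within each class stay as tightly clustered as possible. The rest of this section covers only the two-class case.

### 1.2 Basic quantities in the Fisher criterion

(1) Samples

① The training set (each sample is a d-dimensional vector) is:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/d8b99c830711d5d465969586331ed99c.writebug)

② The samples of class ![](https://www.writebug.com/myres/static/uploads/2021/12/16/e703522819b09d05f4774aea5d0b37fc.writebug) are:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/82a4788a53ee0513b34de5ddc80cc4c2.writebug)

③ The samples of class ![](https://www.writebug.com/myres/static/uploads/2021/12/16/669023183d743ef5ba93620653c75133.writebug) are:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/8f199b4fc555de2956c7455f6ec4e48c.writebug)

Goal: find a projection w (also a d-dimensional column vector) such that the projected samples become:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/b4cdd2b906f03946ede46cf555d11208.writebug)

(2) In the original sample space

**①** The class mean vectors are:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/b9e5770e62ff8114afe6c86c69b65c79.writebug)

**②** The within-class scatter matrix of each class is:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/9dd4042871b0c528a9ced2df9d93750c.writebug)

**③** The pooled within-class scatter matrix is:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/855adb3a64f350f1576beae6b4845581.writebug)

④ The between-class scatter matrix is defined as:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/322b8674c3b9a364f0ec6e5d233bf666.writebug)

(3) In the one-dimensional space after projection (where each scatter quantity reduces to a scalar)

① The two class means are:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/6f4fa8ac3c220e86737fb6840fc80bef.writebug)

② The within-class scatter of each class is:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/3838783a771734a3f4945cd2ead540a1.writebug)

③ The pooled within-class scatter is:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/46b1f796c1d116215c0c6d58706cafda.writebug)

④ The between-class scatter is:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/5425c758e58594cb49a4992c8f45197b.writebug)

### 1.3 Criterion and classification

In two-class discrimination we want a projection direction that separates the two classes as much as possible after projection while keeping each class as compact as possible. This goal can be expressed as the following criterion:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/66f368b89caa74fcb68bc68438d3db97.writebug)

This is the Fisher criterion function (Fisher's Criterion), which can be rewritten as:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/33fd42fbb0d1f922cb3e2bdc5915be9f.writebug)

Solving it gives the optimal projection direction under the Fisher criterion:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/ac5c67a3343cad355ecabf1bd56e6fc1.writebug)

If prior probabilities are ignored, the threshold ![](https://www.writebug.com/myres/static/uploads/2021/12/16/1927f220e9629ee702fec076dea805f2.writebug) can be chosen by the following rule:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/d9326b039184d051c15b064876f71e9e.writebug)

The general decision rule for two-class linear discrimination is:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/d890d04d6e3cec9780e7df124e865e98.writebug)

If prior probabilities are taken into account, the decision rule can be written as:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/8d714af6f1f3e33ac09725c74636b976.writebug)

![](https://www.writebug.com/myres/static/uploads/2021/12/16/f24cbe51d9497215a706f05d9a3ab1b0.writebug)

## 2. Experimental datasets

### 2.1 The Iris dataset

![](https://www.writebug.com/myres/static/uploads/2021/12/16/71c198b276820da928e90d04062a3e4c.writebug)

### 2.2 The Sonar dataset

![](https://www.writebug.com/myres/static/uploads/2021/12/16/d697ad24bf35f05370b181b688037746.writebug)

## 3. Experimental setup

Of the two datasets, sonar is a two-class problem and is classified directly with the decision formula. Iris is a three-class problem: a discriminant function is computed for every pair of classes, and the three-way classification follows the second multi-class case (each class is separated from each other class by its own decision point). The two datasets are processed in much the same way, so the experimental setup is essentially identical:

1. Read the data. Load the samples from the dataset files (iris.data, sonar.all-data), split them at random into a training set and a test set in a fixed proportion (by default 2/5 of the data goes to the test set), and then group the training samples by label (three classes for Iris, two for sonar).

2. Compute the basic quantities for each class. Using formulas ①②③ in section 1.2 (2), compute each class's mean vector and within-class scatter matrix, and the pooled within-class scatter matrix for each pair of classes (Iris requires ![](https://www.writebug.com/myres/static/uploads/2021/12/16/fee591c025d7f8c309152d57d0cb0b24.writebug); sonar requires ![](https://www.writebug.com/myres/static/uploads/2021/12/16/aeccf3d105bb966ef4135749fa2cf5ae.writebug)).

3. Solve for the weight vector and threshold. Using the formulas in section 1.3, compute the weight vector and threshold between classes (Iris has three classes, so the parameters are solved for each pair of classes, three times in total; sonar needs only one solution). Classification then follows the rule for the second multi-class case.

4. Plot the data to check the training result. Use the discriminant function to reduce each class to one dimension, project the samples onto an axis along the weight vector, and plot the points in different colours (for Iris, three plots, one per projection direction). The distribution of the coloured points indicates how well training worked.

5. Evaluate on the test set, compute the accuracy, and plot the classification result.

6. Repeat steps 1-5 twenty times, compute the accuracy, and plot it.

## 4. Results and analysis

### 4.1 Classification results on the Iris dataset

![](https://www.writebug.com/myres/static/uploads/2021/12/16/864a4101a2d94bbf6e144185c203095d.writebug)

1. First, the three classes are paired off, and the weight vector and threshold are used to reduce the 4-dimensional data to 1 dimension and plot it on an axis. The results are shown below:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/6c6acab4a19dff63b64b0d7be5687c7f.writebug)

![](https://www.writebug.com/myres/static/uploads/2021/12/16/f6f4adc842407f1175f2d827e115d4aa.writebug)

These plots show that, after training, the projections of each pair of classes cluster on opposite sides of the axis, so a clear decision point separating the two classes can be found.

2. The training samples are evaluated with G12, G13, and G23 (the pairwise discriminant functions between the three classes) and classified with the decision rule from section 1.3. The reduced data are projected onto the three lines Y=3, Y=2, and Y=1, with cyan (3), purple (2), and blue (1) marking the three classes. The decision point is X=0 in every case. The result is shown below:

![](https://www.writebug.com/myres/static/uploads/2021/12/16/c958acd133155dbd2d013fd1cf3d8819.writebug)

The figure shows that, under the corresponding discriminant function g, the two classes of test samples are well separated on either side of the decision point X=0, and the third class can even be distinguished as well.

3. The test samples are classified with the rule for the second multi-class case (if G12>0 and G13>0, the sample belongs to class 1; if G12<0 and G23>0, to class 2; if G23<0 and G13<0, to class 3). The predictions are compared with the known labels, the numbers of correct and incorrect classifications are counted, and the accuracy is computed. The test is repeated twenty times and the average accuracy is calculated. The results are as follows ![](https://www.writebug.com/myres/static/uploads/2021/12/16/7e4682a1ae93390aea073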
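
The procedure described in sections 1.3, 3, and 4.1 can be sketched roughly as follows. This is a minimal illustration and not the archive's `LDA_Fisher.py`: the function names `fisher_fit` and `classify_three` are hypothetical, the threshold is assumed to be the midpoint of the two projected class means (one common choice when priors are ignored; the source's exact formula is only available as an image), and the data are synthetic 2-D clusters standing in for Iris.

```python
import numpy as np


def fisher_fit(Xa, Xb):
    """Fit g(x) = w @ x + w0 so that g > 0 favours class a and g < 0 class b."""
    ma, mb = Xa.mean(axis=0), Xb.mean(axis=0)   # class mean vectors
    Sa = (Xa - ma).T @ (Xa - ma)                # within-class scatter, class a
    Sb = (Xb - mb).T @ (Xb - mb)                # within-class scatter, class b
    Sw = Sa + Sb                                # pooled within-class scatter
    # Projection direction w = Sw^{-1} (ma - mb); the pseudo-inverse is an
    # assumption made here to tolerate a singular Sw.
    w = np.linalg.pinv(Sw) @ (ma - mb)
    # Assumed threshold: midpoint of the projected class means (no priors),
    # folded into the offset so the decision point sits at g(x) = 0.
    w0 = -0.5 * (w @ ma + w @ mb)
    return w, w0


def classify_three(x, g12, g13, g23):
    """Pairwise rule from section 4.1; returns 1, 2, 3, or 0 if undecided."""
    v12 = g12[0] @ x + g12[1]
    v13 = g13[0] @ x + g13[1]
    v23 = g23[0] @ x + g23[1]
    if v12 > 0 and v13 > 0:
        return 1
    if v12 < 0 and v23 > 0:
        return 2
    if v13 < 0 and v23 < 0:
        return 3
    return 0  # the three pairwise decisions disagree


# Synthetic stand-in for a three-class dataset (not the real Iris data).
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X1, X2, X3 = (c + 0.5 * rng.standard_normal((30, 2)) for c in centres)

# One discriminant per pair of classes, three in total.
g12, g13, g23 = fisher_fit(X1, X2), fisher_fit(X1, X3), fisher_fit(X2, X3)
predictions = [classify_three(c, g12, g13, g23) for c in centres]
print(predictions)
```

For a two-class problem such as sonar, a single call to `fisher_fit` is enough, and the sign of `w @ x + w0` gives the decision. Returning 0 for the undecided region is a design choice of this sketch; the source does not say how such samples are handled.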
