Handwritten Digit Recognition System Implemented in Python [100011202]

5 stars · 199 views · uploaded 2023-03-10 09:07:31 · 12.34MB ZIP

100011202-基于Python实现的手写数字识别系统.zip (10 files)

shouxiemaster
    实验报告.pptx (lab report slides) 1.27MB
    index.py 6KB
    LICENSE 1KB
    实验报告.docx (lab report) 96KB
    MNIST_data
        t10k-images-idx3-ubyte.gz 1.57MB
        train-labels-idx1-ubyte.gz 28KB
        train-images-idx3-ubyte.gz 9.45MB
        t10k-labels-idx1-ubyte.gz 4KB
    README.md 11KB
    OUTPUT.txt 735B

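# Experiment 1: Handwritten Digit Recognition

## 1 Objectives

- Understand the basic principles of convolutional neural networks;
- Master the basic usage of PyTorch (or another framework) and the basic operations for building a convolutional network;
- Learn how to use PyTorch (or another framework) on a GPU.

## 2 Requirements

- Set up a PyTorch (or other framework) environment;
- Build a well-structured convolutional neural network;
- Train and evaluate it on the MNIST handwritten digit dataset, reaching a test-set accuracy of 98% or higher;
- Submit the lab report, code, and PPT on the course website by the deadline.

## 3 Principles (Using PyTorch as an Example)

### 3.1 Basic PyTorch usage

To use PyTorch you need to know the following. PyTorch represents computation with dynamic graphs (eager execution) rather than the static graphs (graph execution) of older TensorFlow versions. Data are represented as tensors, which can be regarded as a replacement for NumPy arrays. State is maintained through variables (Variable), a thin wrapper around Tensor that records the operations applied to the tensor in order to build the computation graph. A Variable has three main attributes:

- data: the tensor wrapped by the Variable.
- grad: the gradient corresponding to data. grad is itself a Variable rather than a tensor, and it has the same shape as data.
- grad_fn: points to a Function that records the Variable's operation history, that is, which operation produced it. It is used to build the computation graph.

Since PyTorch 0.4, Variable has been merged into Tensor, so these attributes are available on tensors directly and wrapping a tensor in Variable is no longer required.

The autograd package is at the core of all neural networks in PyTorch. It provides automatic differentiation for all operations on Tensors. torch.Tensor is the core class of the package. If its attribute .requires_grad is set to True, all operations on the tensor are tracked. When the computation is finished, calling .backward() computes all gradients automatically, and the gradient for the tensor is accumulated into its .grad attribute. To stop tracking a tensor's history, call .detach(), which separates it from the computation history and prevents future computations from being tracked.

Neural networks can be built with the torch.nn package, which relies on autograd to define models and differentiate them. An nn.Module contains layers and a method forward(input) that returns the output.

### 3.2 Convolutional neural networks

A typical convolutional neural network is built by alternating convolution layers, pooling layers, and activation function layers. It can therefore be viewed as a hierarchical model, which illustrates where the "depth" in deep learning comes from.

#### Convolution

Convolution is the core operation of a convolutional neural network. Given a two-dimensional image I as input and a two-dimensional kernel K, the convolution can be written as:

![](https://www.writebug.com/myres/static/uploads/2021/10/26/88ee14d084699fdfaaeafbecaff67ccf.writebug)

For a 5×5 input matrix and a 3×3 kernel, the corresponding convolution is shown in Figure 1.

![](https://www.writebug.com/myres/static/uploads/2021/10/26/fada0d5497f67613aab8c01bed4651d3.writebug)

Figure 1: Convolution

In deep learning frameworks such as TensorFlow, a convolution layer takes a padding parameter with two common choices, "valid" and "same". The former applies no padding; the latter pads the data so that the output has the same size as the input. When building convolution or pooling layers, the stride is another important basic parameter. It controls the interval at which each operation is applied across the feature map.

#### Pooling

Pooling uses a summary statistic of the neighbouring outputs around a position as the output at that position. Max pooling and average pooling are the most common. A pooling layer has no trainable parameters; only the kernel size, the stride, and the pooling type need to be specified. Pooling is illustrated in Figure 2.

![](https://www.writebug.com/myres/static/uploads/2021/10/26/c205d114db27c0d4d158aa4ece0232fa.writebug)

Figure 2: Pooling

#### Activation function layer

Convolution can be viewed as a linear computation on the input values, acting as a linear map. Introducing an activation function strengthens the non-linear expressive power of a deep network and thereby improves the model's ability to learn. Common activation functions are sigmoid, tanh, and ReLU.

## 4 Tools and Dataset (Using PyTorch as an Example)

- Tools
  - PyCharm, PyTorch (download links and introductions: , )
- Dataset
  - MNIST handwritten digit dataset (download link and introduction: )

## 5 Steps and Methods (Using PyTorch as an Example)

- Install the environment, including PyCharm and PyTorch; the GPU version also requires cuda and cudnn;
- Download the MNIST handwritten digit dataset;
- Write the code.

Define the functions that read labels and images

```python
import gzip
import math
import struct

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable


# 1.1.Function to read label & image
def _read(image, label):
    # minist_dir = os.path.dirname(__file__)+'/MNIST_data/'
    minist_dir = './MNIST_data/'
    with gzip.open(minist_dir + label) as flbl:
        # Label file header: magic number and item count (big-endian)
        magic, num = struct.unpack(">II", flbl.read(8))
        # np.frombuffer replaces the deprecated np.fromstring;
        # copy() keeps the array writable, as fromstring did
        label = np.frombuffer(flbl.read(), dtype=np.int8).copy()
    with gzip.open(minist_dir + image, 'rb') as fimg:
        # Image file header: magic number, item count, rows, cols
        magic, num, rows, cols = struct.unpack(">IIII", fimg.read(16))
        image = np.frombuffer(fimg.read(), dtype=np.uint8).copy().reshape(
            len(label), rows, cols)
    return image, label


# 1.2.Function to get data from .gz file
def get_data():
    train_img, train_label = _read(
        'train-images-idx3-ubyte.gz',
        'train-labels-idx1-ubyte.gz')
    test_img, test_label = _read(
        't10k-images-idx3-ubyte.gz',
        't10k-labels-idx1-ubyte.gz')
    return [train_img, train_label, test_img, test_label]
```

Define the LeNet5 network

```python
# 1.3.LeNet5
# 32-5+1=28,(28-2)/2+1=14,14-5+1=10,(10-2)/2+1=5,5-5+1=1,
# 1*120 -> 84=7*12 -> 10
class LeNet5(nn.Module):
    def __init__(self):
        super(LeNet5, self).__init__()
        # padding=2 keeps the 28x28 MNIST input at 28x28 after conv1
        self.conv1 = nn.Conv2d(1, 6, 5, padding=2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.max_pool2d(torch.tanh(self.conv1(x)), (2, 2))
        x = F.dropout(x, p=0.3, training=self.training)
        x = F.max_pool2d(torch.tanh(self.conv2(x)), (2, 2))
        x = F.dropout(x, p=0.3, training=self.training)
        x = x.view(-1, self.num_flat_features(x))
        # print('x.size:', x.size())  # [100, 400]
        x = torch.tanh(self.fc1(x))
        x = F.dropout(x, p=0.3, training=self.training)
        x = torch.tanh(self.fc2(x))
        x = F.dropout(x, p=0.3, training=self.training)
        x = self.fc3(x)
        return x

    # Flatten the size of x (BATCH_SIZE*16*5*5 -> BATCH_SIZE*400)
    def num_flat_features(self, x):
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features
```

Define the parameter initialization function

```python
# 1.4.Function to initialize parameters
def weight_init(m):
    if isinstance(m, nn.Conv2d):
        # Fan-out of the layer: kernel area times number of output channels
        n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
        m.weight.data.normal_(0, math.sqrt(2. / n))
    elif isinstance(m, nn.BatchNorm2d):
        m.weight.data.fill_(1)
        m.bias.data.zero_()
```

Define the training and test functions

```python
# 1.5.Function to train network
def train(epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        if use_gpu:
            data, target = data.cuda(), target.cuda()
        data, target = Variable(data), Variable(target.long())
        optimizer.zero_grad()
        outputs = model(data)
        # print(data.shape, outputs.shape, target.shape)
        # [100, 1, 28, 28] [100, 10] [100]
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()
        if (batch_idx+1) % 100 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, (batch_idx+1) * len(data), len(train_loader.dataset),
                100. * (batch_idx+1) / len(train_loader), loss.item()))


# 1.6.Function to test network
def test():
    model.eval()
    test_loss = 0
    correct = 0
    for data, target in test_loader:
        if use_gpu:
            data, target = data.cuda(), target.cuda()
        # Gradients are not needed during evaluation
        with torch.no_grad():
            data = Variable(data)
            # data = Variable(data, volatile=True)
            target = Variable(target.long())
            outputs = model(data)
            test_loss += criterion(outputs, target).item()
            # Index of the largest logit is the predicted digit
            pred = outputs.max(1, keepdim=True)[1]
            correct += pred.eq(target.view_as(pred)).cpu().sum().item()
    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
```

Define some constant parameters

```python
# 2.1.Set some parameters
# use_gpu = torch.cuda.is_available()
use_gpu = False
BATCH_SIZE = 100
kwargs = {'num_workers':
```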
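
The size comment above the LeNet5 class follows the usual output-size rule for convolution and pooling layers, out = floor((in + 2 × padding − kernel) / stride) + 1. The sketch below checks that arithmetic in plain Python, assuming a 28×28 MNIST input and the layer settings shown in the listing. The helper name `out_size` is hypothetical and is not part of the original code.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (hypothetical helper)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                           # MNIST images are 28x28
after_conv1 = out_size(size, kernel=5, padding=2)   # conv1: 5x5 kernel, padding=2
after_pool1 = out_size(after_conv1, kernel=2, stride=2)  # 2x2 max pooling
after_conv2 = out_size(after_pool1, kernel=5)       # conv2: 5x5 kernel, no padding
after_pool2 = out_size(after_conv2, kernel=2, stride=2)  # 2x2 max pooling

# conv2 has 16 output channels, so this is the input width of fc1
flat_features = 16 * after_pool2 * after_pool2

print(after_conv1, after_pool1, after_conv2, after_pool2, flat_features)
# prints: 28 14 10 5 400
```

With padding=2 on conv1, the 28×28 input behaves like the 32×32 input in the comment, which is why fc1 expects 16 * 5 * 5 = 400 features.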

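The convolution and pooling operations described in Section 3.2 can be illustrated with a small plain-Python sketch. It assumes "valid" padding and a stride of 1 for the convolution, and it computes cross-correlation (no kernel flip), which is what deep learning frameworks usually implement under the name convolution. The function names and the input values are made up for illustration and do not come from the original project.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def max_pool2d(feature, size=2, stride=2):
    """Take the maximum of each size x size window."""
    out_h = (len(feature) - size) // stride + 1
    out_w = (len(feature[0]) - size) // stride + 1
    return [[max(feature[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]


# 5x5 input holding the values 0..24, row by row
image = [[r * 5 + c for c in range(5)] for r in range(5)]
# 3x3 kernel that simply picks the centre pixel of each window
kernel = [[0, 0, 0],
          [0, 1, 0],
          [0, 0, 0]]

feature = conv2d_valid(image, kernel)           # 5x5 input, 3x3 kernel -> 3x3 output
pooled = max_pool2d(feature, size=2, stride=1)  # 3x3 input, 2x2 window -> 2x2 output

print(feature)  # [[6, 7, 8], [11, 12, 13], [16, 17, 18]]
print(pooled)   # [[12, 13], [17, 18]]
```

The pooling step has no weights to learn: only the window size, the stride, and the choice of max versus average determine its output.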