手工实现相关机器学习算法.zip (hand-written implementations of machine learning algorithms) - CSDN Library

16 files in total

py: 7

xml: 4

gz: 4

Points required: 5 · 57 views · Uploaded 2024-04-17 19:45:28 · 11.08 MB · ZIP

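Machine learning is a subfield of artificial intelligence (AI) that studies how data and algorithms can give computer systems the ability to learn, so that they complete specific tasks or improve their own performance automatically. The core idea is for the system to reach its goal by learning patterns and regularities in data rather than by being explicitly programmed. Machine learning is applied widely, including but not limited to the following areas:

Image recognition and computer vision: machine learning is widely used for image recognition, object detection, face recognition, and image segmentation. For example, deep learning can train neural networks to recognize objects, faces, or scenes in images, for use in intelligent surveillance, autonomous driving, and medical image analysis.

Natural language processing: important applications include text classification, sentiment analysis, machine translation, and speech recognition. For example, deep learning models can be trained to understand and generate natural language, for use in automated customer service, intelligent assistants, and machine translation.

Recommender systems: recommender systems use machine learning to analyze user behavior and preferences and recommend personalized products or services. For example, an e-commerce site can analyze a user's purchase history and browsing behavior to recommend items the user is likely to be interested in.

Prediction and predictive analytics: machine learning can estimate the probability or trend of future events. For example, the financial sector can use it for stock price prediction, credit scoring, and fraud detection.

Medical diagnosis and bioinformatics: machine learning has important applications in medical diagnosis, drug development, and genomics. For example, it can analyze medical images to diagnose disease, or analyze genetic data to predict disease risk.

Intelligent transportation and the Internet of Things: machine learning can be applied to intelligent transportation systems, smart city management, and the Internet of Things. For example, it can analyze traffic data to optimize traffic flow, or analyze sensor data to monitor equipment status.

These are only some of the applications. As the technology develops and its application scenarios expand, machine learning is proving valuable across many fields and is changing how we live and work.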

手工实现相关机器学习算法.zip (16 files)

content/
    使用梯度下降法线性回归.py 2KB
    MNIST/
        t10k-images-idx3-ubyte.gz 1.57MB
        train-labels-idx1-ubyte.gz 28KB
        train-images-idx3-ubyte.gz 9.45MB
        t10k-labels-idx1-ubyte.gz 4KB
    多元线性回归.py 942B
    .idea/
        vcs.xml 180B
        workspace.xml 11KB
        misc.xml 288B
        machine_learn.iml 453B
        modules.xml 278B
    MyKNNClassifier.py 2KB
    搭建手写数字识别卷积神经网络（正向传播）.py 25KB
    实现两层神经网络反向传播优化.py 8KB
    实现两层神经网络.py 7KB
    MySimpleLinearRegression.py 1KB
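
The four MNIST archives listed above are gzip-compressed IDX files, and the `load_data` preview on this page skips their headers with `offset=16` (images) and `offset=8` (labels). The following is a minimal sketch of where those numbers come from, assuming the standard IDX layout (two zero bytes, a dtype code, a dimension count, then one big-endian 32-bit size per dimension). `read_idx` and `make_idx` are hypothetical helper names introduced here for illustration, not part of the archive, and the demo uses small synthetic arrays rather than the real dataset.

```python
import gzip
import io
import struct

import numpy as np


def read_idx(fileobj):
    """Parse a gzip-compressed IDX stream into a uint8 NumPy array."""
    with gzip.open(fileobj, "rb") as f:
        raw = f.read()
    # Bytes 0-1 are zero, byte 2 is the dtype code (0x08 = uint8),
    # byte 3 is the number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not a uint8 IDX stream")
    # One big-endian uint32 per dimension follows the 4-byte magic number:
    # 4 + 4*3 = 16 bytes for images, 4 + 4*1 = 8 bytes for labels.
    header_size = 4 + 4 * ndim
    shape = struct.unpack(">" + "I" * ndim, raw[4:header_size])
    return np.frombuffer(raw, dtype=np.uint8, offset=header_size).reshape(shape)


def make_idx(array):
    """Hypothetical helper: serialize a uint8 array as gzip-compressed IDX."""
    header = struct.pack(">HBB", 0, 0x08, array.ndim)
    header += struct.pack(">" + "I" * array.ndim, *array.shape)
    buf = io.BytesIO()
    with gzip.open(buf, "wb") as f:
        f.write(header + array.tobytes())
    buf.seek(0)
    return buf


# Synthetic stand-ins for two 28x28 images and their labels.
images = (np.arange(2 * 28 * 28) % 256).astype(np.uint8).reshape(2, 28, 28)
labels = np.array([7, 2], dtype=np.uint8)

decoded_images = read_idx(make_idx(images))
decoded_labels = read_idx(make_idx(labels))
print(decoded_images.shape, decoded_labels.shape)
```

Reading the shape from the header, instead of hard-coding the offset and `reshape(-1, 784)`, is a design choice of this sketch; the hard-coded version in the preview is equivalent for the MNIST files.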

import gzip
import os
import pickle
from collections import OrderedDict

import numpy as np
import matplotlib.pyplot as plt


def hotone(y):
    """Convert integer class labels (0-9) into one-hot rows."""
    re = np.zeros((y.shape[0], 10))
    for i in range(y.shape[0]):
        re[i, y[i]] = 1
    return re


def load_data(normalize=True, flat=True, hot_one=True):
    """Load MNIST from the MNIST/ directory under the current working directory."""
    base_path = os.path.realpath('')
    base_path = base_path + '/MNIST/'
    fnames = ['t10k-images-idx3-ubyte.gz',
              't10k-labels-idx1-ubyte.gz',
              'train-images-idx3-ubyte.gz',
              'train-labels-idx1-ubyte.gz']
    flist = []
    for f in fnames:
        fpath = base_path + f
        flist.append(fpath)

    # Image files have a 16-byte header, label files an 8-byte header.
    with gzip.open(flist[0], 'rb') as f:
        test_x = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1, 784)
    with gzip.open(flist[1], 'rb') as f:
        test_y = np.frombuffer(f.read(), np.uint8, offset=8)
    with gzip.open(flist[2], 'rb') as f:
        train_x = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1, 784)
    with gzip.open(flist[3], 'rb') as f:
        train_y = np.frombuffer(f.read(), np.uint8, offset=8)
    # print(test_x.shape, test_y.shape)
    # print(train_x.shape, train_y.shape)

    if hot_one:
        test_y = hotone(test_y)
        train_y = hotone(train_y)
    if normalize:
        # Scale pixel values from 0-255 to 0.0-1.0.
        train_x = train_x.astype(np.float32)
        test_x = test_x.astype(np.float32)
        train_x = train_x / 255.0
        test_x = test_x / 255.0
    if not flat:
        # Restore the (N, channels, height, width) image shape.
        train_x = train_x.reshape(-1, 1, 28, 28)
        test_x = test_x.reshape(-1, 1, 28, 28)
    return (train_x, train_y), (test_x, test_y)


def numerical_gradient(f, x):
    """Estimate the gradient of f at x with central differences."""
    h = 1e-4  # 0.0001
    grad = np.zeros_like(x)
    # By default nditer treats the array as read-only. To modify elements
    # while iterating, op_flags=['readwrite'] must be specified.
    # flags=['multi_index'] enables multi-dimensional indexing.
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        tmp_val = x[idx]
        x[idx] = float(tmp_val) + h
        fxh1 = f(x)  # f(x+h)
        x[idx] = tmp_val - h
        fxh2 = f(x)  # f(x-h)
        grad[idx] = (fxh1 - fxh2) / (2 * h)
        x[idx] = tmp_val  # restore the original value
        it.iternext()  # advance to the next element
    return grad


def cross_entropy_error(y, t):
    if y.ndim == 1:
        t = t.reshape(1, t.size)
        y = y.reshape(1, y.size)
    # If the labels are one-hot vectors, convert them to class indices.
    if t.size == y.size:
        t = t.argmax(axis=1)
    batch_size = y.shape[0]
    return -np.sum(np.log(y[np.arange(batch_size), t] + 1e-7)) / batch_size


def softmax(x):
    if x.ndim == 2:
        x = x.T
        x = x - np.max(x, axis=0)
        y = np.exp(x) / np.sum(np.exp(x), axis=0)
        return y.T
    x = x - np.max(x)  # guard against overflow
    return np.exp(x) / np.sum(np.exp(x))


class Relu:
    def __init__(self):
        self.mask = None

    def forward(self, x):
        self.mask = (x <= 0)
        out = x.copy()
        out[self.mask] = 0
        return out

    def backward(self, dout):
        dout[self.mask] = 0
        dx = dout
        return dx


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


class Sigmoid:
    def __init__(self):
        self.out = None

    def forward(self, x):
        out = sigmoid(x)
        self.out = out
        return out

    def backward(self, dout):
        dx = dout * (1.0 - self.out) * self.out
        return dx


class Affine:
    def __init__(self, W, b):
        self.W = W
        self.b = b
        self.x = None
        self.original_x_shape = None
        # Gradients of the weight and bias parameters.
        self.dW = None
        self.db = None

    def forward(self, x):
        # Flatten tensor inputs to 2-D.
        self.original_x_shape = x.shape
        x = x.reshape(x.shape[0], -1)
        self.x = x
        out = np.dot(self.x, self.W) + self.b
        return out

    def backward(self, dout):
        dx = np.dot(dout, self.W.T)
        self.dW = np.dot(self.x.T, dout)
        self.db = np.sum(dout, axis=0)
        # Restore the shape of the input data (tensor inputs).
        dx = dx.reshape(*self.original_x_shape)
        return dx


class SoftmaxWithLoss:
    def __init__(self):
        self.loss = None
        self.y = None  # output of softmax
        self.t = None  # labels

    def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        self.loss = cross_entropy_error(self.y, self.t)
        return self.loss

    def backward(self, dout=1):
        batch_size = self.t.shape[0]
        if self.t.size == self.y.size:  # labels are one-hot vectors
            dx = (self.y - self.t) / batch_size
        else:
            dx = self.y.copy()
            dx[np.arange(batch_size), self.t] -= 1
            dx = dx / batch_size
        return dx


class Dropout:
    def __init__(self, dropout_ratio=0.5):
        self.dropout_ratio = dropout_ratio
        self.mask = None

    def forward(self, x, train_flg=True):
        if train_flg:
            self.mask = np.random.rand(*x.shape) > self.dropout_ratio
            return x * self.mask
        else:
            return x * (1.0 - self.dropout_ratio)

    def backward(self, dout):
        return dout * self.mask


class BatchNormalization:
    def __init__(self, gamma, beta, momentum=0.9, running_mean=None, running_var=None):
        self.gamma = gamma
        self.beta = beta
        self.momentum = momentum
        self.input_shape = None  # 4-D for conv layers, 2-D for fully connected layers

        # Mean and variance used at test time.
        self.running_mean = running_mean
        self.running_var = running_var

        # Intermediate data used by backward.
        self.batch_size = None
        self.xc = None
        self.xn = None
        self.std = None
        self.dgamma = None
        self.dbeta = None

    def forward(self, x, train_flg=True):
        self.input_shape = x.shape
        if x.ndim != 2:
            N, C, H, W = x.shape
            x = x.reshape(N, -1)
        out = self.__forward(x, train_flg)
        return out.reshape(*self.input_shape)

    def __forward(self, x, train_flg):
        if self.running_mean is None:
            N, D = x.shape
            self.running_mean = np.zeros(D)
            self.running_var = np.zeros(D)

        if train_flg:
            mu = x.mean(axis=0)
            xc = x - mu
            var = np.mean(xc ** 2, axis=0)
            std = np.sqrt(var + 10e-7)
            xn = xc / std

            self.batch_size = x.shape[0]
            self.xc = xc
            self.xn = xn
            self.std = std
            self.running_mean = self.momentum * self.running_mean + (1 - self.momentum) * mu
            self.running_var = self.momentum * self.running_var + (1 - self.momentum) * var
        else:
            xc = x - self.running_mean
            xn = xc / np.sqrt(self.running_var + 10e-7)

        out = self.gamma * xn + self.beta
        return out

    def backward(self, dout):
        if dout.ndim != 2:
            N, C, H, W = dout.shape
            dout = dout.reshape(N, -1)
        dx = self.__backward(dout)
        dx = dx.reshape(*self.input_shape)
        return dx

    def __backward(self, dout):
        dbeta = dout.sum(axis=0)
        dgamma = np.sum(self.xn * dout, axis=0)
        dxn = self.gamma * dout
        dxc = dxn / self.std
        dstd = -np.sum((dxn * self.xc) / (self.std * self.std), axis=0)
        dvar = 0.5 * dstd / self.std
        dxc += (2.0 / self.batch_size) * self.xc * dvar
        dmu = np.sum(dxc, axis=0)
        dx = dxc - dmu / self.batch_size

        self.dgamma = dgamma
        self.dbeta = dbeta
        return dx


class SGD:
    """Stochastic Gradient Descent."""

    def __init__(self, lr=0.01):
        self.lr = lr

    def update(self, params, grads):
        for key in params.keys():
            params[key] -= self.lr * grads[key]


# class Moment  (the page preview ends here)
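
The preview above pairs a hand-derived `SoftmaxWithLoss.backward` with a finite-difference `numerical_gradient`, which suggests a gradient check. The following is a minimal self-contained sketch of such a check, not code from the archive: it re-declares simplified versions of `softmax`, `cross_entropy_error`, and `numerical_gradient` (batch-only, one-hot labels only, and `np.ndindex` in place of `np.nditer`), and the random scores, label indices, and seed are arbitrary assumptions.

```python
import numpy as np


def softmax(x):
    # Row-wise softmax for a 2-D batch; subtract the row max to avoid overflow.
    x = x - np.max(x, axis=1, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=1, keepdims=True)


def cross_entropy_error(y, t):
    # t holds one-hot rows; average the negative log-likelihood over the batch.
    return -np.sum(t * np.log(y + 1e-7)) / y.shape[0]


def numerical_gradient(f, x):
    # Central differences, perturbing one element of x at a time.
    h = 1e-4
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        saved = x[idx]
        x[idx] = saved + h
        fxh1 = f(x)
        x[idx] = saved - h
        fxh2 = f(x)
        grad[idx] = (fxh1 - fxh2) / (2 * h)
        x[idx] = saved  # restore the original value
    return grad


rng = np.random.default_rng(0)          # arbitrary seed
scores = rng.normal(size=(3, 5))        # 3 samples, 5 classes
labels = np.eye(5)[[0, 3, 1]]           # arbitrary one-hot targets

# Analytic gradient used by the one-hot branch of SoftmaxWithLoss.backward.
analytic = (softmax(scores) - labels) / scores.shape[0]
numeric = numerical_gradient(
    lambda s: cross_entropy_error(softmax(s), labels), scores)

max_diff = np.max(np.abs(analytic - numeric))
print("max abs difference:", max_diff)
```

The two gradients should agree only approximately: the central difference has truncation error, and the `1e-7` inside the logarithm is ignored by the analytic formula, so a small tolerance is more appropriate than an exact comparison.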
