TensorFlow transfer learning using Google's pretrained Inception-v3 model.zip resource - CSDN文库

5 files in total

xml: 3

py: 1

iml: 1

Points required: 5 · 172 views · Uploaded 2023-08-01 21:34:36 · 5KB ZIP

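In machine learning, transfer learning is an efficient technique that reuses a model pretrained on a large dataset to solve a new task, instead of training a model from scratch. This resource covers transfer learning in the TensorFlow framework, specifically with Google's Inception-v3 model. The archive contains a script that builds on an Inception-v3 model pretrained on ImageNet. The model file itself (tensorflow_inception_graph.pb) is read from a local directory set in the script and is not bundled in the 5KB archive.

Inception-v3 is a deep convolutional neural network (CNN) architecture proposed by Google in 2015, designed to improve both accuracy and computational efficiency. Its name comes from its central design idea, the "Inception module": a structure that processes the input at several scales in parallel, so that features of different sizes are captured together. Inception-v3 improved image classification performance while reducing computational cost, and was among the most advanced models of its time.

Applying Inception-v3 for transfer learning in TensorFlow typically involves the following steps:

1. Import the pretrained model: load the pretrained Inception-v3 model and drop its final fully connected layer, because that layer was trained for the ImageNet classes and is unlikely to match the new task. In the script, this is done by reading the output of the bottleneck tensor (pool_3/_reshape:0, 2048 values per image) instead of the original classification output.

2. Add custom layers: add a new fully connected layer sized to the number of classes in the new dataset. Its weights are randomly initialized, and it learns to map the high-level features extracted by Inception-v3 to predictions for the new task.

3. Train the new layers (and optionally fine-tune): train on the new dataset for a limited number of iterations, updating only the weights of the newly added layer. Optionally, some of the pretrained weights can also be updated, which is what "fine-tuning" usually refers to. Keeping the pretrained weights fixed, as the script does, preserves the general features the model has already learned.

4. Train and validate: use a held-out validation set and hyperparameter tuning to check performance on the new dataset (the script reserves 10% of the images for validation and 10% for testing). The resource reports 93% accuracy in this case, which suggests the transfer learning strategy worked well for the new task.

5. Evaluate and deploy: evaluate the model thoroughly, including metrics such as precision, recall, and F1 score, and deploy it in applications such as image recognition or intelligent surveillance.

This resource helps developers apply Inception-v3 to transfer learning quickly, saving a large amount of training time and compute. With adjustment and fine-tuning, the same approach can be applied to many image classification tasks. It is especially valuable for projects without large amounts of labeled data or compute. In practice, generalization and overfitting still need attention to make sure the model performs well on unseen data.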
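Steps 2 and 3 above can be sketched without TensorFlow. The following is a minimal NumPy illustration, not the resource's own code: it trains a single softmax layer with gradient descent on fixed 2048-dimensional feature vectors. The synthetic features from `make_fake_bottlenecks` stand in for Inception-v3 bottleneck outputs, and all function names here are hypothetical helpers introduced for the example.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def train_softmax_head(features, labels, n_classes, lr=0.01, steps=100, seed=0):
    """Train one fully connected softmax layer on frozen feature vectors.

    Only `weights` and `biases` are updated; the features themselves are
    treated as fixed outputs of a pretrained network.
    """
    rng = np.random.default_rng(seed)
    n_features = features.shape[1]
    weights = rng.normal(0.0, 0.001, size=(n_features, n_classes))
    biases = np.zeros(n_classes)
    one_hot = np.eye(n_classes)[labels]
    for _ in range(steps):
        probs = softmax(features @ weights + biases)
        # Gradient of the mean cross-entropy with respect to the logits.
        grad_logits = (probs - one_hot) / len(features)
        weights -= lr * (features.T @ grad_logits)
        biases -= lr * grad_logits.sum(axis=0)
    return weights, biases


def predict(features, weights, biases):
    """Return the index of the highest-scoring class for each row."""
    return np.argmax(features @ weights + biases, axis=1)


def make_fake_bottlenecks(n_per_class=50, n_classes=3, n_features=2048, seed=1):
    """Synthetic stand-in for bottleneck features: one noisy cluster per class."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 1.0, size=(n_classes, n_features))
    features = np.concatenate(
        [c + 0.5 * rng.normal(size=(n_per_class, n_features)) for c in centers]
    )
    labels = np.repeat(np.arange(n_classes), n_per_class)
    return features, labels


features, labels = make_fake_bottlenecks()
weights, biases = train_softmax_head(features, labels, n_classes=3)
train_accuracy = float(np.mean(predict(features, weights, biases) == labels))
print(f"training accuracy on synthetic features: {train_accuracy:.2f}")
```

Because the features are fixed, they only need to be computed once per image. That is presumably why the script caches them on disk before training the new layer.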

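TensorFlow transfer learning. Uses Google's pretrained Inception-v3 model to quickly transfer a convolutional neural network trained on one dataset to another dataset, reaching 93% accuracy.zip (5 files)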

today_0801
    .idea
        TransferLearning.iml 513B
        misc.xml 206B
        inspectionProfiles
            profiles_settings.xml 228B
        modules.xml 284B
    TransferLearning
        __init__.py 10KB

# -*- coding: utf-8 -*-
# Written against the TensorFlow 1.x API (tf.placeholder, tf.GraphDef).
import glob
import os.path
import random

import numpy as np
import tensorflow as tf
from tensorflow.python.platform import gfile

# Number of nodes in the Inception-v3 bottleneck layer
BOTTENECK_TENSOR_SIZE = 2048
BOTTENECK_TENSOR_NAME = 'pool_3/_reshape:0'

# Name of the image input tensor
JPEG_DATA_TENSOR_NAME = 'DecodeJpeg/contents:0'

# Directory of the downloaded Inception-v3 model trained by Google
MODEL_DIR = '/Users/yifanyang/Downloads/inception_dec_2015'

# File name of the downloaded Inception-v3 model
MODEL_FILE = 'tensorflow_inception_graph.pb'

# Feature vectors computed by the model are cached here to avoid recomputation
CACHE_DIR = '/Users/yifanyang/Documents/Inception_v3'

# Image data folder; each subfolder is one class to distinguish
INPUT_DATA = '/Users/yifanyang/Downloads/flower_photos'

# Percentage of data used for validation
VALIDATION_PERCENTAGE = 10
# Percentage of data used for testing
TEST_PERCENTAGE = 10

# Training settings
LEARNING_RATE = 0.01
STEPS = 4000
BATCH = 100


# 3. List all images and split them into training, validation, and test sets
def create_image_lists(testing_percentage, validation_percentage):
    result = {}
    # os.walk yields the root directory first, then every subdirectory
    sub_dirs = [x[0] for x in os.walk(INPUT_DATA)]
    is_root_dir = True
    for sub_dir in sub_dirs:
        if is_root_dir:
            # Skip the root directory itself
            is_root_dir = False
            continue

        # Collect the valid image files in the current directory
        extensions = ['jpg', 'jpeg', 'JPG', 'JPEG']
        file_list = []
        dir_name = os.path.basename(sub_dir)
        for extension in extensions:
            file_glob = os.path.join(INPUT_DATA, dir_name, '*.' + extension)
            file_list.extend(glob.glob(file_glob))
        if not file_list:
            continue

        label_name = dir_name.lower()

        # Initialize the training, test, and validation sets for this class
        training_images = []
        testing_images = []
        validation_images = []
        for file_name in file_list:
            base_name = os.path.basename(file_name)
            # Randomly assign each image to one of the three sets
            chance = np.random.randint(100)
            if chance < validation_percentage:
                validation_images.append(base_name)
            elif chance < (testing_percentage + validation_percentage):
                testing_images.append(base_name)
            else:
                training_images.append(base_name)

        result[label_name] = {
            'dir': dir_name,
            'training': training_images,
            'testing': testing_images,
            'validation': validation_images,
        }
    return result


# 4. Get the path of an image from its class name, dataset category, and index
def get_image_path(image_lists, image_dir, label_name, index, category):
    label_lists = image_lists[label_name]
    category_list = label_lists[category]
    # Wrap the index so that any non-negative integer maps to a valid image
    mod_index = index % len(category_list)
    base_name = category_list[mod_index]
    sub_dir = label_lists['dir']
    full_path = os.path.join(image_dir, sub_dir, base_name)
    return full_path


# 5. Get the path of the cached feature vector produced by Inception-v3
def get_bottleneck_path(image_lists, label_name, index, category):
    return get_image_path(image_lists, CACHE_DIR, label_name, index, category) + '.txt'


# 6. Run the loaded Inception-v3 model on one image to get its feature vector
def run_bottleneck_on_image(sess, image_data, image_data_tensor, bottleneck_tensor):
    bottleneck_values = sess.run(bottleneck_tensor, {image_data_tensor: image_data})
    # Drop the batch dimension to get a one-dimensional feature vector
    bottleneck_values = np.squeeze(bottleneck_values)
    return bottleneck_values


# 7. Look for a saved feature vector first; if none exists, compute it and
#    save it to a file
def get_or_create_bottleneck(sess, image_lists, label_name, index, category,
                             jpeg_data_tensor, bottleneck_tensor):
    label_lists = image_lists[label_name]
    sub_dir = label_lists['dir']
    sub_dir_path = os.path.join(CACHE_DIR, sub_dir)
    if not os.path.exists(sub_dir_path):
        os.makedirs(sub_dir_path)
    bottleneck_path = get_bottleneck_path(image_lists, label_name, index, category)

    if not os.path.exists(bottleneck_path):
        image_path = get_image_path(image_lists, INPUT_DATA, label_name, index, category)
        image_data = gfile.FastGFile(image_path, 'rb').read()
        bottleneck_values = run_bottleneck_on_image(
            sess, image_data, jpeg_data_tensor, bottleneck_tensor)
        bottleneck_string = ','.join(str(x) for x in bottleneck_values)
        with open(bottleneck_path, 'w') as bottleneck_file:
            bottleneck_file.write(bottleneck_string)
    else:
        with open(bottleneck_path, 'r') as bottleneck_file:
            bottleneck_string = bottleneck_file.read()
        bottleneck_values = [float(x) for x in bottleneck_string.split(',')]
    return bottleneck_values


# 8. Randomly pick one batch of images as training data
def get_random_cached_bottlenecks(sess, n_classes, image_lists, how_many, category,
                                  jpeg_data_tensor, bottleneck_tensor):
    bottlenecks = []
    ground_truths = []
    for _ in range(how_many):
        label_index = random.randrange(n_classes)
        label_name = list(image_lists.keys())[label_index]
        image_index = random.randrange(65536)
        bottleneck = get_or_create_bottleneck(
            sess, image_lists, label_name, image_index, category,
            jpeg_data_tensor, bottleneck_tensor)
        ground_truth = np.zeros(n_classes, dtype=np.float32)
        ground_truth[label_index] = 1.0
        bottlenecks.append(bottleneck)
        ground_truths.append(ground_truth)
    return bottlenecks, ground_truths


# 9. Get all of the test data, used to compute the accuracy
def get_test_bottlenecks(sess, image_lists, n_classes, jpeg_data_tensor, bottleneck_tensor):
    bottlenecks = []
    ground_truths = []
    label_name_list = list(image_lists.keys())
    for label_index, label_name in enumerate(label_name_list):
        category = 'testing'
        for index, unused_base_name in enumerate(image_lists[label_name][category]):
            bottleneck = get_or_create_bottleneck(
                sess, image_lists, label_name, index, category,
                jpeg_data_tensor, bottleneck_tensor)
            ground_truth = np.zeros(n_classes, dtype=np.float32)
            ground_truth[label_index] = 1.0
            bottlenecks.append(bottleneck)
            ground_truths.append(ground_truth)
    return bottlenecks, ground_truths


# 10. Main function
def main():
    image_lists = create_image_lists(TEST_PERCENTAGE, VALIDATION_PERCENTAGE)
    n_classes = len(image_lists.keys())

    # Load the pretrained Inception-v3 model.
    with gfile.FastGFile(os.path.join(MODEL_DIR, MODEL_FILE), 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    bottleneck_tensor, jpeg_data_tensor = tf.import_graph_def(
        graph_def, return_elements=[BOTTENECK_TENSOR_NAME, JPEG_DATA_TENSOR_NAME])

    # Define the inputs of the new network
    bottleneck_input = tf.placeholder(
        tf.float32, [None, BOTTENECK_TENSOR_SIZE], name='BottleneckInputPlaceholder')
    ground_truth_input = tf.placeholder(
        tf.float32, [None, n_classes], name='GroundTruthInput')

    # Define one fully connected layer
    with tf.name_scope('final_training_ops'):
        weights = tf.Variable(
            tf.truncated_normal([BOTTENECK_TENSOR_SIZE, n_classes], stddev=0.001))
        biases = tf.Variable(tf.zeros([n_classes]))
        logits = tf.matmul(bottleneck_input, weights) + biases
        final_tensor = tf.nn.softmax(logits)

    # Define the cross-entropy loss.
    # TensorFlow 1.x requires logits and labels to be passed as keyword arguments.
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
        logits=logits, labels=ground_truth_input)
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    train_step = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy_mean)

    # Compute the accuracy.
    # ...
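The listing is cut off at the accuracy computation, so the author's version of that step is not available here. As a hedged illustration only, the sketch below shows one common way to compute accuracy from softmax outputs and one-hot labels, using NumPy rather than TensorFlow. The function name `accuracy` and the sample arrays are assumptions made for the example.

```python
import numpy as np


def accuracy(predicted_probs, one_hot_labels):
    """Fraction of rows where the highest-probability class matches the label.

    predicted_probs: array of shape (n_samples, n_classes), e.g. softmax output.
    one_hot_labels:  array of the same shape with a single 1.0 per row.
    """
    predicted = np.argmax(predicted_probs, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))


# Hypothetical softmax outputs for four images and three classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
# One-hot ground truth; the last row disagrees with the prediction.
labels = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
], dtype=np.float32)

example_accuracy = accuracy(probs, labels)
print(example_accuracy)  # 3 of 4 predictions match, so 0.75
```

In a TensorFlow 1.x graph, the same comparison is usually expressed with `tf.argmax`, `tf.equal`, and `tf.reduce_mean` applied to `final_tensor` and `ground_truth_input`. Whether the original script does exactly this cannot be confirmed from the truncated preview.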
