机器学习-USTC-人工智能实验代码.zip resource - CSDN Wenku

95 files in total

py: 42

md: 12

index: 10

Machine learning

Points required: 5 · 116 views · Uploaded 2024-04-28 22:29:57 · 2.21MB ZIP


机器学习-USTC-人工智能实验代码.zip (95 files)

content

LICENSE 34KB

.idea

other.xml 233B

vcs.xml 180B

machinelearning

ocr_tensorflow_cnn_freetype_teacher

genIDCard.py 5KB

checkpoint 85B

tf_cnn_lstm_ctc.py 13KB

trainIDCard.py 6KB

.idea

ocr_tensorflow_cnn_freetype 3.iml 576B

other.xml 233B

workspace.xml 12KB

misc.xml 295B

modules.xml 310B

ocr.model-1200.meta 147KB

ocr.model-1200.index 525B

README.md 15KB

fonts

OCR-B.ttf 20KB

cnn_data.py 8KB

captchaCnn

__init__.py 1B

checkpoint 655B

0.96captcha.model-2200.index 1KB

cnn_train.py 6KB

0.99captcha.model-3700.meta 107KB

util.py 2KB

readMe 131B

0.97captcha.model-2700.index 1KB

0.95captcha.model-2000.index 1KB

0.97captcha.model-2700.meta 107KB

0.96captcha.model-2200.meta 107KB

captcha_cnn.py 1KB

0.95captcha.model-2000.meta 107KB

captcha_gen.py 2KB

0.98captcha.model-3500.index 1KB

ZXKm.jpg 7KB

1663.jpg 8KB

0.99captcha.model-3700.index 1KB

0.98captcha.model-3500.meta 107KB

README.md 252B

ocr_tensorflow_cnn_freetype

genIDCard.py 12KB

checkpoint 85B

ocr.model-7000.meta 139KB

tf_cnn_lstm_ctc.py 15KB

trainIDCard.py 6KB

ocr.model-7000.index 525B

ocr.model-6400.index 525B

ocr.model-7000.data-00000-of-00001 300KB

README.md 15KB

ocr.model-6400.data-00000-of-00001 300KB

fonts

OCR-B.ttf 20KB

ocr.model-6400.meta 139KB

captcha_recognize

checkpoint 103B

crack_capcha.model-1300.index 1KB

crack_capcha.model-1600.meta 107KB

crack_capcha.model-1600.index 1KB

gen_captcha.py 2KB

9Seu.jpg 8KB

crack_capcha.model-1300.meta 107KB

train_captcha.py 9KB

copyfromGit

tensorflow_tutorial-master

captchaCnn

__init__.py 1B

cnn_train.py 6KB

util.py 2KB

captcha_cnn.py 1KB

captcha_gen.py 2KB

README.md 223B

mnistGan

__init__.py 1B

mnist_gen.py 130B

mnist_train.py 115B

mnist_model.py 8KB

README.md 232B

mnistRnn

mnist_rnn.py 1KB

settings.py 320B

README.md 226B

rnn_train.py 4KB

poetryRnn

poetry.txt 2.56MB

poetry.py 2KB

poetry_model.py 7KB

poetry_train.py 130B

poetry_gen.py 155B

README.md 223B

README.md 684B

skipGramVec

text.py 3KB

model.py 4KB

gen.py 100B

train.py 102B

README.md 215B

avatarDcgan

avatar_gen.py 121B

avatar_model.py 11KB

avatar.py 2KB

avatar_train.py 123B

README.md 231B

skipThoughtVec

model.py 6KB

story.py 5KB

gen.py 2KB

train.py 2KB

README.md 235B

README.md 602B

I have recently been working on OCR. The eventual goal is to recognize all the Chinese characters and digits on an ID card, but this article sets a smaller target first: recognizing the fixed-length, 18-digit ID number. The same approach can be reused for fixed-length CAPTCHA recognition. The implementation mainly follows Xlvector's blog and uses a CNN for end-to-end OCR. That blog post describes the two current deep-learning approaches to OCR as follows:

> 1. Treat OCR as a multi-label learning problem. A CAPTCHA made of 4 digits is equivalent to an image recognition problem with 4 labels (the labels here are ordered), solved with a CNN.
> 2. Treat OCR as a speech recognition problem. Speech recognition converts continuous audio into text; CAPTCHA recognition converts a continuous image into text, solved with CNN+LSTM+CTC.

Method 1 mainly addresses images with fixed-length labels, while method 2 addresses images with variable-length labels. This article implements method 1 to recognize an ID number of exactly 18 digit characters.

## Environment dependencies

1. This article is implemented on the TensorFlow framework and requires a TensorFlow environment. [anaconda](https://www.continuum.io/downloads) is recommended for Python package and environment management.
2. This article uses freetype-py to generate training images on the fly. It can later be extended to generate training images of Chinese characters as well. Installing with pip is recommended:

```shell
pip install freetype-py
```

3. This article also depends on common libraries such as numpy and opencv (the `cv2` module is provided by the `opencv-python` package):

```shell
pip install numpy opencv-python
```

## Background knowledge

1. This article does not explain how a CNN (convolutional neural network) works internally. Readers unfamiliar with it may want to read the Jizhi blog post [卷积：如何成为一个很厉害的神经网络](https://jizhi.im/blog/post/intuitive_explanation_cnn), which is very well written 👍
2. The idea behind the implementation is easy to follow: an image made of 18 ordered digits is treated as a multi-label learning problem. The label length can be changed freely; as long as it is fixed, this training method applies. In practice many cases require recognizing labels of variable length, and those need method 2 (CNN+LSTM+CTC).

## Main text

### Generating the training dataset

First, generate the training images, relying mainly on the freetype-py library to render images of digits or Chinese text. **One point to watch is the size of the generated image: after several attempts, this article settled on 32 x 256. If the image is too large, training may fail to converge.**

A generated sample image looks like this:

![image.png](http://upload-images.jianshu.io/upload_images/1938615-7e6a72a05784feb6.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

The `gen_image()` method returns:

- `image_data`: the image pixel data, shape (32,256)
- `label`: the image label, an 18-digit string such as 477081933151463759
- `vec`: the label converted to a vector, shape (180,), marking the column each digit falls in; total length 18 * 10

```python
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
ID card text + digit generator class

@author: pengyuanjie
"""
import numpy as np
import freetype
import copy
import random
import cv2


class put_chinese_text(object):
    def __init__(self, ttf):
        self._face = freetype.Face(ttf)

    def draw_text(self, image, pos, text, text_size, text_color):
        '''
        draw chinese(or not) text with ttf
        :param image:      image(numpy.ndarray) to draw text
        :param pos:        where to draw text
        :param text:       the content, for chinese should be unicode type
        :param text_size:  text size
        :param text_color: text color
        :return:           image
        '''
        self._face.set_char_size(text_size * 64)
        metrics = self._face.size
        ascender = metrics.ascender/64.0

        #descender = metrics.descender/64.0
        #height = metrics.height/64.0
        #linegap = height - ascender + descender
        ypos = int(ascender)

        if not isinstance(text, unicode):
            text = text.decode('utf-8')
        img = self.draw_string(image, pos[0], pos[1]+ypos, text, text_color)
        return img

    def draw_string(self, img, x_pos, y_pos, text, color):
        '''
        draw string
        :param x_pos: text x-position on img
        :param y_pos: text y-position on img
        :param text:  text (unicode)
        :param color: text color
        :return:      image
        '''
        prev_char = 0
        pen = freetype.Vector()
        pen.x = x_pos << 6   # multiply by 64 (26.6 fixed point)
        pen.y = y_pos << 6

        hscale = 1.0
        matrix = freetype.Matrix(int(hscale)*0x10000L, int(0.2*0x10000L),
                                 int(0.0*0x10000L), int(1.1*0x10000L))
        cur_pen = freetype.Vector()
        pen_translate = freetype.Vector()

        image = copy.deepcopy(img)
        for cur_char in text:
            self._face.set_transform(matrix, pen_translate)

            self._face.load_char(cur_char)
            kerning = self._face.get_kerning(prev_char, cur_char)
            pen.x += kerning.x
            slot = self._face.glyph
            bitmap = slot.bitmap

            cur_pen.x = pen.x
            cur_pen.y = pen.y - slot.bitmap_top * 64
            self.draw_ft_bitmap(image, bitmap, cur_pen, color)

            pen.x += slot.advance.x
            prev_char = cur_char

        return image

    def draw_ft_bitmap(self, img, bitmap, pen, color):
        '''
        draw each char (modifies img in place)
        :param bitmap: bitmap
        :param pen:    pen
        :param color:  pen color e.g.(0,0,255) - red
        '''
        x_pos = pen.x >> 6
        y_pos = pen.y >> 6
        cols = bitmap.width
        rows = bitmap.rows

        glyph_pixels = bitmap.buffer

        for row in range(rows):
            for col in range(cols):
                if glyph_pixels[row*cols + col] != 0:
                    img[y_pos + row][x_pos + col][0] = color[0]
                    img[y_pos + row][x_pos + col][1] = color[1]
                    img[y_pos + row][x_pos + col][2] = color[2]


class gen_id_card(object):
    def __init__(self):
        #self.words = open('AllWords.txt', 'r').read().split(' ')
        self.number = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
        self.char_set = self.number
        #self.char_set = self.words + self.number
        self.len = len(self.char_set)

        self.max_size = 18
        self.ft = put_chinese_text('fonts/OCR-B.ttf')

    # Randomly generate a string of fixed length.
    # Returns the text and its corresponding vector.
    def random_text(self):
        text = ''
        vecs = np.zeros((self.max_size * self.len))
        #size = random.randint(1, self.max_size)
        size = self.max_size
        for i in range(size):
            c = random.choice(self.char_set)
            vec = self.char2vec(c)
            text = text + c
            vecs[i*self.len:(i+1)*self.len] = np.copy(vec)
        return text,vecs

    # Render the generated text into an image;
    # returns the pixel data, the label and the label vector.
    def gen_image(self):
        text,vec = self.random_text()
        img = np.zeros([32,256,3])
        color_ = (255,255,255) # white
        pos = (0, 0)
        text_size = 21
        image = self.ft.draw_text(img, pos, text, text_size, color_)
        # Return a single channel only; color carries no meaning for character recognition
        return image[:,:,2],text,vec

    # Convert a single character to a one-hot vector
    def char2vec(self, c):
        vec = np.zeros((self.len))
        for j in range(self.len):
            if self.char_set[j] == c:
                vec[j] = 1
        return vec

    # Convert a vector back to text
    def vec2text(self, vecs):
        text = ''
        v_len = len(vecs)
        for i in range(v_len):
            if(vecs[i] == 1):
                text = text + self.char_set[i % self.len]
        return text


if __name__ == '__main__':
    genObj = gen_id_card()
    image_data,label,vec = genObj.gen_image()
    cv2.imshow('image', image_data)
    cv2.waitKey(0)
```

### Building the network and starting training

First, define a method that generates one batch:

```python
# Generate one training batch
def get_next_batch(batch_size=1
```
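The (180,) label vector returned by `gen_image()` is the core of the multi-label formulation, so it may help to check the encoding in isolation. The following is a minimal sketch, assuming Python 3 and NumPy; the helper names `text_to_vec` and `vec_to_text` are hypothetical and are not part of the original code. It encodes an 18-digit label as 18 consecutive one-hot slots of width 10 and decodes it again by taking the argmax of each slot.

```python
import numpy as np

CHAR_SET = "0123456789"   # the same 10 digit classes as char_set in gen_id_card
MAX_SIZE = 18             # fixed label length used in the article


def text_to_vec(text):
    """One-hot encode a fixed-length digit string into a flat (180,) vector."""
    if len(text) != MAX_SIZE:
        raise ValueError("label must have exactly %d characters" % MAX_SIZE)
    vec = np.zeros(MAX_SIZE * len(CHAR_SET))
    for i, c in enumerate(text):
        # slot i occupies indices [i*10, (i+1)*10); set the digit's column to 1
        vec[i * len(CHAR_SET) + CHAR_SET.index(c)] = 1
    return vec


def vec_to_text(vec):
    """Decode a flat (180,) vector by taking the argmax of each 10-wide slot."""
    slots = np.asarray(vec).reshape(MAX_SIZE, len(CHAR_SET))
    return "".join(CHAR_SET[k] for k in slots.argmax(axis=1))


label = "477081933151463759"
vec = text_to_vec(label)
print(vec.shape)           # (180,)
print(vec_to_text(vec))    # 477081933151463759
```

Decoding by per-slot argmax, rather than testing for entries equal to 1 as `vec2text` does, is a design choice made here so that the same function could also be applied to real-valued scores, for example the raw output of a network.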
