CRNN network TensorFlow implementation resource - CSDN Library

53 files in total (sample: 11, jpg: 9, py: 8)

crnn

tensor

Uploaded 2019-03-17 00:00:15 · 1.1MB ZIP


OCR_TF_CRNN_CTC.zip (53 files)

OCR_TF_CRNN_CTC

.git

info

exclude 240B

objects

pack

pack-c748914bdd364016b638e7c84f4f072baef1d8b8.idx 14KB

pack-c748914bdd364016b638e7c84f4f072baef1d8b8.pack 833KB

info

HEAD 23B

description 73B

packed-refs 46B

config 327B

index 2KB

refs

tags

remotes

origin

master 41B

heads

master 41B

COMMIT_EDITMSG 18B

hooks

commit-msg.sample 896B

pre-receive.sample 544B

fsmonitor-watchman.sample 3KB

pre-rebase.sample 5KB

prepare-commit-msg.sample 1KB

update.sample 4KB

pre-push.sample 1KB

pre-commit.sample 2KB

post-update.sample 189B

applypatch-msg.sample 478B

pre-applypatch.sample 424B

logs

HEAD 217B

refs

remotes

origin

master 169B

heads

master 217B

requirements.txt 61B

data

20180919202432.png 88KB

20180919022202.png 98KB

20180919202451.png 100KB

dowload_synth90k_and_create_tfrecord.sh 386B

create_synth90k_tfrecord.py 4KB

char_map

char_map1.json 780B

char_map.json 425B

crnn_model

__init__.py 1B

model.py 4KB

.DS_Store 8KB

tools

create_crnn_ctc_tfrecord.py 4KB

train_crnn_ctc.py 10KB

eval_crnn_ctc.py 8KB

inference_crnn_ctc.py 5KB

README.md 5KB

test_data

images

1_AFTERSHAVE_1509.jpg 2KB

3_REINFECTION_64188.jpg 2KB

5_Rousted_66822.jpg 1KB

2_LARIAT_43420.jpg 949B

9_HORSETRADING_36909.jpg 3KB

4_CONJUGATION_16114.jpg 3KB

8_Shortages_70419.jpg 2KB

6_Tangibility_77430.jpg 1KB

7_Commercializing_15217.jpg 1KB

.DS_Store 6KB

image_list.txt 208B

labelTest.py 213B

# OCR_TF_CRNN_CTC

This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC loss for image-based sequence recognition tasks such as scene text recognition and OCR.

"An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition": https://arxiv.org/abs/1507.05717

More details on CRNN and CTC loss (in Chinese): https://zhuanlan.zhihu.com/p/43534801

# Dependencies

The required dependencies are:

* tensorflow==1.8.0
* opencv-python
* numpy

Install them with:

```bash
pip install -r requirements.txt
```

Note: This software cannot run on TensorFlow r1.11.0 (the latest version at the time of writing) because that release modified the tf.contrib.rnn API.

# Run demo

Assume your current working directory is OCR_TF_CRNN_CTC:

```bash
cd path/to/your/OCR_TF_CRNN_CTC/
```

Download the pretrained model and extract it to your disk: [GoogleDrive](https://drive.google.com/file/d/1A3V7o3SKSiL3IHcTqc1jP4w58DuC8F9o/view?usp=sharing).

Add the current working directory to PYTHONPATH:

```bash
export PYTHONPATH=$PYTHONPATH:./
```

Run the inference demo:

```bash
python tools/inference_crnn_ctc.py \
  --image_dir ./test_data/images/ --image_list ./test_data/image_list.txt \
  --model_dir /path/to/your/bs_synth90k_model/
```

The result is:

```
Predict 1_AFTERSHAVE_1509.jpg image as: aftershave
```

![1_AFTERSHAVE_1509.jpg](https://github.com/bai-shang/CRNN_CTC_Tensorflow/blob/master/test_data/images/1_AFTERSHAVE_1509.jpg?raw=true)

```
Predict 2_LARIAT_43420.jpg image as: lariat
```

![2_LARIAT_43420](https://github.com/bai-shang/CRNN_CTC_Tensorflow/blob/master/test_data/images/2_LARIAT_43420.jpg?raw=true)

# Train a new model

### Data Preparation

* First, download the [Synth90k](http://www.robots.ox.ac.uk/~vgg/data/text/) dataset and extract it into a folder.
* Second, supply a txt file that lists each image's path relative to the image data directory, followed by its text label. For example, image_list.txt:

```bash
90kDICT32px/1/2/373_coley_14845.jpg coley
90kDICT32px/17/5/176_Nevadans_51437.jpg nevadans
```

* Then convert your dataset into TensorFlow records:

```bash
python tools/create_crnn_ctc_tfrecord.py \
  --image_dir path/to/90kDICT32px/ --anno_file path/to/image_list.txt --data_dir ./tfrecords/ \
  --validation_split_fraction 0.1
```

Note: make sure the images can be read from the paths you specified, such as:

```bash
path/to/90kDICT32px/1/2/373_coley_14845.jpg
path/to/90kDICT32px/17/5/176_Nevadans_51437.jpg
.......
```

All training images are scaled to height 32 and written to the tfrecord file. The dataset is divided into a training set and a validation set; change the `--validation_split_fraction` parameter to control the ratio between them.

#### Otherwise you can use the dowload_synth90k_and_create_tfrecord.sh script to create the tfrecord automatically:

```
cd ./data
sh dowload_synth90k_and_create_tfrecord.sh
```

### Train model

```bash
python tools/train_crnn_ctc.py --data_dir ./tfrecords/ --model_dir ./model/ --batch_size 32
```

After a number of iterations, the terminal output looks as follows:

![](https://github.com/bai-shang/CRNN_CTC_Tensorflow/blob/master/data/20180919022202.png?raw=true)

During my experiment the loss dropped as follows:

![](https://github.com/bai-shang/CRNN_CTC_Tensorflow/blob/master/data/20180919202432.png?raw=true)

### Evaluate model

```bash
python tools/eval_crnn_ctc.py --data_dir ./tfrecords/ --model_dir ./model/
```

To feed features into the recurrent layers, the image is first scaled to 32 × W × 3. After the CNN it becomes 1 × (W/4) × 512. For the LSTM, set T = W/4 and D = 512, and the features can then be fed into the LSTM. When preprocessing input images, it is therefore recommended to scale the height to 32 while keeping the aspect ratio, which preserves as much of the text detail in the image as possible. You can also scale input images to a fixed width, but this will degrade performance.

Character escaping: a double quote in the character map must be escaped, in the form `"\"" : value`.

You may see this warning: `tensorflow/core/util/ctc/ctc_loss_calculator.cc:144] No valid path found.` It turns out that ctc_loss requires the label lengths to be no longer than the input lengths. If a label is too long, the loss calculator cannot unroll completely and therefore cannot compute the loss. The input sequence length must be >= the label length, otherwise the CTC loss cannot be computed. In other words, the recognized string can be shorter than the input sequence but not longer.

To support additional English or Chinese symbols, add them to char_map/char_map.json in the format `"&" : 56`.
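
The shape and length rules above come down to a few lines of arithmetic, which can help when filtering samples before building tfrecords. The following is a minimal sketch and not code from this repository: the function names are hypothetical, the character map indices are illustrative, and it assumes the CNN reduces the width by a factor of 4 (T = W/4) as described above. It also applies a standard CTC property that the notes do not spell out: adjacent repeated characters each need one extra frame for the blank between them.

```python
import json


def resized_width(orig_h, orig_w, target_h=32):
    """Width after scaling the image to target_h while keeping its aspect ratio."""
    return max(1, round(orig_w * target_h / orig_h))


def sequence_length(width, downsample=4):
    """Time steps T seen by the LSTM, assuming the CNN reduces width by 4 (T = W/4)."""
    return width // downsample


def encode_label(text, char_map):
    """Map each character to its integer class; raises KeyError for unmapped characters."""
    return [char_map[ch] for ch in text]


def ctc_feasible(label, seq_len):
    """True if CTC can align `label` to `seq_len` frames.

    The label must not be longer than the input sequence. Adjacent repeated
    characters each need one extra frame for the blank that separates them.
    """
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats <= seq_len


# Hypothetical char map in the same JSON format as char_map/char_map.json.
# A double quote must be escaped as "\"" inside JSON.
char_map = json.loads(r'{"a": 0, "b": 1, "\"": 2, "&": 56}')

label = encode_label('a&b"', char_map)
wide_T = sequence_length(resized_width(64, 400))   # 64x400 image -> 32x200 -> T = 50
narrow_T = sequence_length(resized_width(32, 12))  # 32x12 image -> T = 3

print(label, ctc_feasible(label, wide_T), ctc_feasible(label, narrow_T))
```

Samples for which `ctc_feasible` returns False are the ones likely to trigger the "No valid path found" warning, so skipping them or using wider images is one way to avoid it.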
