Forked from [https://github.com/eragonruan/text-detection-ctpn.git](https://github.com/eragonruan/text-detection-ctpn.git).
This fork adds CPU support. To run on a GPU, clone the original repository instead: https://github.com/eragonruan/text-detection-ctpn.git
# text-detection-ctpn
Text detection based mainly on CTPN (Connectionist Text Proposal Network), implemented in TensorFlow. ID card detection is used as an example to demonstrate the results, but the model can be applied to almost any horizontal scene text detection task. The original paper can be found [here](https://arxiv.org/abs/1609.03605), and the original Caffe repo can be found [here](https://github.com/tianzhi0549/CTPN). This repo is built largely on the Faster R-CNN framework, so a lot of unused code remains; I'm still working on it. For more detail about the paper and code, see this [blog](http://slade-ruan.me/2017/10/22/text-detection-ctpn/).
***
# setup
- requirements: tensorflow 1.3, cython 0.24, opencv-python, easydict (installing Anaconda is recommended)
- build the library
```shell
cd lib/utils
chmod +x make.sh
./make.sh
```
***
# parameters
Some parameters may need to be modified to suit your setup. You can find them in ctpn/text.yml.
- USE_GPU_NMS # whether to use the NMS implemented in CUDA. If you do not have a GPU device, follow the [setup instructions](https://github.com/eragonruan/text-detection-ctpn/issues/43) for CPU.
- DETECT_MODE # H selects horizontal mode, O selects oriented mode; the default is H
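
To make clear what USE_GPU_NMS switches between, here is a minimal pure-Python sketch of non-maximum suppression (NMS). It is not the Cython or CUDA code in this repo; the function names, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / float(union) if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes to keep, highest score first."""
    # Visit boxes from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept one too much.
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```

The compiled versions do the same job on many more boxes, which is why the CUDA variant matters for training speed.
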
***
# demo
Put your images in data/demo, then run the demo from the repository root. The results will be saved in data/results.
```shell
python ./ctpn/demo.py
```
***
# training
## prepare data
- First, download the pre-trained VGG net model and save it as data/pretrain/VGG_imagenet.npy. You can download it from [google drive](https://drive.google.com/open?id=0B_WmJoEtfQhDRl82b1dJTjB2ZGc) or [baidu yun](https://pan.baidu.com/s/1kUNTl1l).
- Second, prepare the training data as described in the paper, download the data I prepared from the links above, or prepare your own data with the following steps.
- Modify the path and gt_path in prepare_training_data/split_label.py to match your dataset, then run
```shell
cd prepare_training_data
python split_label.py
```
- This generates the prepared data in the current folder. Then run
```shell
python ToVoc.py
```
- to convert the prepared training data into VOC format. This generates a folder named TEXTVOC. Move this folder to data/ and then run
```shell
cd ../data
ln -s TEXTVOC VOCdevkit2007
```
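
CTPN is trained on narrow fixed-width text proposals rather than whole text lines, so the label preparation step has to cut each ground-truth box into vertical strips. The following is a minimal sketch of that idea, assuming the 16-pixel stride from the paper and axis-aligned `(x_min, y_min, x_max, y_max)` boxes; `split_into_strips` is a hypothetical helper and may differ from what split_label.py actually does.

```python
def split_into_strips(x_min, y_min, x_max, y_max, stride=16):
    """Cut a horizontal text box into strips aligned to a stride-pixel grid.

    Strips at the two ends are clipped to the box, so they can be
    narrower than `stride`.
    """
    strips = []
    # First grid column that touches the box.
    x = (x_min // stride) * stride
    while x < x_max:
        left = max(x, x_min)
        right = min(x + stride, x_max)
        strips.append((left, y_min, right, y_max))
        x += stride
    return strips


for strip in split_into_strips(10, 5, 50, 25):
    print(strip)
```

Each strip keeps the full height of the original box; only the horizontal extent is divided.
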
## train
Simply run
```shell
python ./ctpn/train_net.py
```
- You can modify some hyperparameters in ctpn/text.yml, or just use the parameters I set.
- The model I provided in checkpoints was trained on a GTX 1070 for 50k iterations.
- If you are using CUDA NMS, training takes about 0.2 s per iteration, so 50k iterations take roughly 2.5 to 3 hours.
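
As a back-of-the-envelope check on the training time, the per-iteration figure can be multiplied out directly. This is only arithmetic on the numbers quoted above; `estimate_training_hours` is a hypothetical helper, and measured times will vary with hardware and data.

```python
def estimate_training_hours(seconds_per_iter, iterations):
    """Convert a per-iteration time into total wall-clock hours."""
    return seconds_per_iter * iterations / 3600.0


hours = estimate_training_hours(0.2, 50000)
print("%.1f hours" % hours)  # prints: 2.8 hours
```

Substitute your own measured seconds per iteration to estimate a run on different hardware, for example when using the CPU NMS path.
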
***
# roadmap
- [x] cython nms
- [x] cuda nms
- [x] python2/python3 compatibility
- [x] tensorflow1.3
- [x] delete useless code
- [x] loss function as referred in paper
- [x] oriented text connector
- [ ] side refinement
- [ ] model optimization
***
# some results
`NOTICE:` all the photos used below were collected from the internet. If any of them affects you, please contact me and I will delete it.
<img src="/data/results/002.jpg" width=320 height=240 /><img src="/data/results/003.jpg" width=320 height=240 />
<img src="/data/results/009.jpg" width=320 height=480 /><img src="/data/results/010.png" width=320 height=320 />
<img src="/data/results/IMG_0708.png" width=320 height=480 /><img src="/data/results/CgREFFmZV8uAde7ZAABbuGILFDY720.jpg" width=320 height=480 />
<img src="/data/results/car.jpg" width=320 height=480 /><img src="/data/results/CgREFFmX_VWAXK-sAAGOUUyUl5Q448.jpg" width=320 height=480 />
***
# comparison of horizontal and oriented text connector
- The oriented text connector has been implemented. It works, but still needs further improvement.
- The left figure is the result for DETECT_MODE H, the right figure for DETECT_MODE O.
<img src="/data/results/007.jpg" width=320 height=240 /><img src="/data/oriented_results/007.jpg" width=320 height=240 />
<img src="/data/results/008.jpg" width=320 height=480 /><img src="/data/oriented_results/008.jpg" width=320 height=480 />
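
For intuition about what a horizontal text connector does, here is a simplified sketch that chains narrow proposals into line boxes. It is not the connector shipped in this repo: `connect_horizontal`, the greedy grouping strategy, and the `max_gap` and `min_v_overlap` values are all assumptions for illustration, and the oriented mode is not covered.

```python
def connect_horizontal(strips, max_gap=50, min_v_overlap=0.7):
    """Group (x1, y1, x2, y2) strips into text lines; return one box per line."""

    def v_overlap(a, b):
        # Vertical overlap relative to the shorter of the two strips.
        inter = min(a[3], b[3]) - max(a[1], b[1])
        shorter = min(a[3] - a[1], b[3] - b[1])
        return inter / float(shorter) if shorter > 0 else 0.0

    lines = []
    # Walk the strips left to right and attach each to the first line
    # whose last strip is close enough and vertically aligned.
    for s in sorted(strips, key=lambda s: s[0]):
        for line in lines:
            last = line[-1]
            if s[0] - last[2] <= max_gap and v_overlap(last, s) >= min_v_overlap:
                line.append(s)
                break
        else:
            lines.append([s])

    # The line box is the bounding rectangle of its strips.
    return [
        (min(s[0] for s in line), min(s[1] for s in line),
         max(s[2] for s in line), max(s[3] for s in line))
        for line in lines
    ]


strips = [(0, 0, 16, 20), (16, 1, 32, 21), (32, 0, 48, 20),
          (200, 100, 216, 120), (216, 100, 232, 120)]
print(connect_horizontal(strips))
```

Because the result is always an axis-aligned rectangle, tilted text ends up with a loose box in this scheme, which is the case the oriented mode is meant to handle.
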
***