# text-detection-ctpn
Text detection mainly based on CTPN (connectionist text proposal network), implemented in tensorflow. I use ID card detection as an example to demonstrate the results, but note that this model can be used for almost any horizontal scene text detection task. The original paper can be found [here](https://arxiv.org/abs/1609.03605). The original repo in caffe can be found [here](https://github.com/tianzhi0549/CTPN). For more detail about the paper and code, see this [blog](http://slade-ruan.me/2017/10/22/text-detection-ctpn/).
***
# setup
- requirements: tensorflow 1.3, cython 0.24, opencv-python, easydict (installing Anaconda is recommended)
- if you do not have a gpu device, follow the [setup instructions here](https://github.com/eragonruan/text-detection-ctpn/issues/43)
- if you have a gpu device, build the library with
```shell
cd lib/utils
chmod +x make.sh
./make.sh
```
***
# parameters
There are some parameters you may need to modify according to your requirements. You can find them in ctpn/text.yml:
- USE_GPU_NMS # whether to use nms implemented in cuda or not
- DETECT_MODE # H represents horizontal mode, O represents oriented mode, default is H
- checkpoints_path # the model I provided is in checkpoints/; if you train the model yourself, it will be saved in output/
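For readers unfamiliar with what the nms step behind `USE_GPU_NMS` does, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is an illustration only, not the cython or cuda code shipped in lib/utils; the function names and the 0.3 overlap threshold are assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, thresh=0.3):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it too much.

    `thresh` is a hypothetical default for this sketch, not a value read
    from ctpn/text.yml. Returns the indices of the boxes that survive.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Keep only candidates that do not overlap the chosen box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

The second box overlaps the first heavily and has a lower score, so only the first and third indices are kept. The cython and cuda versions exist because this loop is slow in plain Python when there are many proposals.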
***
# demo
- put your images in data/demo, then run the demo from the root of the repo; the results will be saved in data/results
```shell
python ./ctpn/demo.py
```
***
# training
## prepare data
- First, download the pre-trained VGG net model and put it in data/pretrain/VGG_imagenet.npy. You can download it from [google drive](https://drive.google.com/open?id=0B_WmJoEtfQhDRl82b1dJTjB2ZGc) or [baidu yun](https://pan.baidu.com/s/1kUNTl1l).
- Second, prepare the training data as described in the paper, or download the data I prepared from the links above. Alternatively, prepare your own data with the following steps.
- Modify the path and gt_path in prepare_training_data/split_label.py according to your dataset, then run
```shell
cd prepare_training_data
python split_label.py
```
- This generates the prepared data in the current folder. Then run
```shell
python ToVoc.py
```
- This converts the prepared training data into voc format and generates a folder named TEXTVOC. Move this folder to data/ and then run
```shell
cd ../data
ln -s TEXTVOC VOCdevkit2007
```
## train
Simply run
```shell
python ./ctpn/train_net.py
```
- You can modify some hyperparameters in ctpn/text.yml, or just use the parameters I set.
- The model I provided in checkpoints/ was trained on a GTX1070 for 50k iterations.
- If you are using cuda nms, it takes about 0.2s per iteration, so it will take about 2.5 hours to finish 50k iterations.
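As a quick sanity check when planning your own run, total training time is the iteration count times the seconds per iteration. The helper below is a hypothetical convenience function, not part of the repo, and the per-iteration timings are the approximate figures quoted above rather than new measurements.

```python
def training_hours(iterations, seconds_per_iter):
    """Estimated wall-clock training time in hours."""
    return iterations * seconds_per_iter / 3600.0


# At exactly 0.2 s per iteration, 50k iterations take a bit under 3 hours.
at_020 = training_hours(50000, 0.2)
# A 2.5 hour run corresponds to about 0.18 s per iteration.
at_018 = training_hours(50000, 0.18)
print(round(at_020, 2), round(at_018, 2))
```

Both numbers are consistent with "about 0.2s per iteration"; plug in the timing you observe on your own hardware to get an estimate for your setup.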
***
# roadmap
- [x] cython nms
- [x] cuda nms
- [x] python2/python3 compatibility
- [x] tensorflow1.3
- [x] delete useless code
- [x] loss function as referred in paper
- [x] oriented text connector
- [x] BLSTM
- [ ] side refinement
***
# some results
`NOTICE:` all the photos used below were collected from the internet. If this affects you, please contact me and I will delete them.
<img src="data/results/001.jpg" width=320 height=240 /><img src="data/results/002.jpg" width=320 height=240 />
<img src="data/results/003.jpg" width=320 height=240 /><img src="data/results/004.jpg" width=320 height=240 />
<img src="data/results/009.jpg" width=320 height=480 /><img src="data/results/010.png" width=320 height=320 />
***
## oriented text connector
- The oriented text connector has been implemented. It works, but still needs further improvement.
- left figure is the result for DETECT_MODE H, right figure for DETECT_MODE O
<img src="data/results/007.jpg" width=320 height=240 /><img src="data/results/007_oriented.jpg" width=320 height=240 />
<img src="data/results/008.jpg" width=320 height=480 /><img src="data/results/008_oriented.jpg" width=320 height=480 />
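For context on what a text connector does, the sketch below chains narrow proposals into text lines and returns one axis-aligned box per line, which roughly corresponds to horizontal mode. It is a simplified greedy version, not the code in this repo: the function names are hypothetical, and the 50-pixel gap and 0.7 vertical-overlap thresholds follow the values described in the CTPN paper. The oriented mode is not shown here.

```python
def vertical_overlap(a, b):
    """Overlap of two boxes along y, divided by the smaller box height."""
    inter = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    return inter / float(min(a[3] - a[1], b[3] - b[1]))


def connect_horizontal(proposals, max_gap=50, min_v_overlap=0.7):
    """Greedily chain (x1, y1, x2, y2) proposals into text lines.

    A proposal joins an existing line if it starts within `max_gap` pixels
    of the line's last proposal and overlaps it vertically by at least
    `min_v_overlap`. Returns one bounding box per line.
    """
    lines = []
    for box in sorted(proposals, key=lambda b: b[0]):
        for line in lines:
            last = line[-1]
            if (box[0] - last[2] <= max_gap
                    and vertical_overlap(last, box) >= min_v_overlap):
                line.append(box)
                break
        else:
            lines.append([box])  # no line accepted the box, start a new one
    return [(min(b[0] for b in l), min(b[1] for b in l),
             max(b[2] for b in l), max(b[3] for b in l)) for l in lines]


proposals = [(0, 10, 16, 30), (16, 11, 32, 31), (32, 10, 48, 30),
             (200, 10, 216, 30), (0, 100, 16, 120)]
text_lines = connect_horizontal(proposals)
print(text_lines)
```

The first three proposals merge into one line, while the proposal far to the right and the one lower down each form their own line. Because the output is always an axis-aligned box, tilted text ends up with a loose box in this mode, which is the case the oriented connector is meant to handle.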
***
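# overview
End-to-end detection and recognition of variable-length Chinese text, implemented with Tensorflow and Keras.
- text detection: CTPN
- text recognition: DenseNet + CTC

# environment setup
- run `sh setup.sh`
- note: on a CPU-only machine, comment out the "for gpu" part and uncomment the "for cpu" part before running

# demo
- put the test images in the test_images directory; the detection results will be saved in test_result
- run `python demo.py`

# model training
## CTPN training
See ctpn/README.md.
## DenseNet + CTC training
1. Data preparation
- dataset: https://pan.baidu.com/s/1QkI7kjah8SPHwOQ40rS1Pw (password: lu7m)
- about 3.64 million images in total, split 99:1 into a training set and a validation set
- the data is generated randomly from a Chinese corpus (news + classical Chinese) by varying font, size, grayscale, blur, perspective, and stretch
- it covers 5990 characters in total: Chinese characters, English letters, digits, and punctuation
- each sample contains exactly 10 characters, cut at random from sentences in the corpus
- all images have a resolution of 280x32
- after extracting the images, put them in the train/images directory and put the description files in the train directory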