# text-detection-ctpn
Text detection based mainly on CTPN (Connectionist Text Proposal Network), implemented in TensorFlow. ID card detection is used as an example to demonstrate the results, but note that this model can be used in almost every horizontal scene text detection task. The original paper can be found [here](https://arxiv.org/abs/1609.03605), and the original Caffe repo can be found [here](https://github.com/tianzhi0549/CTPN). For more detail about the paper and code, see this [blog](http://slade-ruan.me/2017/10/22/text-detection-ctpn/).
***
# setup
- requirements: tensorflow 1.3, cython 0.24, opencv-python, easydict (installing Anaconda is recommended)
- if you do not have a GPU device, follow the instructions [here](https://github.com/eragonruan/text-detection-ctpn/issues/43) to set up
- if you have a GPU device, build the library with
```shell
cd lib/utils
chmod +x make.sh
./make.sh
```
***
# parameters
There are some parameters you may need to modify according to your requirements; you can find them in ctpn/text.yml
- USE_GPU_NMS # whether to use the nms implemented in cuda
- DETECT_MODE # H represents horizontal mode, O represents oriented mode; default is H
- checkpoints_path # the model I provided is in checkpoints/; if you train the model yourself, it will be saved in output/
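
As a rough illustration of how these three settings relate, the sketch below checks them once they have been loaded into a plain Python dict. The `validate_settings` helper and the flat dict layout are assumptions made for this example; the actual nesting of keys in ctpn/text.yml may differ.

```python
# Hypothetical helper, not part of this repo: sanity-check the three settings
# described above after they have been loaded into a plain (flat) dict.

VALID_DETECT_MODES = {"H", "O"}  # H = horizontal, O = oriented


def validate_settings(settings):
    """Return a checked copy of `settings`; raise ValueError on a bad value."""
    cleaned = dict(settings)

    # DETECT_MODE falls back to H, the default stated above.
    mode = cleaned.setdefault("DETECT_MODE", "H")
    if mode not in VALID_DETECT_MODES:
        raise ValueError("DETECT_MODE must be 'H' or 'O', got %r" % (mode,))

    # USE_GPU_NMS selects the cuda nms (True) or a non-cuda one (False).
    if not isinstance(cleaned.get("USE_GPU_NMS"), bool):
        raise ValueError("USE_GPU_NMS must be a boolean")

    # checkpoints_path points at checkpoints/ (provided model) or output/
    # (a model you trained yourself).
    path = cleaned.get("checkpoints_path")
    if not isinstance(path, str) or not path:
        raise ValueError("checkpoints_path must be a non-empty string")

    return cleaned


if __name__ == "__main__":
    print(validate_settings({"USE_GPU_NMS": False,
                             "checkpoints_path": "checkpoints/"}))
```

Failing early on a bad value is a design choice for this sketch: a typo in DETECT_MODE is reported immediately instead of surfacing later during detection.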
***
# demo
- put your images in data/demo and run the demo from the repository root; the results will be saved in data/results
```shell
python ./ctpn/demo.py
```
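
To make the input and output convention concrete, here is a small sketch of how each image in data/demo could map to a result path in data/results. The `plan_outputs` helper and the list of image extensions are assumptions for illustration; ctpn/demo.py may select files and name its outputs differently.

```python
import os

# Assumed set of image extensions; adjust to match your own files.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def plan_outputs(file_names, input_dir="data/demo", output_dir="data/results"):
    """Pair each image file name with the path its result would be saved to."""
    pairs = []
    for name in sorted(file_names):
        # Skip anything that does not look like an image.
        if name.lower().endswith(IMAGE_EXTENSIONS):
            pairs.append((os.path.join(input_dir, name),
                          os.path.join(output_dir, name)))
    return pairs


if __name__ == "__main__":
    demo_dir = "data/demo"
    names = os.listdir(demo_dir) if os.path.isdir(demo_dir) else []
    for src, dst in plan_outputs(names):
        print(src, "->", dst)
```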
***
# training
## prepare data
- First, download the pre-trained VGG net model and put it in data/pretrain/VGG_imagenet.npy. You can download it from [google drive](https://drive.google.com/open?id=0B_WmJoEtfQhDRl82b1dJTjB2ZGc) or [baidu yun](https://pan.baidu.com/s/1kUNTl1l).
- Second, prepare the training data as described in the paper, or download the data I prepared from the links above. Alternatively, prepare your own data with the following steps.
- Modify the path and gt_path in prepare_training_data/split_label.py according to your dataset, then run
```shell
cd prepare_training_data
python split_label.py
```
- this generates the prepared data in the current folder; then run
```shell
python ToVoc.py
```
- this converts the prepared training data into VOC format and generates a folder named TEXTVOC. Move this folder to data/ and then run
```shell
cd ../data
ln -s TEXTVOC VOCdevkit2007
```
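
CTPN works on narrow, fixed-width proposals rather than whole text lines, so the ground-truth text boxes are cut into vertical strips during data preparation. The sketch below illustrates that idea, assuming axis-aligned boxes and a 16-pixel strip width (the proposal width used in the CTPN paper). `split_box` is a hypothetical helper written for this example, not the code in split_label.py.

```python
def split_box(xmin, ymin, xmax, ymax, step=16):
    """Cut an axis-aligned box into fixed-width vertical strips.

    Strips are aligned to a grid of `step` pixels and all have the same
    width, so the first and last strip may extend slightly past the box.
    Returns a list of (xmin, ymin, xmax, ymax) tuples, left to right.
    """
    if xmax <= xmin or ymax <= ymin:
        raise ValueError("box must have positive width and height")

    strips = []
    # Start at the grid line at or to the left of xmin.
    x = (xmin // step) * step
    while x < xmax:
        strips.append((x, ymin, x + step, ymax))
        x += step
    return strips


if __name__ == "__main__":
    for strip in split_box(10, 5, 50, 25):
        print(strip)
```

Keeping every strip on the same grid is a choice made for this sketch; an alternative would be to clip the outermost strips to the box edges.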
## train
Simply run
```shell
python ./ctpn/train_net.py
```
- you can modify some hyperparameters in ctpn/text.yml, or just use the parameters I set.
- The model I provided in checkpoints was trained on a GTX 1070 for 50k iterations.
- If you are using cuda nms, it takes about 0.2s per iteration, so 50k iterations take about 2.5 hours.
***
# roadmap
- [x] cython nms
- [x] cuda nms
- [x] python2/python3 compatibility
- [x] tensorflow1.3
- [x] delete useless code
- [x] loss function as referred in paper
- [x] oriented text connector
- [x] BLSTM
- [ ] side refinement
***
# some results
`NOTICE:` all the photos used below were collected from the internet. If this affects you, please contact me and I will delete them.
<img src="data/oriented_results/001.jpg" width=320 height=240 /><img src="data/oriented_results/002.jpg" width=320 height=240 />
<img src="data/oriented_results/003.jpg" width=320 height=240 /><img src="data/oriented_results/004.jpg" width=320 height=240 />
<img src="data/oriented_results/009.jpg" width=320 height=480 /><img src="data/oriented_results/010.png" width=320 height=320 />
***
## oriented text connector
- the oriented text connector has been implemented; it is working, but still needs further improvement.
- the left figure is the result for DETECT_MODE H, the right figure for DETECT_MODE O
<img src="data/oriented_results/007.jpg" width=320 height=240 /><img src="data/oriented_results/007.jpg" width=320 height=240 />
<img src="data/oriented_results/008.jpg" width=320 height=480 /><img src="data/oriented_results/008.jpg" width=320 height=480 />
***