# EAST: An Efficient and Accurate Scene Text Detector
### Introduction
This is a TensorFlow implementation of EAST. I reimplemented only the RBOX part of the paper, which achieves an F1 score
of 80.8 on the ICDAR 2015 dataset, about two points better than the PVANet result reported in the paper (see http://rrc.cvc.uab.es/?ch=4&com=evaluation&task=1). The running time is about 150 ms (network) + 300 ms (NMS) per image on a K40 card. The NMS step is slow because it uses shapely in Python, and it can be improved further.
Thanks to the author ([@zxytim](https://github.com/zxytim)) for his help!
Please cite his [paper](https://arxiv.org/abs/1704.03155v2) if you find this useful.
### Contents
1. [Installation](#installation)
2. [Download](#download)
3. [Train](#train)
4. [Test](#test)
5. [Examples](#examples)
### Installation
1. I think any TensorFlow version > 1.0 should be OK.
### Download
1. Models trained on ICDAR 2013 (training set) + ICDAR 2015 (training set): [BaiduYun link](http://pan.baidu.com/s/1jHWDrYQ)
2. ResNet V1 50 provided by TensorFlow-Slim: [slim resnet v1 50](http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz)
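
As a minimal sketch, the slim ResNet archive can be fetched and unpacked with the Python standard library. The helper names `extract_archive` and `fetch_and_extract` and the `/tmp` destination are assumptions, chosen to line up with the `--pretrained_model_path=/tmp/resnet_v1_50.ckpt` flag in the training command; they are not part of this repository.

```python
import os
import tarfile
import urllib.request

RESNET_URL = "http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz"


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the member names."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names


def fetch_and_extract(url=RESNET_URL, dest_dir="/tmp"):
    """Download the archive at url into dest_dir (hypothetical default), then unpack it."""
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive_path)
    return extract_archive(archive_path, dest_dir)
```

Calling `fetch_and_extract()` with the defaults should leave the checkpoint under `/tmp`, ready to pass as `--pretrained_model_path`.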
### Train
To train the model, provide the dataset path. Inside that path, each image must have its own ground-truth (gt) text file.
Then run:
```
python multigpu_train.py --gpu_list=0 --input_size=512 --batch_size=14 --checkpoint_path=/tmp/east_icdar2015_resnet_v1_50_rbox/ \
--text_scale=512 --training_data_path=/data/ocr/icdar2015/ --geometry=RBOX --learning_rate=0.0001 --num_readers=24 \
--pretrained_model_path=/tmp/resnet_v1_50.ckpt
```
If you have more than one GPU, you can pass their ids to `--gpu_list` as a comma-separated list (for example `--gpu_list=0,1`).
**Note: rename the ICDAR 2015 gt text files from gt_img_\*.txt to img_\*.txt (or change the code in icdar.py), and remove some extra characters from the files.**
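
The renaming described in the note can be scripted. The sketch below assumes that the "extra characters" are a UTF-8 byte order mark at the start of each gt file; that is a guess about the ICDAR 2015 files, so inspect a file first. The function name `prepare_icdar2015_gt` is hypothetical and not part of this repository.

```python
import codecs
import glob
import os


def prepare_icdar2015_gt(data_dir):
    """Rename gt_img_*.txt to img_*.txt and drop a leading UTF-8 BOM, if present."""
    renamed = []
    for src in sorted(glob.glob(os.path.join(data_dir, "gt_img_*.txt"))):
        with open(src, "rb") as f:
            raw = f.read()
        # Assumption: the extra characters are the UTF-8 byte order mark.
        if raw.startswith(codecs.BOM_UTF8):
            raw = raw[len(codecs.BOM_UTF8):]
        # Strip the "gt_" prefix from the file name.
        dst = os.path.join(data_dir, os.path.basename(src)[len("gt_"):])
        with open(dst, "wb") as f:
            f.write(raw)
        os.remove(src)
        renamed.append(dst)
    return renamed
```

For the training command above, this would be called as `prepare_icdar2015_gt("/data/ocr/icdar2015/")`. It modifies files in place, so work on a copy of the dataset.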
### Test
Run:
```
python eval.py --test_data_path=/tmp/images/ --gpu_list=0 --checkpoint_path=/tmp/east_icdar2015_resnet_v1_50_rbox/ \
--output_path=/tmp/
```
A text file will then be written to the output path.
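
To post-process the detections, the output file can be read back into Python. This sketch assumes each line holds the eight comma-separated corner coordinates of one box (`x1,y1,x2,y2,x3,y3,x4,y4`). This README does not document the format, so check a real output file before relying on it. `read_boxes` is a hypothetical helper.

```python
def read_boxes(path):
    """Parse a detection file into a list of quadrilaterals [(x, y), ...]."""
    boxes = []
    with open(path, "r") as f:
        for line in f:
            fields = line.strip().split(",")
            if len(fields) < 8:
                continue  # skip blank or malformed lines
            # Assumption: the first eight fields are corner coordinates.
            coords = [float(v) for v in fields[:8]]
            boxes.append(list(zip(coords[0::2], coords[1::2])))
    return boxes
```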
### Examples
Here are some test examples on ICDAR 2015. Enjoy the beautiful text boxes!
![image_1](Examples/img_2.jpg)
![image_2](Examples/img_10.jpg)
![image_3](Examples/img_14.jpg)
![image_4](Examples/img_26.jpg)
![image_5](Examples/img_75.jpg)
Please let me know if you encounter any issues (my email: boostczc@gmail dot com).