# AdvancedEAST
AdvancedEAST is an algorithm for scene text detection.
It is based primarily on
[EAST: An Efficient and Accurate Scene Text Detector](https://arxiv.org/abs/1704.03155v2),
with significant improvements
that make long-text predictions more accurate.
If this project is helpful to you, a star is welcome.
If you run into any problems, please contact me.
* email: yijie.huo@foxmail.com
* website: [https://huoyijie.cn](https://huoyijie.cn)
# advantages
* written in Keras, easy to read and run
* based on EAST, an advanced text detection algorithm
* easy to train the model
* significant improvements make long-text predictions more accurate (please
see the 'demo results' section below,
and pay attention to the activation image,
which starts with yellow grids and ends with green grids).

In my experiments,
AdvancedEast achieved much better prediction accuracy than East,
especially on long text. East computes the final vertex coordinates as the
weighted mean of the vertex coordinates predicted by all pixels, so it is very
difficult for it to predict the 2 vertices on the far side of the quadrangle.
See the East limitations below, excerpted from the original paper.
![East limitations](image/East.limitations.png "East limitations")
# project files
* config file: cfg.py, control parameters
* pre-process data: preprocess.py, resize images
* label data: label.py, produce label info
* define network: network.py
* define loss function: losses.py
* execute training: advanced_east.py and data_generator.py
* predict: predict.py and nms.py

**For a description of the post-processing procedure, see
[Post-processing (with diagrams)](https://huoyijie.cn/blog/82c8e470-7562-11ea-98d3-6d733527e90f/play)**
# network arch
* AdvancedEast
![AdvancedEast network arch](image/AdvancedEast.network.png "AdvancedEast network arch")
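**Network output description:
The output layer has three parts: a 1-channel score map, indicating whether a pixel lies inside a text box; a 2-channel vertex code, indicating whether a pixel is a boundary pixel of the text box and whether it belongs to the head or the tail; and a 4-channel geo output, holding the coordinates of the 2 vertices that a boundary pixel can predict. All pixels together form the shape of the text box, and only the boundary pixels are used to regress the vertex coordinates. Boundary pixels are defined as all pixels inside the yellow and green boxes. The weighted average of the predictions of all boundary pixels is used to predict the two vertices at the ends of the head or tail short edge. The head and tail boundary pixels each predict 2 vertices, which gives 4 vertex coordinates in total.**
[Introduction to the principle (with diagrams)](https://huoyijie.cn/blog/9a37ea00-755f-11ea-98d3-6d733527e90f/play)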
* East
![East network arch](image/East.network.png "East network arch")
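
The weighted-average step in the network output description can be sketched roughly as follows. This is a minimal illustration, not the code in nms.py: the function name `side_vertices`, the channel order, the `pixel_size` and `threshold` parameters, the reading of geo as offsets from the pixel centre, and the choice of boundary confidence as the weight are all assumptions made for the example.

```python
import numpy as np


def side_vertices(pred, side=0, pixel_size=4, threshold=0.5):
    """Estimate the 2 vertices of one short edge (side 0 = head, 1 = tail).

    Assumed layout of `pred`, shape (H, W, 7), with probabilities in [0, 1]:
      [..., 0]   score map: is the pixel inside a text box?
      [..., 1]   vertex code: is the pixel a boundary pixel?
      [..., 2]   vertex code: head (near 0) or tail (near 1)?
      [..., 3:7] geo: (dx1, dy1, dx2, dy2) offsets from the pixel centre
    """
    inside = pred[..., 0] >= threshold
    boundary = pred[..., 1] >= threshold
    on_side = np.round(pred[..., 2]).astype(int) == side
    rows, cols = np.nonzero(inside & boundary & on_side)
    if rows.size == 0:
        return None  # no boundary pixel voted for this side

    # Pixel centres in input-image coordinates, as (x, y).
    centres = np.stack([(cols + 0.5) * pixel_size,
                        (rows + 0.5) * pixel_size], axis=1)   # (N, 2)
    offsets = pred[rows, cols, 3:7].reshape(-1, 2, 2)          # (N, 2, 2)
    votes = centres[:, None, :] + offsets                      # (N, 2, 2)

    # Weighted mean over all voting pixels, weighted by boundary confidence.
    weights = pred[rows, cols, 1]
    return np.average(votes, axis=0, weights=weights)          # (2, 2)
```

Calling the function once with `side=0` and once with `side=1` yields the two head vertices and the two tail vertices, which together form the 4 corners of the quadrangle. Because only pixels near each short edge vote for that edge, no pixel has to regress a vertex on the far side of a long text line.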
# setup
* python 3.6.3+
* tensorflow-gpu 1.5.0+(or tensorflow 1.5.0+)
* keras 2.1.4+
* numpy 1.14.1+
* tqdm 4.19.7+
# training
* download the Tianchi ICPR dataset
Link: https://pan.baidu.com/s/1NSyc-cHKV3IwDo6qojIrKA Password: ye9y
* prepare training data: make a data root dir (icpr),
copy the images to the root dir, and copy the txt files to the root dir;
for data format details, refer to 'ICPR MTWI 2018 Challenge 2: Text Detection in Web Images' (ICPR MTWI 2018 挑战赛二:网络图像的文本检测),
[Link](https://tianchi.aliyun.com/competition/introduction.htm?spm=5176.100066.0.0.3bcad780oQ9Ce4&raceId=231651)
* modify the config params in cfg.py; see the default values.
* python preprocess.py, resizes images to 256x256, 384x384, 512x512, 640x640, 736x736;
training at each size separately can speed up the training process.
* python label.py, produces the label info
* python advanced_east.py, the training entry point
* python predict.py -p demo/001.png, runs prediction
* download the pretrained model (for testing)
Link: https://pan.baidu.com/s/1KO7tR_MW767ggmbTjIJpuQ Password: kpm2
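
When images are resized to the square training sizes listed above, the quadrilateral labels have to be rescaled by the same factors. The sketch below illustrates that arithmetic only. It assumes a plain stretch to the target size; the helper `scale_quads` and the constant `TRAIN_SIZES` are hypothetical names, and preprocess.py may handle aspect ratio and label files differently.

```python
import numpy as np

# The square sizes named in the training steps.
TRAIN_SIZES = [256, 384, 512, 640, 736]


def scale_quads(quads, orig_size, target_size):
    """Rescale quadrilateral labels after an image is resized.

    quads:       array-like of shape (N, 4, 2) with (x, y) vertices in pixels
    orig_size:   (width, height) of the original image
    target_size: (width, height) after resizing, e.g. (256, 256)
    """
    quads = np.asarray(quads, dtype=float)
    # Independent x and y factors, because the image is stretched, not padded.
    scale = np.array([target_size[0] / orig_size[0],
                      target_size[1] / orig_size[1]])
    return quads * scale


if __name__ == "__main__":
    box = [[[0, 0], [100, 0], [100, 50], [0, 50]]]
    for size in TRAIN_SIZES:
        print(size, scale_quads(box, (200, 100), (size, size))[0].tolist())
```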
# demo results
![001 original image](demo/001.png "001 original image")
![001 activation map](demo/001.png_act.jpg "001 activation map")
![001 prediction](demo/001.png_predict.jpg "001 prediction")
![004 original image](demo/004.jpg "004 original image")
![004 activation map](demo/004.jpg_act.jpg "004 activation map")
![004 prediction](demo/004.jpg_predict.jpg "004 prediction")
![005 original image](demo/005.png "005 original image")
![005 activation map](demo/005.png_act.jpg "005 activation map")
![005 prediction](demo/005.png_predict.jpg "005 prediction")
* compared with East based on VGG16
As you can see, although East predicts the text area very accurately, its vertex coordinates are not accurate enough.
![001 activation map](demo/001.png_act_east.jpg "001 activation map")
![001 prediction](demo/001.png_predict_east.jpg "001 prediction")
# License
The code is released under the MIT License.
# references
* [EAST: An Efficient and Accurate Scene Text Detector](https://arxiv.org/abs/1704.03155v2)
* [CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network](https://arxiv.org/abs/1609.03605)
* [Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection](https://arxiv.org/abs/1703.01425)
[A Simple RaspberryPi Car Project](https://github.com/huoyijie/raspberrypi-car)