## Usage Guide
1. First, generate the training and validation data:
```
python3 create_lmdb_dataset.py --inputPath data/ --gtFile data/labelfile --outputPath result/
```
The generated data files use the `.mdb` extension.
2. Run the following command to train:
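For reference, a minimal sketch of a compatible label file, assuming the tab-separated `imagepath\tlabel` format (one sample per line) that `create_lmdb_dataset.py` in the upstream deep-text-recognition-benchmark repo reads; the paths and labels below are illustrative placeholders:

```python
import os

# Illustrative samples; the paths are placeholders and would normally
# point at real image files under data/.
samples = [
    ("data/images/word_001.png", "available"),
    ("data/images/word_002.png", "london"),
]

os.makedirs("data", exist_ok=True)
with open("data/labelfile", "w", encoding="utf-8") as f:
    for path, label in samples:
        # One "imagepath<TAB>label" entry per line.
        f.write(f"{path}\t{label}\n")
```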
```
python3 train.py --train_data datapath --valid_data datapath --saved_model modelpath
```
`datapath` is the path to the data generated in step 1; `--saved_model` can be used to specify a checkpoint to resume from.
Further arguments can be added as needed; their meanings are documented below.
3. Evaluate the training results:
```
python3 demo.py --image_folder imgpath --saved_model modelpath --labelFilePath path
```
`labelFilePath` specifies the annotation file for the images to predict; it must be set if you want to compute statistics on the predictions or visualize the results.
4. Parameters set directly in the code (just change their `default` values):
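As an illustration of what such statistics involve, here is a hypothetical accuracy computation over `imagepath\tlabel` ground-truth lines and a dict of predictions. The helper and file format are assumptions for this sketch, not the actual logic in demo.py:

```python
def accuracy_from_labels(label_lines, predictions):
    """Fraction of case-insensitive exact matches.

    label_lines: iterable of "path<TAB>label" strings.
    predictions: dict mapping image path -> predicted text.
    """
    correct = total = 0
    for line in label_lines:
        path, gt = line.rstrip("\n").split("\t")
        total += 1
        if predictions.get(path, "").lower() == gt.lower():
            correct += 1
    return correct / total if total else 0.0

gt = ["img/1.png\tavailable", "img/2.png\tlondon"]
preds = {"img/1.png": "Available", "img/2.png": "londen"}
print(accuracy_from_labels(gt, preds))  # 0.5: one exact match out of two
```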
```
train.py:
num_iter: number of training iterations
valInterval: number of iterations between validation runs
workers: number of parallel data-loading processes; setting this to a non-zero value may cause errors
demo.py:
visual: visualize the prediction results
```
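These options follow the standard argparse pattern, so changing a `default=` value has the same effect as passing the flag on the command line. A minimal sketch (option names mirror the list above; the default values here are illustrative, and the rest of the parser is elided):

```python
import argparse

parser = argparse.ArgumentParser()
# Editing default= here changes the behavior when the flag is omitted.
parser.add_argument("--num_iter", type=int, default=300000,
                    help="number of training iterations")
parser.add_argument("--valInterval", type=int, default=2000,
                    help="iterations between validation runs")
parser.add_argument("--workers", type=int, default=0,
                    help="data-loading processes; non-zero values may error here")

opt = parser.parse_args([])           # no CLI flags given: defaults apply
print(opt.num_iter, opt.valInterval)  # 300000 2000
```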
5. Model download
Link: https://pan.baidu.com/s/1YDRVt82JtIQCfXNHxM4BZw
Extraction code: z81d
# What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis
| [paper](https://arxiv.org/abs/1904.01906) | [training and evaluation data](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here) | [failure cases and cleansed label](https://github.com/clovaai/deep-text-recognition-benchmark#download-failure-cases-and-cleansed-label-from-here) | [pretrained model](https://drive.google.com/drive/folders/15WPsuPJDCzhp2SvYZLRj8mAlT3zmoAMW) | [Baidu ver(passwd:rryk)](https://pan.baidu.com/s/1KSNLv4EY3zFWHpBYlpFCBQ) |
Official PyTorch implementation of our four-stage STR framework, which most existing STR models fit into. <br>
Using this framework allows the module-wise contribution to performance to be analyzed in terms of accuracy, speed, and memory demand, under one consistent set of training and evaluation datasets. <br>
Such analysis removes the obstacles that current comparisons pose to understanding the performance gain of existing modules. <br><br>
<img src="./figures/trade-off.png" width="1000" title="trade-off">
## Honors
Based on this framework, we took 1st place in [ICDAR2013 focused scene text](https://rrc.cvc.uab.es/?ch=2&com=evaluation&task=3) and [ICDAR2019 ArT](https://rrc.cvc.uab.es/files/ICDAR2019-ArT.pdf), and 3rd place in [ICDAR2017 COCO-Text](https://rrc.cvc.uab.es/?ch=5&com=evaluation&task=2) and [ICDAR2019 ReCTS (task1)](https://rrc.cvc.uab.es/files/ICDAR2019-ReCTS.pdf). <br>
The differences between our paper and the ICDAR challenge entries are summarized [here](https://github.com/clovaai/deep-text-recognition-benchmark/issues/13).
## Updates
**Dec 27, 2019**: added [FLOPS](https://github.com/clovaai/deep-text-recognition-benchmark/issues/125) in our paper, and minor updates such as log_dataset.txt and [ICDAR2019-NormalizedED](https://github.com/clovaai/deep-text-recognition-benchmark/blob/86451088248e0490ff8b5f74d33f7d014f6c249a/test.py#L139-L165). <br>
**Oct 22, 2019**: added [confidence score](https://github.com/clovaai/deep-text-recognition-benchmark/issues/82), and arranged the output form of training logs. <br>
**Jul 31, 2019**: The paper is accepted at International Conference on Computer Vision (ICCV), Seoul 2019, as an oral talk. <br>
**Jul 25, 2019**: For the code for floating-point 16 (fp16) calculation, check [@YacobBY's](https://github.com/YacobBY) [pull request](https://github.com/clovaai/deep-text-recognition-benchmark/pull/36) <br>
**Jul 16, 2019**: added [ST_spe.zip](https://drive.google.com/drive/folders/192UfE9agQUMNq6AgU3_E05_FcPZK4hyt) dataset, word images contain special characters in SynthText (ST) dataset, see [this issue](https://github.com/clovaai/deep-text-recognition-benchmark/issues/7#issuecomment-511727025) <br>
**Jun 24, 2019**: added gt.txt of failure cases that contains path and label of each image, see [image_release_190624.zip](https://drive.google.com/open?id=1VAP9l5GL5fgptgKDLio_h3nMe7X9W0Mf) <br>
**May 17, 2019**: uploaded resources in Baidu Netdisk also, added [Run demo](https://github.com/clovaai/deep-text-recognition-benchmark#run-demo-with-pretrained-model). (check [@sharavsambuu's](https://github.com/sharavsambuu) [colab demo also](https://colab.research.google.com/drive/1PHnc_QYyf9b1_KJ1r15wYXaOXkdm1Mrk)) <br>
**May 9, 2019**: PyTorch version updated from 1.0.1 to 1.1.0, use torch.nn.CTCLoss instead of torch-baidu-ctc, and various minor updates.
## Getting Started
### Dependency
- This work was tested with PyTorch 1.3.1, CUDA 10.1, python 3.6 and Ubuntu 16.04. <br> You may need `pip3 install torch==1.3.1`. <br>
In the paper, experiments were performed with **PyTorch 0.4.1, CUDA 9.0**.
- requirements : lmdb, pillow, torchvision, nltk, natsort
```
pip3 install lmdb pillow torchvision nltk natsort
```
### Download lmdb dataset for training and evaluation from [here](https://pan.baidu.com/s/1IJhghbBrQwcU3Cx2iXv7zw) (extraction code: mlnc)
data_lmdb_release.zip contains the following. <br>
training datasets : [MJSynth (MJ)](http://www.robots.ox.ac.uk/~vgg/data/text/)[1] and [SynthText (ST)](http://www.robots.ox.ac.uk/~vgg/data/scenetext/)[2] \
validation datasets : the union of the training sets [IC13](http://rrc.cvc.uab.es/?ch=2)[3], [IC15](http://rrc.cvc.uab.es/?ch=4)[4], [IIIT](http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K.html)[5], and [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset)[6].\
evaluation datasets : benchmark evaluation datasets, consist of [IIIT](http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K.html)[5], [SVT](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset)[6], [IC03](http://www.iapr-tc11.org/mediawiki/index.php/ICDAR_2003_Robust_Reading_Competitions)[7], [IC13](http://rrc.cvc.uab.es/?ch=2)[3], [IC15](http://rrc.cvc.uab.es/?ch=4)[4], [SVTP](http://openaccess.thecvf.com/content_iccv_2013/papers/Phan_Recognizing_Text_with_2013_ICCV_paper.pdf)[8], and [CUTE](http://cs-chan.com/downloads_CUTE80_dataset.html)[9].
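The generated lmdb files store per-sample keys of the form `image-%09d` / `label-%09d` (1-indexed) plus a `num-samples` entry, following the upstream repo's convention. A sketch of iterating that layout, using a plain dict as a stand-in for the opened lmdb environment so the key arithmetic is visible without the `lmdb` dependency:

```python
# Stand-in for an opened lmdb transaction: byte keys -> byte values,
# mirroring the layout written by create_lmdb_dataset.py (assumed convention).
db = {
    b"num-samples": b"2",
    b"image-000000001": b"<png bytes>",
    b"label-000000001": b"available",
    b"image-000000002": b"<png bytes>",
    b"label-000000002": b"london",
}

n = int(db[b"num-samples"])
labels = []
for i in range(1, n + 1):                 # sample keys are 1-indexed
    label_key = f"label-{i:09d}".encode() # e.g. b"label-000000001"
    labels.append(db[label_key].decode())
print(labels)  # ['available', 'london']
```

With a real dataset, the dict lookups would become `txn.get(key)` calls on an `lmdb` read transaction.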
### Run demo with pretrained model
1. Download pretrained model from [here](https://drive.google.com/drive/folders/15WPsuPJDCzhp2SvYZLRj8mAlT3zmoAMW)
2. Add image files to test into `demo_image/`
3. Run demo.py (add the `--sensitive` option if you use a case-sensitive model)
```
CUDA_VISIBLE_DEVICES=0 python3 demo.py --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --image_folder demo_image/ --saved_model TPS-ResNet-BiLSTM-Attn.pth
```
#### prediction results
| demo images | [TPS-ResNet-BiLSTM-Attn](https://drive.google.com/open?id=1b59rXuGGmKne1AuHnkgDzoYgKeETNMv9) | [TPS-ResNet-BiLSTM-Attn (case-sensitive)](https://drive.google.com/open?id=1ajONZOgiG9pEYsQ-eBmgkVbMDuHgPCaY) |
| --- | --- | --- |
| <img src="./demo_image/demo_1.png" width="300"> | available | Available |
| <img src="./demo_image/demo_2.jpg" width="300"> | shakeshack | SHARESHACK |
| <img src="./demo_image/demo_3.png" width="300"> | london | Londen |
| <img src="./demo_image/demo_4.png" width="300"> | greenstead | Greenstead |
| <img src="./demo_image/demo_5.png" width="300" height="100"> | toast | TOAST |
| <img src="./demo_image/demo_6.png" width="300" height="100"> | merry | MERRY |
| <img src="./demo_image/demo_7.png" width="300"> | underground | underground |
| <img src="./demo_image/demo_8.jpg" width="300"> | ronaldo | RONALDO |
| <img src="./demo_image/demo_9.jpg" width="300" height="100"> | bally | BALLY |
| <img src="./demo_image/demo_10.jpg" width="300" height="100"> | university | UNIVERSITY |
### Training and evaluation
1. Train CRNN[10] model
```
CUDA_VISIBLE_DEVICES=0 python3 train.py \
--train_data data_lmdb_release/training --valid_data data_lmdb_release/validation \
--select_data MJ-ST --batch_ratio 0.5-0.5 \
--Transformation None --FeatureExtraction VGG --SequenceModeling BiLSTM --Prediction CTC
```