A Handwritten Text Recognition (HTR) system implemented in PyTorch and trained on the Bentham, IAM, Rimes, Saint Gall, and Washington offline HTR datasets. The neural network model recognizes the text contained in images of segmented text lines.
Data pre-processing is based entirely on this awesome repository for [handwritten text recognition](https://github.com/arthurflor23/handwritten-text-recognition).
Data partitioning (train, validation, test) was performed by following the methodology of each dataset.
The model is built on the transformer architecture.
Recently, Facebook Research released a [paper](https://github.com/facebookresearch/detr) in which they used a transformer for object detection. I made a few changes to their model so that it could be used for text recognition.
## Tutorial (Google Colab/Drive)
A Jupyter Notebook is available as a demo; check out the **[tutorial](https://colab.research.google.com/drive/1rCPaksWk7SAH4crOVYVzUaWsKbz2i3jE?authuser=1#scrollTo=rQew0_CkacDU)** on Google Colab/Drive.
## Datasets supported
a. [Bentham](http://transcriptorium.eu/datasets/bentham-collection/)
b. [IAM](http://www.fki.inf.unibe.ch/databases/iam-handwriting-database)
c. [Rimes](http://www.a2ialab.com/doku.php?id=rimes_database:start)
d. [Saint Gall](https://fki.tic.heia-fr.ch/databases/saint-gall-database)
e. [Washington](https://fki.tic.heia-fr.ch/databases/washington-database)
## Requirements
- Python 3.6
- OpenCV 4.x
- editdistance
- PyTorch 1.5
## Command line arguments
- `--source`: dataset/model name (bentham, iam, rimes, saintgall, washington)
- `--transform`: transform dataset to the HDF5 file
- `--image`: prediction on a single image with the source parameter
- `--train`: train model using the source argument
- `--test`: evaluate and predict model using the source argument
- `--norm_accentuation`: discard accentuation marks in the evaluation
- `--norm_punctuation`: discard punctuation marks in the evaluation
- `--epochs`: number of epochs
- `--batch_size`: number of samples in each batch
- `--lr`: learning rate
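To make the flag list concrete, here is a minimal `argparse` sketch of how such a command line could be declared. It is an illustration, not the project's actual `main.py`: the flag names come from the list above, while the types, the default values, the `choices` restriction, and the helper name `build_parser` are assumptions.

```python
import argparse

# Dataset names taken from the `--source` description above.
DATASETS = ("bentham", "iam", "rimes", "saintgall", "washington")


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser; types and defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Transformer HTR (CLI sketch)")
    parser.add_argument("--source", type=str, choices=DATASETS, required=True,
                        help="dataset/model name")
    # Assumed to be boolean switches that select the action to run.
    parser.add_argument("--transform", action="store_true",
                        help="transform dataset to the HDF5 file")
    parser.add_argument("--train", action="store_true",
                        help="train model using the source argument")
    parser.add_argument("--test", action="store_true",
                        help="evaluate and predict using the source argument")
    # Assumed to take a path to a single text-line image.
    parser.add_argument("--image", type=str, default="",
                        help="prediction on a single image")
    parser.add_argument("--norm_accentuation", action="store_true",
                        help="discard accentuation marks in the evaluation")
    parser.add_argument("--norm_punctuation", action="store_true",
                        help="discard punctuation marks in the evaluation")
    # Hypothetical defaults, chosen only so the sketch runs.
    parser.add_argument("--epochs", type=int, default=100)
    parser.add_argument("--batch_size", type=int, default=16)
    parser.add_argument("--lr", type=float, default=1e-4)
    return parser


# Example: parse an explicit argument list instead of sys.argv.
example_args = build_parser().parse_args(
    ["--source", "iam", "--train", "--epochs", "50", "--batch_size", "16"]
)
print(example_args.source, example_args.train, example_args.epochs)
```

Under these assumptions, a training run would combine `--source` with `--train`, and an evaluation run would combine `--source` with `--test` and, optionally, the two `--norm_*` switches.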
**Notes**:
* The model is taken from the DETR (Facebook Research) notebook, but in their paper they performed a few more steps.
* A few more things could be done to improve the results:
  * Using warmup steps.
  * Using sine positional encodings for the image vector.
  * Trying more FC layers before the output.
  * Trying different Transformer parameters.
  * Trying a different backbone model to extract the image feature vector.
* Training took ~20 hours on Google Colab, whereas the [arthurflor](https://github.com/arthurflor23/handwritten-text-recognition) model can be trained in ~8 hours.
* The word error rate is 15% lower than that of Arthur's model on the Bentham dataset.
* The purpose of this project was to showcase the power of the Transformer, i.e., you can use it anywhere.
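The first two suggestions in the notes (warmup steps and sine positional encodings) can be sketched in plain Python. This is a simplified illustration, not code from this project or from DETR: the function names `warmup_factor`, `sine_encoding_1d`, and `sine_encoding_2d` are hypothetical, the warmup is a plain linear ramp, and the 2D encoding omits the coordinate normalization that DETR applies optionally.

```python
import math
from typing import List


def warmup_factor(step: int, warmup_steps: int) -> float:
    """Linear warmup multiplier: ramps up to 1.0 over warmup_steps, then stays at 1.0."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, (step + 1) / warmup_steps)


def sine_encoding_1d(pos: int, dim: int, temperature: float = 10000.0) -> List[float]:
    """Sinusoidal encoding of one integer position with interleaved sin/cos."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    encoding = []
    for i in range(dim // 2):
        # Each sin/cos pair uses a geometrically increasing wavelength.
        angle = pos / (temperature ** (2 * i / dim))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding


def sine_encoding_2d(height: int, width: int, dim: int) -> List[List[List[float]]]:
    """Encode every (y, x) cell of a feature map: half the channels for y, half for x."""
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    half = dim // 2
    return [
        [sine_encoding_1d(y, half) + sine_encoding_1d(x, half) for x in range(width)]
        for y in range(height)
    ]


# Example: multipliers for the first steps of a 4-step warmup.
print([warmup_factor(step, 4) for step in range(6)])
# Example: a 2 x 3 feature map with 8 encoding channels per cell.
grid = sine_encoding_2d(2, 3, 8)
print(len(grid), len(grid[0]), len(grid[0][0]))
```

In a PyTorch training loop, a multiplier like `warmup_factor` could be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`, and the 2D encoding would be added to the backbone's feature map before it enters the transformer encoder. Whether either change actually improves results on these datasets has not been measured here.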