ocrad.js
========
As with any minor stepping stone on the <strikeout>road to hell</strikeout> relentless trajectory of <link>Atwood's Law</link>, I probably don't need to justify the existence of yet another "x, but now in JavaScript!", but I might as well try. After all, we'd all like to think that there's some ulterior motive to fulfilling that prophecy.
On tablets and other touchscreen devices, of which there are quite a number nowadays (as this is the New Year's Eve post, I am obliged to include conjecture about the technological zeitgeist), a library such as Ocrad.js might be used to add handwriting input in a device- and operating-system-agnostic manner. Oftentimes, capturing the strokes and sending them over to a server to process might entail unacceptably high latency. Maybe you're working on an offline-capable note-taking app, or a browser extension which indexes all the doge memes that you stumble upon while prowling the dark corners of the internet.
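As a rough sketch of the handwriting case, captured strokes can be rasterized into pixels entirely on the client and handed to a recognizer without a round trip to a server. Everything named here is hypothetical: `rasterizeStrokes`, the stroke format (arrays of `{x, y}` points with finite coordinates), and the `recognize` callback, which stands in for whatever entry point the OCR library exposes and is assumed to accept an ImageData-like object with `width`, `height`, and RGBA `data`.

```javascript
// Hypothetical helper: draw pen strokes into an ImageData-like RGBA buffer.
// Strokes are arrays of {x, y} points in pixel coordinates (an assumed format).
function rasterizeStrokes(strokes, width, height) {
  const data = new Uint8ClampedArray(width * height * 4).fill(255); // white, opaque

  const plot = (x, y) => {
    if (x < 0 || y < 0 || x >= width || y >= height) return; // clip to the canvas
    const i = (y * width + x) * 4;
    data[i] = data[i + 1] = data[i + 2] = 0; // black ink; alpha stays 255
  };

  // Bresenham's line algorithm between two integer points.
  const line = (x0, y0, x1, y1) => {
    const dx = Math.abs(x1 - x0), dy = -Math.abs(y1 - y0);
    const sx = x0 < x1 ? 1 : -1, sy = y0 < y1 ? 1 : -1;
    let err = dx + dy;
    for (;;) {
      plot(x0, y0);
      if (x0 === x1 && y0 === y1) break;
      const e2 = 2 * err;
      if (e2 >= dy) { err += dy; x0 += sx; }
      if (e2 <= dx) { err += dx; y0 += sy; }
    }
  };

  for (const stroke of strokes) {
    const pts = stroke.map(p => ({ x: Math.round(p.x), y: Math.round(p.y) }));
    if (pts.length === 1) plot(pts[0].x, pts[0].y); // a single tap is a dot
    for (let k = 1; k < pts.length; k++) {
      line(pts[k - 1].x, pts[k - 1].y, pts[k].x, pts[k].y);
    }
  }
  return { width, height, data };
}

// `recognize` is injected so the sketch does not assume a particular library API.
function recognizeHandwriting(strokes, width, height, recognize) {
  return recognize(rasterizeStrokes(strokes, width, height));
}
```

In a browser, the same pixels would more naturally come from a `<canvas>` that the user draws on; the pure-array version is used here only so the sketch runs anywhere.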
If you've been following my trail of blog posts recently, you'd probably be able to tell that I've been scrambling to finish the program that I prototyped many months ago overnight at a hackathon. The idea was kind of simple and also kind of magical: a browser extension that allows users to highlight, copy, and paste text from any image as if it were plain text. Of course, the implementation is a bit difficult and actually relies on the advent of a number of newfangled technologies.
If you try to search for some open source text recognition engine, the first thing that comes up is Tesseract. That isn't a mistake, because it turns out that the competition is worlds away in terms of accuracy. It's actually pretty sad that the state of the art hasn't progressed substantially since the mid-nineties.
A month ago, I tried compiling Tesseract using Emscripten. Perhaps it was a bad thing to try first, but I soon learned that even if it had worked out, it probably wouldn't have been practical anyway. I had figured that all OCR engines were powered by artificial neural networks, support vector machines, k-nearest-neighbors and their machine learning kin. It turns out that this is hardly the norm except in the realm of the actually-accurate, whose open source provinces live under the protection of Lord Tesseract.
GOCR and Ocrad are essentially the only other open source OCR engines (there's technically also Cuneiform, but the source code is in a really, really big zip file from some website in Russian, and it's also really slow according to benchmarks). And something I didn't realize until I had peered into the source code is that they are powered by (presumably) painstakingly written rules for each and every detectable glyph and variation. This kind of blew my mind.
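To give a flavor of what hand-written glyph rules look like, as opposed to a trained model, here is a toy sketch. It is not taken from GOCR or Ocrad and does not reproduce their rules; the feature (counting enclosed holes), the `#`/`.` glyph format, and the function names `countHoles` and `classifyDigitish` are all made up for illustration.

```javascript
// Toy rule-based classifier: NOT Ocrad's or GOCR's actual rules, just the flavor.
// A glyph is an array of equal-length strings: '#' is ink, '.' is background.
function countHoles(glyph) {
  const h = glyph.length, w = glyph[0].length;
  // Pad with a one-pixel background border so the outside is one connected region.
  const H = h + 2, W = w + 2;
  const ink = (y, x) =>
    y >= 1 && y <= h && x >= 1 && x <= w && glyph[y - 1][x - 1] === '#';
  const seen = Array.from({ length: H }, () => new Array(W).fill(false));

  // 4-connected flood fill over background pixels.
  const flood = (sy, sx) => {
    const stack = [[sy, sx]];
    seen[sy][sx] = true;
    while (stack.length) {
      const [y, x] = stack.pop();
      for (const [dy, dx] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
        const ny = y + dy, nx = x + dx;
        if (ny < 0 || nx < 0 || ny >= H || nx >= W) continue;
        if (seen[ny][nx] || ink(ny, nx)) continue;
        seen[ny][nx] = true;
        stack.push([ny, nx]);
      }
    }
  };

  flood(0, 0); // mark the outer background first
  let holes = 0;
  for (let y = 0; y < H; y++) {
    for (let x = 0; x < W; x++) {
      // Any background region not reachable from outside is an enclosed hole.
      if (!seen[y][x] && !ink(y, x)) { holes++; flood(y, x); }
    }
  }
  return holes;
}

// Hand-written rules keyed on a single feature; a real engine would layer
// many such tests (profiles, aspect ratios, and so on) per glyph.
function classifyDigitish(glyph) {
  const holes = countHoles(glyph);
  if (holes === 2) return '8';
  if (holes === 1) return '0';
  return '1';
}
```

Even this three-way toy hints at why the approach is painstaking: every new glyph, font, or smudge means another special case written by hand.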
Anyway, I tried to compile GOCR first and was immediately struck by how easy and painless it was. I was on a roll, and decided to do Ocrad as well. It wasn't particularly hard; sure, it was slightly more involved, but still hardly anything.
If you know me in person, you'll probably know that I'm not a terribly decisive person. Oftentimes, I'll delay the decision until there isn't a choice left for me to make. Anyway, serially-indecisive-me strikes again, so I alternated between the development of GOCR.js and Ocrad.js, leading up to a simultaneous release.
But in the back of my mind, I knew that eventually I would have to pick one for building my image highlighting project.
What consistently amazes me about Optical Character Recognition isn't its astonishing quality or lack thereof. Rather, it's how utterly unpredictable the results can be. Sometimes there'll be some barely legible block of text that comes through absolutely pristine, and some other time there will be a perfectly clean input which outputs complete garbage. Maybe this is a testament to the sheer difficulty of computer vision or the incredible and underappreciated abilities of the human visual cortex.
At one point, I was talking to someone and I distinctly remember (I know, all the best stories start this way) a sense of surprise when the person indicated that he had heard of Tesseract, the open source OCR engine. I had appraised it as somewhat more obscure than it evidently was. Some time later, I recounted the incident to a friend, and he said something along the lines of "OCR is one of those fields that everyone comes across once".
I guess I've kind of held onto that thought for a while now, and it certainly seems to have at least a grain of truth. Text embedded in the physical world is more or less our primary means of communication and expression. Technology is about building tools that augment human capacity, which inevitably entails supplanting some human capability. Data input is a huge bottleneck, and while we're kind of sidestepping the problem with things like QR codes, which bring the digital world into the physical, OCR is just one of those fundamental enabling technologies which ought to be as broad in scope as the set of humans who have interacted with a keyboard.
I can't help but feel that the rather large set of people who have interacted with the problem of character recognition have surveyed the available tools and reached the same conclusion as your miniature Magic 8 Ball desk ornament: "Try again later". It doesn't take long for one to discover an instance of perfectly crisp and legible type which results in line noise of such entropy that it'd give DUAL_EC_DRBG a run for its money. "No, there really isn't any way for this to be the state of the art." "Well, I guess if it is, then maybe it'll improve in a few years. Technology improves quickly, right?"
You would think that some analogue of Linus's Law would hold true ("given enough eyeballs, all bugs are shallow"), especially if you're dealing with literal eyeballs reading letters. But incidentally, the engine that absolutely everyone uses was developed three decades ago (it's older than I am!) and abandoned for a decade before being open-sourced and picked up by our favorite benevolent overlords, Google.
In fact, what's absolutely stunning is the sheer universality of Tesseract. Just about everything which claims to have text recognition as a feature is backed by it. At one point, I was hoping that Mathematica had some clever routine using morphology and symbolic new kinds of sciences and evolved automata pattern recognition. Nope! Nestled deep within the gigabytes of code lies the Chuck Testa of textadermies: Tesseract.