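### Amazon CAPTCHA Recognition
While cracking Amazon's CAPTCHA, a machine-learning approach reached an accuracy above 70%. The main limit was the small number of training samples; with enough samples, 90% looked well within reach.
After an update that brought the sample count to more than 2,800, accuracy rose above 90%.
### Project Layout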
```
-- iconset1
    -- ...
-- jpg
    -- img
        -- jpg
        -- ...
    -- error.txt
-- py
    -- crack.py
```
### Required Libraries
`pip3 install pillow` or `easy_install Pillow`
### Required Files
[Amazon CAPTCHA recognition](https://github.com/TTyb/AmazonCaptcha)
> 1. Read the images and print their pixel histograms
Collect all the jpg files:
```
import os


# Collect every file directly under rootdir whose name ends with `suffix`
def listfiles(rootdir, suffix='.jpg'):
    files = []
    for parent, dirnames, filenames in os.walk(rootdir):
        # Only look at the top-level directory, not its subdirectories
        if parent == rootdir:
            for filename in filenames:
                if filename.endswith(suffix):
                    files.append(os.path.join(rootdir, filename))
    return files


if __name__ == '__main__':
    path = "../jpg/img/"
    jpgname = listfiles(path, "jpg")
```
`jpgname` is a list holding the path of every jpg file in the folder:
```
['../jpg/img/056567f5e15f8d5f46bc5e07905009fd.jpg', '../jpg/img/05796993cf0a3c779b6fe83db2a27ac3.jpg', '../jpg/img/073847b62252c63829850cb1bd49601e.jpg', '../jpg/img/07aafc4694264509135490b85630aaf5.jpg', '../jpg/img/07d126e49e42143e0d21a0dafd522ac8.jpg', '../jpg/img/07dbfd0bd41d11e9475a96bc724e9f56.jpg', '../jpg/img/07fb8e7163e2ebd36e90c209502051ed.jpg', '../jpg/img/08ff7dc78f348ad7e4309eda9588a5f5.jpg', '../jpg/img/09dc3340f3c4a77c61cd18da7b3eca82.jpg', '../jpg/img/0b354ba9e9a132075fcc3dff6f517106.jpg', '../jpg/img/0bdca69fec2089cfaa46b458f5e483c3.jpg', '../jpg/img/0d0b1d778e00a1c84001d5838b9f5ef1.jpg', '../jpg/img/0d14f8838c30f6b54f266d9eb02e1b93.jpg', '../jpg/img/0e8d3e12d36d39314acfcd3bb8c3970a.jpg',...]
```
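The same top-level listing can also be sketched with `pathlib`. This is only an alternative illustration, not the author's code; the function name `list_images` is hypothetical, and the result is sorted so the order is reproducible.

```python
from pathlib import Path


def list_images(rootdir, suffix=".jpg"):
    """Return sorted paths of files directly inside rootdir that end with suffix."""
    root = Path(rootdir)
    return sorted(
        str(p) for p in root.iterdir()
        if p.is_file() and p.name.endswith(suffix)
    )
```

Calling `list_images("../jpg/img/")` would return a list of the same kind as `jpgname`, without descending into subdirectories.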
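Open each image and compute its pixel histogram:
```
from PIL import Image

for item in jpgname:
    newjpgname = []
    im = Image.open(item)
    print(item)
    # A JPEG is not palette-based, but a GIF is, so convert the image
    # to 8-bit palette mode ("P"): each pixel becomes one index in 0-255
    im = im.convert("P")
    # Pixel histogram: the number of pixels for each palette index
    his = im.histogram()
    print(his)
```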
The printed histogram is
`[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 2, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 1, 2, 0, 1, 0, 0, 1, 0, 2, 0, 0, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 3, 1, 3, 3, 0, 0, 0, 0, 0, 0, 1, 0, 3, 2, 132, 1, 1, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 15, 0, 1, 0, 1, 0, 0, 8, 1, 0, 0, 0, 0, 1, 6, 0, 2, 0, 0, 0, 0, 18, 1, 1, 1, 1, 1, 2, 365, 115, 0, 1, 0, 0, 0, 135, 186, 0, 0, 1, 0, 0, 0, 116, 3, 0, 0, 0, 0, 0, 21, 1, 1, 0, 0, 0, 2, 10, 2, 0, 0, 0, 0, 2, 10, 0, 0, 0, 0, 1, 0, 625]`
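This list has 256 elements (indices 0 to 255), and each element is the number of pixels with that color index. The last element, 625, is the largest, so index 255 (white) is the most common color. Pair each index with its count and sort by count:
```
values = {}
for i in range(0, 256):
    values[i] = his[i]
# Sort by count: x[1] sorts on the second field of each pair, x[0] would sort on the first
temp = sorted(values.items(), key=lambda x: x[1], reverse=True)
# print(temp)
```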
The printed result is
`[(255, 625), (212, 365), (220, 186), (219, 135), (169, 132), (227, 116), (213, 115), (234, 21), (205, 18), (184, 15), (241, 10), (248, 10), (191, 8), (198, 6), (155, 3), (157, 3), (158, 3), (167, 3), (228, 3), (56, 2), (67, 2), (91, 2), (96, 2), (109, 2), (122, 2), (127, 2), (134, 2), (140, 2), (168, 2), (176, 2), (200, 2), (211, 2), (240, 2), (242, 2), (247, 2), (43, 1), (44, 1), (53, 1), (61, 1), (68, 1), (79, 1), (84, 1), (92, 1), (101, 1), (103, 1), (104, 1), (107, 1), (121, 1), (126, 1), (129, 1), (132, 1), (137, 1), (149, 1), (151, 1), (153, 1), (156, 1), (165, 1), (170, 1), (171, 1), (175, 1), (186, 1), (188, 1), (192, 1), (197, 1), (206, 1), (207, 1), (208, 1), (209, 1), (210, 1), (215, 1), (223, 1), (235, 1), (236, 1), (253, 1), (0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 0), (10, 0), (11, 0), (12, 0), (13, 0), (14, 0), (15, 0), (16, 0), (17, 0), (18, 0), (19, 0), (20, 0), (21, 0), (22, 0), (23, 0), (24, 0), (25, 0), (26, 0), (27, 0), (28, 0), (29, 0), (30, 0), (31, 0), (32, 0), (33, 0), (34, 0), (35, 0), (36, 0), (37, 0), (38, 0), (39, 0), (40, 0), (41, 0), (42, 0), (45, 0), (46, 0), (47, 0), (48, 0), (49, 0), (50, 0), (51, 0), (52, 0), (54, 0), (55, 0), (57, 0), (58, 0), (59, 0), (60, 0), (62, 0), (63, 0), (64, 0), (65, 0), (66, 0), (69, 0), (70, 0), (71, 0), (72, 0), (73, 0), (74, 0), (75, 0), (76, 0), (77, 0), (78, 0), (80, 0), (81, 0), (82, 0), (83, 0), (85, 0), (86, 0), (87, 0), (88, 0), (89, 0), (90, 0), (93, 0), (94, 0), (95, 0), (97, 0), (98, 0), (99, 0), (100, 0), (102, 0), (105, 0), (106, 0), (108, 0), (110, 0), (111, 0), (112, 0), (113, 0), (114, 0), (115, 0), (116, 0), (117, 0), (118, 0), (119, 0), (120, 0), (123, 0), (124, 0), (125, 0), (128, 0), (130, 0), (131, 0), (133, 0), (135, 0), (136, 0), (138, 0), (139, 0), (141, 0), (142, 0), (143, 0), (144, 0), (145, 0), (146, 0), (147, 0), (148, 0), (150, 0), (152, 0), (154, 0), (159, 0), (160, 0), (161, 0), (162, 0), (163, 0), (164, 0), (166, 0), (172, 0), (173, 
0), (174, 0), (177, 0), (178, 0), (179, 0), (180, 0), (181, 0), (182, 0), (183, 0), (185, 0), (187, 0), (189, 0), (190, 0), (193, 0), (194, 0), (195, 0), (196, 0), (199, 0), (201, 0), (202, 0), (203, 0), (204, 0), (214, 0), (216, 0), (217, 0), (218, 0), (221, 0), (222, 0), (224, 0), (225, 0), (226, 0), (229, 0), (230, 0), (231, 0), (232, 0), (233, 0), (237, 0), (238, 0), (239, 0), (243, 0), (244, 0), (245, 0), (246, 0), (249, 0), (250, 0), (251, 0), (252, 0), (254, 0)]`
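The histogram and ranking steps are easy to check on a tiny synthetic image whose pixel counts are known in advance. The 4x3 image here is made up purely for illustration and is not one of the CAPTCHA samples.

```python
from PIL import Image

# Hypothetical 4x3 palette image filled with index 255 (12 pixels in total)
im = Image.new("P", (4, 3), 255)
im.putpixel((0, 0), 0)     # two pixels with index 0
im.putpixel((1, 0), 0)
im.putpixel((2, 2), 254)   # one pixel with index 254

his = im.histogram()
# Pair each index with its count and sort by count, largest first
ranked = sorted(enumerate(his), key=lambda pair: pair[1], reverse=True)
top3 = ranked[:3]

print(len(his))  # 256
print(top3)      # [(255, 9), (0, 2), (254, 1)]
```

`enumerate(his)` yields the same `(index, count)` pairs as the `values` dictionary, so the intermediate dict is optional.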
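Pick out the 10 most frequent colors:
```
# The 10 most frequent colors
for j, k in temp[:10]:
    print(j, k)
# 255 12177
# 0 772
# 254 94
# 1 40
# 245 10
# 12 9
# 236 9
# 243 9
# 2 8
# 6 8
# 255 is the white background and 0 is black; display 0 and 254 to see what they contain
```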
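> 2. Build a new, noise-free image
Create a blank white image:
```
# Take the image size and create a white (255) image
im2 = Image.new("P", im.size, 255)
```
The most frequent colors from the previous step show that 255 is the white background and 0 is black. Displaying indices 0 and 254 on their own shows what each one contains: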
![](http://images2015.cnblogs.com/blog/996148/201612/996148-20161210180057460-992618058.jpg)
This confirms that 0 is the black letters and 254 is speckle noise, which can be discarded.
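One way to look at a single color index in isolation is to paint only the pixels with that index onto a blank image. This is a minimal sketch under the assumption that the image is in mode "P"; the helper name `isolate_index` and its parameters are hypothetical and not taken from the repository.

```python
from PIL import Image


def isolate_index(im, index, foreground=0, background=255):
    """Return a new image where pixels equal to `index` are drawn as
    `foreground` and everything else is `background`."""
    out = Image.new("P", im.size, background)
    width, height = im.size
    for y in range(height):
        for x in range(width):
            if im.getpixel((x, y)) == index:
                out.putpixel((x, y), foreground)
    return out
```

For example, `isolate_index(im, 254).show()` would display only the pixels stored under index 254, drawn in black so they stand out against the white background.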
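Copy these pixels, one coordinate at a time across the width and height, into the new white image:
```
# Take the image size and create a white (255) image
im2 = Image.new("P", im.size, 255)
for y in range(im.size[1]):
    # y coordinate (row)
    for x in range(im.size[0]):
        # Palette index of the pixel at (x, y)
        pix = im.getpixel((x, y))
        # Keep only the letter pixels:
        # index 0 is enough, since 254 turned out to be speckle noise
        if pix == 0:
            # Copy the black pixel into im2
            im2.putpixel((x, y), 0)
# The result is a black-and-white binary image
# im2.show()
```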
`Black-and-white binary image`
![](http://images2015.cnblogs.com/blog/996148/201612/996148-20161210180352226-30674119.png)
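A quick sanity check that an image really is two-valued can be sketched with Pillow's `Image.getcolors()`, which returns `(count, value)` pairs. The helper name `is_binary` is an assumption made for this illustration.

```python
from PIL import Image


def is_binary(im, allowed=(0, 255)):
    """Return True if every pixel value in `im` is one of `allowed`."""
    colors = im.getcolors()  # list of (count, value); None if over 256 colors
    return colors is not None and all(value in allowed for _, value in colors)
```

Running `is_binary(im2)` after the loop should return `True` if only indices 0 and 255 were written.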
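> 3. Split the image
**x runs along the width of the image and y along its height**
Cut the image vertically by scanning it column by column.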
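The idea behind the column scan can be sketched as a standalone function and checked on a synthetic image. This is an illustrative variant, not the author's exact loop: the name `find_letter_spans` is hypothetical, `end` is the first blank column after a letter (exclusive), and a letter that touches the right edge is closed at the image width.

```python
from PIL import Image


def find_letter_spans(im2, background=255):
    """Return (start, end) x-ranges of consecutive columns that contain
    at least one non-background pixel; `end` is exclusive."""
    width, height = im2.size
    spans = []
    start = None
    for x in range(width):
        has_ink = any(im2.getpixel((x, y)) != background for y in range(height))
        if has_ink and start is None:
            start = x                    # first column of a new letter
        elif not has_ink and start is not None:
            spans.append((start, x))     # first blank column ends the letter
            start = None
    if start is not None:
        spans.append((start, width))     # letter runs to the right edge
    return spans


# Hypothetical 10x4 image with ink in columns 1-2 and 5-7
demo = Image.new("P", (10, 4), 255)
for x in (1, 2, 5, 6, 7):
    demo.putpixel((x, 1), 0)
print(find_letter_spans(demo))  # [(1, 3), (5, 8)]
```

Each span can then be cut out with `im2.crop((start, 0, end, im2.size[1]))`, since `crop` takes a `(left, upper, right, lower)` box.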
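```
# Vertical segmentation
# Find the start and end x-coordinates of each letter
inletter = False
foundletter = False
start = 0
end = 0
letters = []
for x in range(im2.size[0]):
    for y in range(im2.size[1]):
        pix = im2.getpixel((x, y))
        # Any non-white pixel means this column is part of a letter
        if pix != 255:
            inletter = True
    # First inked column after a gap: a letter starts here
    if foundletter == False and inletter == True:
        foundletter = True
        start = x
    # First blank column after a letter: the letter ends here
    if foundletter == True and inletter == False:
        foundletter = False
        end = x
        letters.append((start, end))
    # Reset before scanning the next column
    inletter = False
```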