# dlib-models
This repository contains trained models created by me (Davis King). They are provided as part of the dlib example programs, which are intended to be educational documents that explain how to use various parts of the dlib library. As far as I am concerned, anyone can do whatever they want with these model files as I've released them into the public domain. Details describing how each model was created are summarized below.
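Every model file listed below is distributed as a bzip2 archive. As a minimal sketch, assuming you already have one of the `.bz2` files on disk, it can be unpacked with Python's standard library. The function name `decompress_bz2` is hypothetical and not part of dlib.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2(src, dst=None):
    """Decompress a .bz2 file and return the path of the output file."""
    src = Path(src)
    # Default output name: drop the trailing ".bz2" suffix,
    # e.g. "model.dat.bz2" becomes "model.dat".
    dst = Path(dst) if dst is not None else src.with_suffix("")
    # Stream the data so large model files are not held in memory at once.
    with bz2.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

For example, `decompress_bz2("shape_predictor_5_face_landmarks.dat.bz2")` would write `shape_predictor_5_face_landmarks.dat` next to the archive.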
## dlib_face_recognition_resnet_model_v1.dat.bz2
This model is a ResNet network with 29 conv layers. It's essentially a version of the ResNet-34 network from the paper Deep Residual Learning for Image Recognition by He, Zhang, Ren, and Sun with a few layers removed and the number of filters per layer reduced by half.
The network was trained from scratch on a dataset of about 3 million faces. This dataset is derived from a number of datasets: the face scrub dataset (http://vintage.winklerbros.net/facescrub.html), the VGG dataset (http://www.robots.ox.ac.uk/~vgg/data/vgg_face/), and a large number of images I scraped from the internet. I tried as best I could to clean up the dataset by removing labeling errors, which meant filtering out a lot of stuff from VGG. I did this by repeatedly training a face recognition CNN and then using graph clustering methods and a lot of manual review to clean up the dataset. In the end about half the images are from VGG and face scrub. Also, the total number of individual identities in the dataset is 7485. I made sure to avoid overlap with identities in LFW.
The network training started with randomly initialized weights and used a structured metric loss that tries to project all the identities into non-overlapping balls of radius 0.6. The loss is basically a type of pair-wise hinge loss that runs over all pairs in a mini-batch and includes hard-negative mining at the mini-batch level.
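The loss described above can be pictured with the following pure-Python sketch. It is an illustration only and not dlib's implementation: the `margin` value, the function name, and the rule of keeping as many of the hardest negative pairs as there are positive pairs are all assumptions. Only the radius of 0.6 comes from the text.

```python
import math
from itertools import combinations


def pairwise_hinge_loss(embeddings, labels, radius=0.6, margin=0.04):
    """Toy pair-wise hinge loss over every pair in one mini-batch."""
    pos_losses, neg_losses = [], []
    for i, j in combinations(range(len(embeddings)), 2):
        dist = math.dist(embeddings[i], embeddings[j])
        if labels[i] == labels[j]:
            # Same identity: penalize pairs farther apart than radius - margin.
            pos_losses.append(max(0.0, dist - (radius - margin)))
        else:
            # Different identities: penalize pairs closer than radius + margin.
            neg_losses.append(max(0.0, (radius + margin) - dist))
    # Hard-negative mining (assumed rule): keep only the worst negatives,
    # as many as there are positive pairs, so negatives do not dominate.
    neg_losses.sort(reverse=True)
    hard_negs = neg_losses[:max(1, len(pos_losses))]
    pairs_used = len(pos_losses) + len(hard_negs)
    return (sum(pos_losses) + sum(hard_negs)) / pairs_used if pairs_used else 0.0
```

A batch whose same-identity pairs are close together and whose different-identity pairs are far apart yields a loss of zero, which is the behaviour the "non-overlapping balls" description suggests.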
The resulting model obtains a mean accuracy of 0.993833 with a standard deviation of 0.00272732 on the LFW benchmark.
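Because training pushes each identity into a ball of radius 0.6, a natural way to compare two faces is to threshold the Euclidean distance between their descriptors at 0.6. The sketch below assumes the descriptors have already been computed and are plain sequences of floats; the function name `same_person` is hypothetical.

```python
import math


def same_person(desc_a, desc_b, threshold=0.6):
    """Return True when two face descriptors are within the distance threshold."""
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptors must have the same length")
    # Euclidean distance between the two descriptor vectors.
    return math.dist(desc_a, desc_b) < threshold
```

The threshold can be tightened or loosened depending on whether false matches or missed matches are more costly in your application.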
## mmod_dog_hipsterizer.dat.bz2
This model is trained on the data from the Columbia Dogs dataset, which was introduced in the paper:
Dog Breed Classification Using Part Localization
Jiongxin Liu, Angjoo Kanazawa, Peter Belhumeur, David W. Jacobs
European Conference on Computer Vision (ECCV), Oct. 2012.
The original dataset is not fully annotated. So I created a new fully annotated version which is available here: http://dlib.net/files/data/CU_dogs_fully_labeled.tar.gz
## mmod_human_face_detector.dat.bz2
This is trained on this dataset: http://dlib.net/files/data/dlib_face_detection_dataset-2016-09-30.tar.gz.
I created the dataset by finding face images in many publicly available
image datasets (excluding the FDDB dataset). In particular, there are images
from ImageNet, AFLW, Pascal VOC, the VGG dataset, WIDER, and face scrub.
All the annotations in the dataset were created by me using dlib's imglab tool.
## resnet34_1000_imagenet_classifier.dnn.bz2
This is trained on the venerable ImageNet dataset.
## shape_predictor_5_face_landmarks.dat.bz2
This is a 5 point landmarking model which identifies the corners of the eyes and bottom of the nose. It is
trained on the [dlib 5-point face landmark dataset](http://dlib.net/files/data/dlib_faces_5points.tar), which consists of
7198 faces. I created this dataset by downloading images from the internet and annotating them with dlib's imglab tool.
The exact program that produced the model file can be found [here](https://github.com/davisking/dlib/blob/master/tools/archive/train_face_5point_model.cpp).
This model is designed to work well with dlib's HOG face detector and the CNN face detector (the one in mmod_human_face_detector.dat).
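As a rough sketch of how this model might be used from Python, assuming the dlib Python bindings are installed and the model file has been decompressed: the helper names and the default file path below are placeholders chosen for illustration.

```python
def shape_to_points(shape):
    """Convert a dlib full_object_detection into a list of (x, y) tuples."""
    return [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]


def find_landmarks(image_path, predictor_path="shape_predictor_5_face_landmarks.dat"):
    """Run dlib's HOG face detector, then the landmark model, on one image."""
    import dlib  # imported here so shape_to_points works without dlib installed

    detector = dlib.get_frontal_face_detector()  # HOG-based face detector
    predictor = dlib.shape_predictor(predictor_path)
    image = dlib.load_rgb_image(image_path)
    # The second argument upsamples the image once to help find smaller faces.
    return [shape_to_points(predictor(image, box)) for box in detector(image, 1)]
```

`find_landmarks` returns one list of landmark coordinates per detected face, so an image with no detected faces gives an empty list.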
## shape_predictor_68_face_landmarks.dat.bz2
This is trained on the ibug 300-W dataset (https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/)
C. Sagonas, E. Antonakos, G. Tzimiropoulos, S. Zafeiriou, M. Pantic.
300 faces In-the-wild challenge: Database and results.
Image and Vision Computing (IMAVIS), Special Issue on Facial Landmark Localisation "In-The-Wild". 2016.
The license for this dataset excludes commercial use and Stefanos Zafeiriou,
one of the creators of the dataset, asked me to include a note here saying
that the trained model therefore can't be used in a commercial product. So
you should contact a lawyer or talk to Imperial College London to find out
if it's OK for you to use this model in a commercial product.
Also note that this model file is designed for use with dlib's HOG face detector. That is, it expects the bounding
boxes from the face detector to be aligned a certain way, the way dlib's HOG face detector does it. It won't work
as well when used with a face detector that produces differently aligned boxes, such as the CNN based mmod_human_face_detector.dat face detector.
## shape_predictor_68_face_landmarks_GTX.dat.bz2
The GTX model is the result of applying a set of training strategies and implementation optimization described in:
Alvarez Casado, C., Bordallo Lopez, M.
Real-time face alignment: evaluation methods, training strategies and implementation optimization.
Springer Journal of Real-time image processing, 2021
The resulting model is smaller, faster, smoother and more accurate. You can find all the details related to
the training and testing in the following GitLab repository: https://gitlab.com/visualhealth/vhpapers/real-time-facealignment
This is trained on the ibug 300-W dataset (https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/)
C. Sagonas, E. Antonakos, G. Tzimiropoulos, S. Zafeiriou, M. Pantic.
300 faces In-the-wild challenge: Database and results.
Image and Vision Computing (IMAVIS), Special Issue on Facial Landmark Localisation "In-The-Wild". 2016.
The license for this dataset excludes commercial use and Stefanos Zafeiriou,
one of the creators of the dataset, asked me to include a note here saying
that the trained model therefore can't be used in a commercial product. So
you should contact a lawyer or talk to Imperial College London to find out
if it's OK for you to use this model in a commercial product.
Also note that this model file was trained for increased robustness to different face detectors. However, it works best when the bounding boxes are square,
as is the case with both dlib's HOG face detector and the CNN based mmod_human_face_detector.dat face detector. It won't work as well
when used with other face detectors that produce rectangular boxes.
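If your detector returns non-square rectangles, one possible workaround is to expand each box to a square around the same center before passing it to the shape predictor. This is an assumption made for illustration, not a procedure taken from the paper, and `square_box` is a hypothetical helper.

```python
def square_box(left, top, right, bottom):
    """Expand a rectangle to a square with the same center (integer pixels)."""
    width, height = right - left, bottom - top
    side = max(width, height)  # grow the shorter side to match the longer one
    center_x = (left + right) / 2
    center_y = (top + bottom) / 2
    new_left = round(center_x - side / 2)
    new_top = round(center_y - side / 2)
    # The result may extend past the image border; clamp it if that matters.
    return new_left, new_top, new_left + side, new_top + side
```

Whether expanding, shrinking, or re-centering the box gives the best landmarks for a particular detector is something you would need to check on your own data.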
## mmod_rear_end_vehicle_detector.dat.bz2
This model is trained on the [dlib rear end vehicles dataset](http://dlib.net/files/data/dlib_rear_end_vehicles_v1.tar). The dataset contains images from vehicle dashcams which I manually annotated using dlib's imglab tool.
## mmod_front_and_rear_end_vehicle_detector.dat.bz2
This model is trained on the [dlib front and rear end vehicles dataset](http://dlib.net/files/data/dlib_front_and_rear_vehicles_v1.tar). The dataset contains images from vehicle dashcams which I manually annotated using dlib's imglab tool.
## dnn_gender_classifier_v1.dat.bz2
This model is a gender classifier trained on a private dataset of about 200k different face images. It was generated according to the network definition and settings given in [Minimalistic CNN-based ensemble model for gender prediction from face images](http://www.eurecom.fr/fr/publication/4768/download/mm-publi-4768.pdf). Even though the training dataset differs from the one used by G. Antipov et al., the classification results on the LFW evaluation are similar overall (approximately 97.3%). To take up the authors' proposal to join the results of three networks, a simplification was made by presenting RGB images to the network, thus simulating three "grayscale" networks via the three image planes. Better results could probably be obtained with a more complex and deeper network, bu