Face recognition + data collection + facial feature extraction (haha, awesome) with the dlib library: shape, full code

28 files in total

bz2: 12

xml: 5

py: 3

python

face recognition

face detection

Uploaded 2022-04-15 10:32:38 · 413.06MB RAR

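In IT, face recognition is an important computer vision technique, widely used in security, surveillance, identity verification, and similar scenarios. This tutorial uses Python and the dlib library for face recognition, data collection, and facial feature extraction, and is well suited to beginners.

dlib is a powerful C++ library that includes a wide range of machine learning algorithms, such as support vector machines (SVM) and decision trees, and it also provides a Python interface. For face work, dlib provides an efficient and accurate face detector based on HOG (Histogram of Oriented Gradients) features, which quickly locates the faces in an image.

To get started, first install the dlib library. In a Python environment you can use pip (if no prebuilt wheel is available for your platform, pip builds dlib from source, which requires CMake and a C++ compiler):

```
pip install dlib
```

Next, import the required libraries: dlib, PIL (Python Imaging Library) for image handling, and numpy for numerical computation:

```python
import dlib
from PIL import Image
import numpy as np
```

In the data collection stage you usually need to gather a large number of face images to train a model. You can use a public face dataset such as LFW (Labeled Faces in the Wild), or take photos yourself. Once you have the images, use dlib's face detector to locate and crop the face regions. Note that the detector expects a numpy array rather than a PIL image. A simple example:

```python
detector = dlib.get_frontal_face_detector()
pil_img = Image.open('image.jpg').convert('RGB')
img = np.array(pil_img)  # dlib works on numpy arrays, not PIL images
faces = detector(img)
for i, face in enumerate(faces):
    # Crop the face region
    x, y, w, h = face.left(), face.top(), face.width(), face.height()
    cropped_face = pil_img.crop((x, y, x + w, y + h))
    # Save each crop under its own name so faces do not overwrite each other
    cropped_face.save(f'cropped_face_{i}.jpg')
```

Next comes facial feature extraction. dlib provides a pretrained 68-point facial landmark model that locates key points of the face, such as the eyes, nose, and mouth. This step extracts the shape of the face, which can then be used as a feature representation (here `img` and `faces` are the numpy array and detections from the previous example):

```python
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')  # this model file must be downloaded
for face in faces:
    landmarks = predictor(img, face)
    # Extract the 68 landmark coordinates as a (68, 2) array
    points = np.array([[p.x, p.y] for p in landmarks.parts()])
    # Process the landmark data further, for example with PCA dimensionality reduction
```

With these features you can move on to the recognition task itself, for example by building a face recognition model. A common approach is deep learning with a convolutional neural network (CNN). You can build a simple CNN, or start from an established architecture such as VGGFace or FaceNet, and train it on the collected dataset. Once trained, the model can recognize new face images.

Experimenting on the face data of the 21 people you mentioned should give preliminary recognition results. Keep in mind, however, that better recognition performance usually requires a larger dataset and a more complex model. You also need to account for the effect of lighting, pose, and expression, and decide how to handle special cases such as occlusion and blur.

With the dlib library we can implement the complete pipeline from data collection and face detection to feature extraction, which lays the groundwork for the later recognition task. For beginners this is a good starting point for gradually understanding how face recognition works and how it is applied.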
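The PCA step mentioned in the landmark example is not spelled out in the description. The following is a minimal sketch of one way to do it with numpy alone, assuming the 68 landmark points of several faces have already been collected into an array of shape `(n_faces, 68, 2)`. The function name `pca_reduce` and the synthetic data are illustrative assumptions, not part of the uploaded code.

```python
import numpy as np

def pca_reduce(shapes, n_components=2):
    """Project flattened landmark shapes onto their top principal components.

    shapes: array-like of shape (n_faces, 68, 2), one row of landmarks per face.
    Returns (projected, components, mean, explained_variance).
    """
    X = np.asarray(shapes, dtype=np.float64)
    X = X.reshape(len(X), -1)            # flatten each face to a 136-vector
    mean = X.mean(axis=0)
    Xc = X - mean                        # center the data before PCA
    # SVD of the centered data: the rows of Vt are the principal directions
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    projected = Xc @ components.T
    explained = (S ** 2) / max(len(X) - 1, 1)
    return projected, components, mean, explained[:n_components]

# Synthetic stand-in for real landmarks (hypothetical data, for illustration only)
rng = np.random.default_rng(0)
base = rng.uniform(0, 200, size=(68, 2))
shapes = np.stack([base + rng.normal(0, 3, size=(68, 2)) for _ in range(10)])

projected, components, mean, explained = pca_reduce(shapes, n_components=3)
print(projected.shape)
```

In practice the landmarks would probably need to be normalized for face position and size (for example by subtracting the box origin and dividing by the box width) before PCA, otherwise the leading components mostly capture where the face sits in the image.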


face.rar (28 files)

face

stage1.py 2KB

stage3.py 11KB

data

model

shape_predictor_68_face_landmarks.dat 95.08MB

dlib_face_recognition_resnet_model_v1.dat 21.43MB

dlib-models-master

mmod_dog_hipsterizer.dat.bz2 17.35MB

dlib_face_recognition_resnet_model_v1.dat.bz2 20.44MB

shape_predictor_68_face_landmarks.dat.bz2 61.07MB

shape_predictor_5_face_landmarks.dat.bz2 5.44MB

gender-classifier

dnn_gender_classifier_v1.dat.bz2 504KB

dnn_gender_classifier_v1_ex.cpp 9KB

LICENSE 6KB

mmod_human_face_detector.dat.bz2 678KB

mmod_rear_end_vehicle_detector.dat.bz2 3.56MB

age-predictor

dnn_age_predictor_v1.dat.bz2 9.79MB

dnn_age_predictor_v1_ex.cpp 7KB

resnet34_1000_imagenet_classifier.dnn.bz2 78.53MB

mmod_front_and_rear_end_vehicle_detector.dat.bz2 3.69MB

shape_predictor_68_face_landmarks_GTX.dat.bz2 40.48MB

README.md 9KB

resnet50_1000_imagenet_classifier.dnn.bz2 83.21MB

feature

.idea

misc.xml 305B

modules.xml 275B

workspace.xml 12KB

.gitignore 184B

inspectionProfiles

Project_Default.xml 2KB

profiles_settings.xml 174B

renlian5.iml 335B

stage2.py 4KB
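Most of the model files listed above are shipped as `.bz2` archives, while dlib loads the uncompressed `.dat` files. As a rough sketch, they can be unpacked with Python's standard library. The helper names `decompress_bz2` and `decompress_all` are hypothetical and not part of the uploaded scripts.

```python
import bz2
import shutil
from pathlib import Path

def decompress_bz2(src, dst=None):
    """Decompress one .bz2 file and return the path of the output file."""
    src = Path(src)
    # By default write next to the archive, dropping the ".bz2" suffix
    dst = Path(dst) if dst is not None else src.with_suffix('')
    with bz2.open(src, 'rb') as fin, open(dst, 'wb') as fout:
        shutil.copyfileobj(fin, fout)  # stream in chunks, so large models are fine
    return dst

def decompress_all(folder):
    """Decompress every *.bz2 file directly inside folder, skipping existing outputs."""
    outputs = []
    for archive in sorted(Path(folder).glob('*.bz2')):
        target = archive.with_suffix('')
        if not target.exists():
            decompress_bz2(archive, target)
        outputs.append(target)
    return outputs
```

For example, `decompress_all('face/dlib-models-master')` would unpack the top-level archives of that folder, assuming that is where the archive was extracted. Files in subfolders such as `gender-classifier` would need a separate call.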

# dlib-models

This repository contains trained models created by me (Davis King). They are provided as part of the dlib example programs, which are intended to be educational documents that explain how to use various parts of the dlib library. As far as I am concerned, anyone can do whatever they want with these model files as I've released them into the public domain. Details describing how each model was created are summarized below.

## dlib_face_recognition_resnet_model_v1.dat.bz2

This model is a ResNet network with 29 conv layers. It's essentially a version of the ResNet-34 network from the paper Deep Residual Learning for Image Recognition by He, Zhang, Ren, and Sun with a few layers removed and the number of filters per layer reduced by half.

The network was trained from scratch on a dataset of about 3 million faces. This dataset is derived from a number of datasets: the face scrub dataset (http://vintage.winklerbros.net/facescrub.html), the VGG dataset (http://www.robots.ox.ac.uk/~vgg/data/vgg_face/), and a large number of images I scraped from the internet. I tried as best I could to clean up the dataset by removing labeling errors, which meant filtering out a lot of stuff from VGG. I did this by repeatedly training a face recognition CNN and then using graph clustering methods and a lot of manual review to clean up the dataset. In the end about half the images are from VGG and face scrub. Also, the total number of individual identities in the dataset is 7485. I made sure to avoid overlap with identities in LFW.

The network training started with randomly initialized weights and used a structured metric loss that tries to project all the identities into non-overlapping balls of radius 0.6. The loss is basically a type of pair-wise hinge loss that runs over all pairs in a mini-batch and includes hard-negative mining at the mini-batch level.

The resulting model obtains a mean accuracy of 0.993833 with a standard deviation of 0.00272732 on the LFW benchmark.
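To illustrate how this model is commonly used, here is a minimal sketch that computes 128-dimensional face descriptors and compares two of them by Euclidean distance. The 0.6 threshold follows the radius described above. The function names and the choice of a strict `<` comparison are assumptions for illustration, and `face_descriptors` needs dlib plus the uncompressed landmark and recognition `.dat` files.

```python
import numpy as np

MATCH_THRESHOLD = 0.6  # ball radius from the README's loss description

def descriptor_distance(a, b):
    """Euclidean distance between two face descriptors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.linalg.norm(a - b))

def is_same_person(a, b, threshold=MATCH_THRESHOLD):
    """Treat two descriptors as the same identity if they are closer than threshold."""
    return descriptor_distance(a, b) < threshold

def face_descriptors(image_path, landmark_model, recognition_model):
    """Return one 128-D descriptor per face detected in the image.

    landmark_model and recognition_model are paths to the uncompressed
    shape predictor and dlib_face_recognition_resnet_model_v1 .dat files.
    """
    import dlib  # imported lazily so the helpers above work without dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor(landmark_model)
    encoder = dlib.face_recognition_model_v1(recognition_model)

    img = dlib.load_rgb_image(image_path)
    descriptors = []
    for box in detector(img, 1):  # upsample once to help find smaller faces
        shape = predictor(img, box)
        descriptors.append(np.array(encoder.compute_face_descriptor(img, shape)))
    return descriptors
```

A typical use would be to call `face_descriptors` on an enrollment photo and on a query photo, then pass one descriptor from each to `is_same_person`. The threshold may need tuning for a specific dataset.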
## mmod_dog_hipsterizer.dat.bz2

This model is trained on the data from the Columbia Dogs dataset, which was introduced in the paper:

Dog Breed Classification Using Part Localization. Jiongxin Liu, Angjoo Kanazawa, Peter Belhumeur, David W. Jacobs. European Conference on Computer Vision (ECCV), Oct. 2012.

The original dataset is not fully annotated. So I created a new fully annotated version which is available here: http://dlib.net/files/data/CU_dogs_fully_labeled.tar.gz

## mmod_human_face_detector.dat.bz2

This is trained on this dataset: http://dlib.net/files/data/dlib_face_detection_dataset-2016-09-30.tar.gz. I created the dataset by finding face images in many publicly available image datasets (excluding the FDDB dataset). In particular, there are images from ImageNet, AFLW, Pascal VOC, the VGG dataset, WIDER, and face scrub. All the annotations in the dataset were created by me using dlib's imglab tool.

## resnet34_1000_imagenet_classifier.dnn.bz2

This is trained on the venerable ImageNet dataset.

## shape_predictor_5_face_landmarks.dat.bz2

This is a 5 point landmarking model which identifies the corners of the eyes and bottom of the nose. It is trained on the [dlib 5-point face landmark dataset](http://dlib.net/files/data/dlib_faces_5points.tar), which consists of 7198 faces. I created this dataset by downloading images from the internet and annotating them with dlib's imglab tool.

The exact program that produced the model file can be found [here](https://github.com/davisking/dlib/blob/master/tools/archive/train_face_5point_model.cpp).

This model is designed to work well with dlib's HOG face detector and the CNN face detector (the one in mmod_human_face_detector.dat).

## shape_predictor_68_face_landmarks.dat.bz2

This is trained on the ibug 300-W dataset (https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/)

C. Sagonas, E. Antonakos, G. Tzimiropoulos, S. Zafeiriou, M. Pantic. 300 faces In-the-wild challenge: Database and results. Image and Vision Computing (IMAVIS), Special Issue on Facial Landmark Localisation "In-The-Wild". 2016.

The license for this dataset excludes commercial use and Stefanos Zafeiriou, one of the creators of the dataset, asked me to include a note here saying that the trained model therefore can't be used in a commercial product. So you should contact a lawyer or talk to Imperial College London to find out if it's OK for you to use this model in a commercial product.

Also note that this model file is designed for use with dlib's HOG face detector. That is, it expects the bounding boxes from the face detector to be aligned a certain way, the way dlib's HOG face detector does it. It won't work as well when used with a face detector that produces differently aligned boxes, such as the CNN based mmod_human_face_detector.dat face detector.

## shape_predictor_68_face_landmarks_GTX.dat.bz2

The GTX model is the result of applying a set of training strategies and implementation optimizations described in:

Alvarez Casado, C., Bordallo Lopez, M. Real-time face alignment: evaluation methods, training strategies and implementation optimization. Springer Journal of Real-time image processing, 2021

The resulting model is smaller, faster, smoother and more accurate. You can find all the details related to the training and testing in the following GitLab repository: https://gitlab.com/visualhealth/vhpapers/real-time-facealignment

This is trained on the ibug 300-W dataset (https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/)

C. Sagonas, E. Antonakos, G. Tzimiropoulos, S. Zafeiriou, M. Pantic. 300 faces In-the-wild challenge: Database and results. Image and Vision Computing (IMAVIS), Special Issue on Facial Landmark Localisation "In-The-Wild". 2016.

The license for this dataset excludes commercial use and Stefanos Zafeiriou, one of the creators of the dataset, asked me to include a note here saying that the trained model therefore can't be used in a commercial product. So you should contact a lawyer or talk to Imperial College London to find out if it's OK for you to use this model in a commercial product.

Also note that this model file has increased robustness to different face detectors. However, it works best when the bounding boxes are square, as is the case with both dlib's HOG face detector and the CNN based mmod_human_face_detector.dat face detector. It won't work as well when used with other face detectors that produce rectangular boxes.

## mmod_rear_end_vehicle_detector.dat.bz2

This model is trained on the [dlib rear end vehicles dataset](http://dlib.net/files/data/dlib_rear_end_vehicles_v1.tar). The dataset contains images from vehicle dashcams which I manually annotated using dlib's imglab tool.

## mmod_front_and_rear_end_vehicle_detector.dat.bz2

This model is trained on the [dlib front and rear end vehicles dataset](http://dlib.net/files/data/dlib_front_and_rear_vehicles_v1.tar). The dataset contains images from vehicle dashcams which I manually annotated using dlib's imglab tool.

## dnn_gender_classifier_v1.dat.bz2

This model is a gender classifier trained using a private dataset of about 200k different face images and was generated according to the network definition and settings given in [Minimalistic CNN-based ensemble model for gender prediction from face images](http://www.eurecom.fr/fr/publication/4768/download/mm-publi-4768.pdf). Even though the dataset used for training is different from the one used by G. Antipov et al, the classification results on the LFW evaluation are similar overall (about 97.3%). To take up the authors' proposal to join the results of three networks, a simplification was made by finally presenting RGB images, thus simulating three "grayscale" networks via the three image planes. Better results could probably be obtained with a more complex and deeper network, bu